Docker Registry for IPFS

I've implemented the IPFS Storage driver for the Docker Registry.

This allows it to act as a gateway to push and pull images through IPFS, and to benefit from the way IPFS deduplicates and shares identical pieces of data.

IPFS shares data between machines via CIDs (Content Identifiers). CIDs are hashes of the data, with no relation to a filename. This means that when we push an image into IPFS, the CIDs produced for the blob files of each layer of the Docker image will be identical on my machine and on anyone else's.
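To make that concrete, here is a minimal Go sketch of content addressing. It uses a plain SHA-256 hex digest as a stand-in for a CID (real CIDs wrap the hash in extra encoding, and IPFS chunks large files first), and the contentID name is my own invention for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentID returns the hex SHA-256 digest of data. It is a simplified
// stand-in for a CID: the identifier depends only on the bytes,
// never on a filename or on which machine computed it.
func contentID(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Two "files" with different names but identical contents.
	files := map[string][]byte{
		"layer-on-my-machine.tar":   []byte("identical layer bytes"),
		"layer-on-your-machine.tar": []byte("identical layer bytes"),
	}
	for name, data := range files {
		fmt.Printf("%s -> %s\n", name, contentID(data))
	}
}
```

Both lines print the same identifier, which is the property that lets two strangers end up seeding the same layer.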

The sequence of events for pulling an image is currently as follows:

Pulling an Image

1. Start the Gateway Images

Create a network to let the IPFS and the Registry containers communicate:

docker network create --attachable ipfs-network

Start the IPFS image:

docker run -d --rm --name "ipfs" --network "ipfs-network" -p 4001:4001 -p 4004:4004 -p 8080:8080 ipfs/go-ipfs:latest

Start the moritonal/distribution-ipfs-client image, which is a fork of the Docker Registry compiled with the IPFS gateway:

docker run -d --rm --name "registry" --network "ipfs-network" -p 127.0.0.1:5000:5000 -e REGISTRY_STORAGE_IPFS_ADDRESS=ipfs:5001 moritonal/distribution-ipfs-client

2. Pull an image from the registry

docker login --username test --password test123 127.0.0.1:5000
docker pull 127.0.0.1:5000/bonner.is/hello-world-over-ipfs
I think there is a bug in IPFS that sometimes produces the following error: ipfs: lookup bonner.is on 127.0.0.11:53: read udp 127.0.0.1:48486->127.0.0.11:53: i/o timeout. The answer, genuinely, is to run the pull a few times and see if it resolves.
I also think there is a bug where my not-great Golang file logic means that the Registry caches corrupted data. I only learnt Golang for this project, so cut me some slack.

In the background the Registry is going to be doing the following:

  1. DNS lookup against bonner.is and discover two TXT records of interest.
  2. Use the TXT record ipfsnode= to tell IPFS to connect to the IPFS node on my server (this is bad practice, but also the only way of even getting close to good discovery performance here).
  3. Use the TXT record dnslink= to find the CID for the registry data.
  4. Use IPFS to read the registry data the same way it would a local filesystem.
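The first three steps above can be sketched roughly like this in Go. This is not the driver's actual code: the gatewayRecords type and parseTXT function are hypothetical names, and the example record values (a multiaddr for ipfsnode= and a /ipfs/ path for dnslink=) are my assumption about the format. In a real lookup the slice would come from net.LookupTXT; here it is hard-coded so the sketch runs without a network.

```go
package main

import (
	"fmt"
	"strings"
)

// gatewayRecords holds the two TXT values the lookup steps rely on.
type gatewayRecords struct {
	IPFSNode string // value of the ipfsnode= record (peer to connect to)
	DNSLink  string // value of the dnslink= record (points at the registry data)
}

// parseTXT picks the ipfsnode= and dnslink= entries out of a domain's
// TXT records and ignores everything else (SPF, verification tokens, ...).
func parseTXT(txts []string) gatewayRecords {
	var rec gatewayRecords
	for _, txt := range txts {
		key, value, ok := strings.Cut(strings.TrimSpace(txt), "=")
		if !ok {
			continue
		}
		switch key {
		case "ipfsnode":
			rec.IPFSNode = value
		case "dnslink":
			rec.DNSLink = value
		}
	}
	return rec
}

func main() {
	// Assumed example values; a real call would be net.LookupTXT("bonner.is").
	txts := []string{
		"v=spf1 -all",
		"ipfsnode=/ip4/203.0.113.10/tcp/4001",
		"dnslink=/ipfs/QmPSQZBavMbguNN6nBsw3MeyUB2avNSbcRvYSWwQ1Drd5h",
	}
	rec := parseTXT(txts)
	fmt.Println("connect to:", rec.IPFSNode)
	fmt.Println("registry root:", rec.DNSLink)
}
```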

3. Run the image

docker run --rm 127.0.0.1:5000/bonner.is/hello-world-over-ipfs

And you should see the output, Hello, you downloaded this image over IPFS!

Pushing an Image

This assumes you've followed the previous steps.

1. Start the client in "upload" mode

docker rm -f registry
docker run -d --rm --name "registry" --network "ipfs-network" -p 127.0.0.1:5000:5000 -e REGISTRY_STORAGE_IPFS_ADDRESS=ipfs:5001 -e REGISTRY_STORAGE_IPFS_DIRECTION=upload moritonal/distribution-ipfs-client

Unfortunately, because of the low-level nature of the Storage driver in the Registry, we have to explicitly tell our gateway whether we are downloading or uploading images. I can imagine there being a solution to this, likely involving a proxy.
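As a rough illustration of what that switch amounts to, here is a Go sketch of a driver picking its mode from its configuration. It assumes the Registry hands the driver a parameters map with a "direction" key (the usual result of a REGISTRY_STORAGE_IPFS_DIRECTION override), and that the default mode is called "download". Both the function name and the "download" value are my assumptions, not something lifted from the real code.

```go
package main

import "fmt"

// directionFromParameters reads the (assumed) "direction" key and
// validates it. A missing key falls back to "download", matching the
// pull setup earlier where the variable is simply not set.
func directionFromParameters(parameters map[string]interface{}) (string, error) {
	raw, ok := parameters["direction"]
	if !ok {
		return "download", nil
	}
	direction, ok := raw.(string)
	if !ok {
		return "", fmt.Errorf("direction must be a string, got %T", raw)
	}
	switch direction {
	case "download", "upload":
		return direction, nil
	default:
		return "", fmt.Errorf("unknown direction %q", direction)
	}
}

func main() {
	mode, err := directionFromParameters(map[string]interface{}{"direction": "upload"})
	fmt.Println(mode, err)
}
```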

2. Tag an image

docker pull hello-world
docker tag hello-world 127.0.0.1:5000/example.com/hello-world

We're just grabbing the latest hello-world image from Docker to test. We're using example.com as the domain here, which you would have to replace with a domain that you can add TXT records to. This is a key part of the decentralisation until we get true IPNS.

3. Push the Image

docker push 127.0.0.1:5000/example.com/hello-world

This will push the image into the IPFS container, still all on your local machine.

4. Get the CID

docker exec ipfs ipfs files stat /example.com

You should get roughly the following output:

QmPSQZBavMbguNN6nBsw3MeyUB2avNSbcRvYSWwQ1Drd5h
Size: 0
CumulativeSize: 6179
ChildBlocks: 1
Type: directory

An important thing to note is that this CID will represent all the images you've pushed locally under that domain, so when you put this CID online you'll be exposing them all.

The CID you get might be different because of how Docker treats image repositories as stateful (port numbers, tags, etc.), but a key point here is that if you run the following command:

docker exec ipfs ipfs files ls -l /example.com/docker/registry/v2/blobs/sha256/

You should see the following output which will prove we've pushed the same image:

1b/     QmXk41iU88ozP6H5gkHf9H3gtbHMHv24kKzkT6kmDbP6bd 0
92/     QmUcvBe7sJULseWA6ofRc82TbV4DY46gbbhSGAU2LWXGHm 0
fc/     QmQvJzzhvh2kyPh8VsNWpDwZC5mKNoK3wk6UpGChnnbm3g 0

Docker stores blobs as hashed content in its filesystem, which is what makes it such a perfect candidate for IPFS. The blobs above will build up as you load more images, and common images will naturally have larger seed groups.
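Here is a small Go sketch of where the Registry's filesystem layout puts a blob, to show why the 1b/, 92/ and fc/ directories look the way they do: each is the first two hex characters of a blob's SHA-256 digest. The blobPath helper is a hypothetical name of mine, and using the domain as the root is based on the ls command above rather than on the driver source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// blobPath returns the path under which the registry layout stores
// content: .../blobs/sha256/<first two hex chars>/<full digest>/data.
// The path is derived purely from the bytes, so identical layers
// always land at the identical path.
func blobPath(root string, content []byte) string {
	sum := sha256.Sum256(content)
	digest := hex.EncodeToString(sum[:])
	return path.Join(root, "docker/registry/v2/blobs/sha256", digest[:2], digest, "data")
}

func main() {
	fmt.Println(blobPath("/example.com", []byte("pretend these are layer bytes")))
}
```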

5. Add the CID to your DNS

We take the CID QmPSQZBavMbguNN6nBsw3MeyUB2avNSbcRvYSWwQ1Drd5h and add it to our DNS. I have the following CID on my Cloudflare account for bonner.is.

And that's it. When someone else pulls 127.0.0.1:5000/example.com/hello-world they should (if we somehow managed to get that TXT record onto that domain) pull down the hello-world image we pushed up to it.

The Good

It naturally shares layers between images and hosts. This means that (fingers crossed) if two images use the same base image, such as alpine or node, then the emergent behaviour of IPFS will cause those layers to be seeded between users.
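A toy Go sketch of that overlap: given the layer digests of two images, the ones they have in common only need to exist once in IPFS. The digests below are made-up placeholders, not real alpine or node layers, and sharedLayers is just a name for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// sharedLayers returns the layer digests that appear in both images,
// sorted and without duplicates.
func sharedLayers(a, b []string) []string {
	seen := make(map[string]bool, len(a))
	for _, digest := range a {
		seen[digest] = true
	}
	var shared []string
	for _, digest := range b {
		if seen[digest] {
			shared = append(shared, digest)
			seen[digest] = false // report each shared digest only once
		}
	}
	sort.Strings(shared)
	return shared
}

func main() {
	// Placeholder digests: both images sit on the same base layer.
	imageA := []string{"sha256:base-layer", "sha256:app-a"}
	imageB := []string{"sha256:base-layer", "sha256:app-b"}
	fmt.Println("layers both images can seed:", sharedLayers(imageA, imageB))
}
```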

The Bad

The Registry cannot stream-write images to IPFS. Look, judge me later but I only just learnt Golang and have no idea how to handle that darn mime/multipart package! Any help there would likely massively speed up push-times.

The Ugly

Bitswap discovery (finding peers with the CID you're looking for) can be quite slow. That's the reason I implemented the ipfsnode TXT record to speed up discovery.

What could it do in the future:

Here's where I need a hand. I need to implement a few concepts.

  1. Something better than using DNS to publish CIDs. There's gotta be a better way, but pub/sub and IPNS were both so fragile I stuck to DNS.
  2. Private repos. This is doable if you host a private swarm, but then you lose the benefits of shared layering (which is kind of the point). I'd prefer some alternative involving the encryption of data within the Registry.

Alternatives:

I always try to find alternatives when working on these kinds of projects. The closest I found was IPDR. The reason I didn't contribute to that project is that they solved the problem with a proxy, whereas I went for a Storage plugin. They're two different ways of solving the same problem, so I figured I'd continue.

Where can I find the code?

The code of the IPFS Storage Driver for the Docker Registry can be found here. This is automatically built and published via Docker Hub (yes, I get the irony).

This repo is the client for the Registry that just adds a bit of config. This is built whenever the base-image above changes.