If happiness is just a tear drop away, and insight is just a 404 away, then friends are just a blog post away! I wanted to quickly follow up on a post from a few days ago where I had trouble struggled like a baby penguin to find the configuration metadata from a manifest for the Google Deep Variant container on Google Container Registry.

A Solution to the Mystery

The answer came down to (as it always does) a detail I hadn’t found. Before I show this, let’s review a few possible (common) avenues to asking for manifests (the texty data structures that have information about layers and metadata). In all of these cases to account for different registry endpoints I am taking a brute force approach making three requests with the following Accept headers (note these are saying we will accept a particular mediaType) with the aim to maximize the metadata that I can get returned:

Accept Header (schemaVersion) For what?
application/vnd.docker.distribution.manifest.v1+json (1) config, layers (fsLayers)
application/vnd.docker.distribution.manifest.v2+json (2) layers (layers)
application/vnd.docker.container.image.v1+json (1) config

And not listed is the request for a manifest list, which you could do first to then ask for a particular architecture and operating system combination. From the above, we see that we have a few options! If I am able to get an old version 1 manifest, I get a two for one deal - the metadata that I need can be parsed from a history field, and layers in fsLayers. If I can get version 2, then I can find the layers, and there are nice sizes in there too. I don’t get the metadata. In the case of schemaVersion 2, I would need to rely on (still) having a version 1 manifest, or container image manifest the (third row).

So our issue was that we had a schemaVersion 2 manifest, but we didn’t have either of these fall backs. We just had layers without the context, which is like birthday cake without the candles, presents, atmosphere and friends. You can eat it, but you have lost the meaning of it all. The answer came from one of my favorite places, a Github discussion about this exact same thing! I’ll discuss the fix, and then share some additional resources and thoughts. We start as we did before putting together a request that is specific to a container namespace and tag:

import requests

# URL: This is the url for the manifest
url = 'https://gcr.io/v2/deepvariant-docker/deepvariant/manifests/0.5.0'

# The schema verison is driven by the accept header
headers = {'Accept': 'application/vnd.docker.distribution.manifest.v2+json'}

# Make the call, view the status code, and the manifest
response = requests.get(url, headers=headers)
response.status_code                             # This should be 200
manifest = response.json()

At this point we have our manifest, and we can notice this little entry under the layers:

   ...
   "mediaType":"application/vnd.docker.distribution.manifest.v2+json",
   "schemaVersion":2,
   "config":{  
      "mediaType":"application/vnd.docker.container.image.v1+json",
      "digest":"sha256:c0acf3d54dce5eabf6ae422593a3d266e1d5d61129d53f32ec943d133b395a6c",
      "size":7601
   }
}

I had seen this config section and tried using it in different ways, such as asking for the mediaType shown above with the digest again to the manifest endpoint and getting 404. I also tried setting the value as a Docker-Content-Digest thinking that might be intuitive. Anyway, we DO need that guy! But he’s not a manifest! He’s a blob!


Instead, we need to request it from a different endpoint, the blobs one of course. That is correct - we are going to get back a configuration manifest from the endpoint that traditionally returns an image layer “blob.” The switch is the specification of a particular mediaType and digest. Here is what that looks like:

digest = manifest['config']['digest']
mediaType = manifest['config']['mediaType']

headers = {'Accept': mediaType}
blob_url = "https://gcr.io/v2/deepvariant-docker/deepvariant/blobs/%s" %digest

config = requests.get(blob_url, headers=headers)

If you look above, there is only one change to this call, and that is changing the word manifests to blobs. And guess what we get? A beautiful, config with all the things! And when I plug this fix into the Singularity Global Client we get a fixed example, and avoid all that hacky nonsense we did before!

The Power of Community

One thing that I learned from this experience is perspective. I spent a lot of time thinking about the various user perspectives, and this has great value for both users and maintainers. But I didn’t think of how challenging it must be on the other end - responsible for the development and maintenance of a scaled resource that is widely used and rapidly changing. I’m thankful for this insight from several of the maintainers ( here is the thread). I finally learned about this mysterious Docker-Content-Digest, and I’ll pass forward the information. Specifically:

As far as Docker-Content-Digest is concerned, that is a response-only header that provides secondary verification of fetched resources. This is used in the case where a fetch is made by tag is just provides extra verification between the resource sent and the request headers, preventing backend tampering. It is not required to verify this. @stevvooe

There is great work to come from this team, and many good ideas that have promise to smooth out the space. We will get there! The discussion made me step back and realize the amount of thought and careful attention that has gone into the design of these standards, but also that we are all human. We make mistakes, learn from them, and strive to make things better. The post linked is also an awesome read - it makes me really happy when someone takes these things and packages them in an easily digestable (no pun intended) format. I am overall moved by the support and feedback offered, and now I carry a stronger toolbelt to tackle these issues in the future. I also would encourage others, whether you are academic, software developer, or general user, to reach out to people. Don’t be afraid to ask questions or participate in discussion. It’s fun, and it usually leads to good things. In the worst case scenario if something isn’t fun? You can step aside.

The Bias of Support

I also identified a subtle bias that I have, and perhaps it is a bias based on personal experience. In HPC we usually stand out relative to industry because of the support that we offer. HPC teams exist primarily to serve a small group and do whatever it takes to solve a problem. Most of my “customer support” experiences outside of that are frustrating, because if you make it through the 10 minutes of phone prompts and your call isn’t dropped, you are lucky to get the right person to talk to, let alone someone that can invest a large amount of resources to help you. This might just be a distinction between paid and free support, and I have tended to be on the free side with various webby companies.

But wait a minute - the experience that I just had goes strongly against that. It was just another developer to developer interaction with questions, support, and then a few thumbs up emoticons (which, by the way, seem trivial but are incredibly important for communicating emotion in these places). What you can’t see is that one of the maintainers reached out to me to offer more information and resources, and I was completely floored. Yes, it’s probably part of a job role, but it’s also a reflection of having good values. Regardless of where you work, or what you do, if you remember that we are all people, and put your focus on helping others, good comes from that.