Replace compressed tar with format for random access #936
Comments
I'm confused about what you're asking about. For the image layout this is already supported, although I don't know of any tool that will spit out a pre-compressed version of that tree. Registries generally don't accept images laid out in this format, though; they use the distribution API. I've implemented file access over tars so I can definitely commiserate about the shortcomings of the format.
Would eStargz make sense for your use case? https://github.com/containerd/stargz-snapshotter/blob/main/docs/estargz.md This is being proposed in #877.
Without diagrams the image layout spec is hard to grok. For example, here is a diagram: https://github.com/google/go-containerregistry/blob/main/pkg/v1/tarball/README.md#structure Without it I wouldn't even try to contribute any enhancement to

@sudo-bmitch seekable tar.gz would not help, because it places the TOC at the end of the stream, so when the image is piped to stdin, the whole contents need to be cached before processing can start. Not good at all for limited-memory devices like the RPi. I can guess that people choose

Ideal format - index with checksum and size upfront, so that various tools can build indexes and scan contents without downloading content blobs at all. Then the rest of the piped data should be laid out so that the most common use case (launching a container) can, again, occur without recursive writing. An image that follows this layout can be called "well formed", and linters could be provided to check and fix images before submitting them to repos.
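To make the "index with checksum and size upfront" idea concrete, here is a minimal sketch of such a stream. The framing (a 4-byte length prefix, a JSON index, then the blobs in index order) and the names `write_stream`, `read_index`, and `read_blobs` are assumptions made up for illustration; this is not part of the OCI spec or of any existing tool.

```python
import hashlib
import io
import json
import struct


def write_stream(blobs: dict) -> bytes:
    """Serialize as: 4-byte big-endian index length, JSON index, blob bytes in index order.

    Hypothetical framing, chosen only to illustrate "index first".
    """
    index = [
        {"name": name, "size": len(data), "sha256": hashlib.sha256(data).hexdigest()}
        for name, data in blobs.items()
    ]
    header = json.dumps(index).encode()
    return struct.pack(">I", len(header)) + header + b"".join(blobs.values())


def read_index(stream) -> list:
    """Read only the index; no blob bytes are consumed."""
    (length,) = struct.unpack(">I", stream.read(4))
    return json.loads(stream.read(length))


def read_blobs(stream, index):
    """Yield (name, data) in a single forward pass, verifying each checksum."""
    for entry in index:
        data = stream.read(entry["size"])
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            raise ValueError(f"checksum mismatch for {entry['name']}")
        yield entry["name"], data
```

A tool that only wants to list or index the contents can stop after `read_index`; a consumer that needs the data keeps reading forward and never has to seek or buffer the whole stream.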
If you're trying to do one pass on stdin, you either have all the files properly ordered or you need to extract/save to a temporary location. My own solution for this says that it supports stdin to input its own export, but any other input would need to be a seekable file so it can be rescanned multiple times. I've seen others that support stdin expand it to a temporary directory first. It's not just the top-level index.json: you need each manifest, which may point to other manifests, to know which blobs to import, and then you need to import the blobs before importing the manifest into a runtime or registry upstream.
I'm not sure how registries and repos would do the validation. They shouldn't ever see the tar.gz OCI Layout file, only a push of manifests and blobs (the layer tar.gz should be a different thing than what we're talking about). I think forcing the creators of a Layout
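The "save to a temporary location, then rescan" approach from this comment can be sketched as follows. This is a rough illustration using only Python's standard library, not the code of any tool mentioned in the thread; the helper names (`spool`, `required_blobs`, `list_required_blobs`) are hypothetical, and it assumes the tar holds an OCI image layout with `index.json` and `blobs/<alg>/<encoded>` entries.

```python
import json
import shutil
import tarfile
import tempfile

INDEX_TYPE = "application/vnd.oci.image.index.v1+json"
MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"


def spool(stream):
    """Copy a non-seekable stream to a temporary file so it can be rescanned."""
    tmp = tempfile.TemporaryFile()
    shutil.copyfileobj(stream, tmp)
    tmp.seek(0)
    return tmp


def blob_path(digest: str) -> str:
    """Map a digest like 'sha256:abc...' to its path inside the layout."""
    alg, encoded = digest.split(":", 1)
    return f"blobs/{alg}/{encoded}"


def required_blobs(tar: tarfile.TarFile) -> list:
    """Walk index.json -> manifests (possibly nested indexes) -> config and layers."""
    needed = []

    def load(path):
        # Each lookup may rescan the archive, which is why it must be seekable.
        return json.load(tar.extractfile(path))

    def walk(desc):
        needed.append(desc["digest"])
        media_type = desc.get("mediaType")
        if media_type == INDEX_TYPE:
            for child in load(blob_path(desc["digest"]))["manifests"]:
                walk(child)
        elif media_type == MANIFEST_TYPE:
            manifest = load(blob_path(desc["digest"]))
            needed.append(manifest["config"]["digest"])
            needed.extend(layer["digest"] for layer in manifest["layers"])

    for desc in load("index.json")["manifests"]:
        walk(desc)
    return needed


def list_required_blobs(stream) -> list:
    """Spool the stream to disk, then resolve every blob the layout references."""
    with spool(stream) as tmp, tarfile.open(fileobj=tmp, mode="r:*") as tar:
        return required_blobs(tar)
```

Calling `list_required_blobs(sys.stdin.buffer)` would accept a piped layout regardless of entry order, at the cost of one full copy to disk before any manifest can be resolved.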
That's exactly what I did in google/go-containerregistry#1274, and now I am trying an approach without temp files in google/go-containerregistry#1429. The problem is that
That's why
I am rewriting code for `go-containers` to allow streaming access to container images, and I'd like to know if it is not too late to add parsing of streamed container images to https://github.com/opencontainers/image-spec/blob/main/considerations.md ?

The particular problem for parsing an image stream is the `tar` format, which, if compressed, has to be fully processed to read the table of contents. Quoting https://en.wikipedia.org/wiki/Tar_(computing)#Random_access

Which means that if `manifest.json` is located at the end of the stream, the whole image needs to be downloaded (often into memory) and decompressed. Analyzing an image requires several passes to reconstruct the final usable state. It is impossible to just list files without downloading and decompressing whole layers, which leads to a waste of cloud resources.

Adding the requirement of being able to parse an image in one go, without necessarily downloading the whole contents, will help to define a "well-formed OCI image" format that could then be enforced by registries on push. I've seen that the specification mentions `zip` and probably allows for alternative compression methods, but I don't believe many would implement that if `tar` continues to be the de facto standard.

Because the spec reached version 1.0, I am interested to know if changes to the spec are still acceptable?
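The ordering problem described above can be reproduced with a small sketch using Python's standard `tarfile` module in stream mode (`r|gz`), which forbids seeking much like a pipe does. The archive contents, entry names, and helper names here are made up for illustration and are not taken from any real image.

```python
import io
import tarfile


def build_image_tar(manifest_last: bool) -> bytes:
    """Build a small gzip-compressed tar in memory (a stand-in for an image archive)."""
    layer = b"x" * 100_000  # hypothetical layer payload
    manifest = b'[{"Layers": ["layer.tar"]}]'
    entries = [("layer.tar", layer), ("manifest.json", manifest)]
    if not manifest_last:
        entries.reverse()
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in entries:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def members_read_before(archive: bytes, wanted: str) -> list:
    """Stream the archive without seeking; return the entries passed before `wanted`."""
    seen = []
    # Mode "r|gz" treats the input as a non-seekable compressed stream.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|gz") as tar:
        for member in tar:
            if member.name == wanted:
                return seen
            seen.append(member.name)
    raise KeyError(wanted)
```

With `manifest_last=True` the reader has to decompress and skip past `layer.tar` before it reaches `manifest.json`; with the manifest first, nothing else has to be read to find it.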