Replace compressed `tar` with format for random access #936

abitrolly · 2022-08-15T17:57:56Z

I am rewriting code for go-containers to allow streaming access to container images, and I'd like to know whether it is too late to add parsing of streamed container images to https://github.com/opencontainers/image-spec/blob/main/considerations.md.

The particular problem with parsing an image stream is the tar format, which, if compressed, has to be fully processed to read the table of contents. Quoting https://en.wikipedia.org/wiki/Tar_(computing)#Random_access:

> The tar format was designed without a centralized index or table of content for files and their properties for streaming to tape backup devices. The archive must be read sequentially to list or extract files. For large tar archives, this causes a performance penalty, making tar archives unsuitable for situations that often require random access to individual files.

This means that if manifest.json is located at the end of the stream, the whole image needs to be downloaded (often into memory) and decompressed. Analyzing an image requires several passes to reconstruct the final usable state. It is impossible to just list files without downloading and decompressing whole layers, which wastes cloud resources.
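To make the cost concrete, here is a minimal Go sketch using only the standard library. It builds two tiny in-memory `tar.gz` archives (the entry names and contents are placeholders, not a real image) and counts how many tar headers must be read before `manifest.json` turns up. The helper names `findEntry` and `buildArchive` are made up for this illustration.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// findEntry scans a gzip-compressed tar stream from the start until it finds
// name. It returns the entry's contents and the number of headers it read.
func findEntry(r io.Reader, name string) ([]byte, int, error) {
	zr, err := gzip.NewReader(r)
	if err != nil {
		return nil, 0, err
	}
	defer zr.Close()
	tr := tar.NewReader(zr)
	scanned := 0
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return nil, scanned, fmt.Errorf("%s not found", name)
		}
		if err != nil {
			return nil, scanned, err
		}
		scanned++
		if hdr.Name == name {
			data, err := io.ReadAll(tr)
			return data, scanned, err
		}
		// Otherwise fall through: the next call to Next skips this entry's
		// body, but the gzip stream still has to be decompressed past it.
	}
}

// buildArchive writes the named entries, in the given order, to a tar.gz.
func buildArchive(names []string, contents map[string]string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	tw := tar.NewWriter(zw)
	for _, n := range names {
		body := []byte(contents[n])
		hdr := &tar.Header{Name: n, Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(body))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	contents := map[string]string{"layer1.tar": "aaaa", "layer2.tar": "bbbb", "manifest.json": "[]"}
	for _, order := range [][]string{
		{"layer1.tar", "layer2.tar", "manifest.json"}, // manifest last
		{"manifest.json", "layer1.tar", "layer2.tar"}, // manifest first
	} {
		archive, err := buildArchive(order, contents)
		if err != nil {
			panic(err)
		}
		_, n, err := findEntry(archive, "manifest.json")
		if err != nil {
			panic(err)
		}
		fmt.Printf("first entry %q: read %d header(s)\n", order[0], n)
	}
}
```

With the manifest last, every preceding entry is decompressed and skipped before the manifest is reached; with the manifest first, the scan stops after one header.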

Adding the requirement of being able to parse an image in one go, without necessarily downloading the whole contents, will help to define a "well-formed OCI image" format that could then be enforced by registries on push. I've seen that the specification mentions zip and probably allows for alternative compression methods, but I don't believe many would implement that if tar continues to be the de facto standard.

Because the spec has reached version 1.0, I am interested to know whether changes to the spec are still acceptable.


hdonnay · 2022-08-15T19:46:41Z

I'm confused about what you're asking about.

For the image layout this is already supported, although I don't know of any tool that will spit out a pre-compressed version of that tree. Registries generally don't accept images laid out in this format, though; they use the distribution API.

I've implemented file access over tars so I can definitely commiserate about the shortcomings of the format.

sudo-bmitch · 2022-08-15T19:48:42Z

Would eStargz make sense for your use case? https://github.com/containerd/stargz-snapshotter/blob/main/docs/estargz.md

This is being proposed in #877 .

abitrolly · 2022-08-18T20:03:50Z

Without diagrams the image layout spec is hard to grok. For example, here is a diagram: https://github.com/google/go-containerregistry/blob/main/pkg/v1/tarball/README.md#structure

Without it I wouldn't even try to contribute any enhancement to crane image handling.

@sudo-bmitch seekable tar.gz would not help, because it places the TOC at the end of the stream, so when the image is piped to stdin, the whole contents need to be cached before processing can start. Not good at all for limited-memory devices like the RPi.

I can guess that people choose tar and tar.gz because they are easy to create, but the format also carries some mental burden of the past, like char: character device, block: block device. Companies have invented a dozen thin protocols like protobuf for efficiently passing messages, but for some reason do not invest enough in optimizing fat protocols, such as images.

Ideal format: an index with checksums and sizes upfront, so that various tools can build indexes and scan contents without downloading content blobs at all. The rest of the piped data should then be laid out so that the most common use case (launching a container) can, again, occur without recursive writing. An image that follows this layout can be called "well formed", and linters could be provided to check and fix images before submitting them to repos.

sudo-bmitch · 2022-08-18T20:41:12Z

> @sudo-bmitch seekable tar.gz would not help, because it places the TOC at the end of the stream, so when the image is piped to stdin, the whole contents need to be cached before processing can start. Not good at all for limited-memory devices like the RPi.

If you're trying to do one pass on stdin, you either have all the files properly ordered or you need to extract/save to a temporary location. My own solution for this says that it supports stdin to input its own export, but any other input would need to be a seekable file so it can be rescanned multiple times. I've seen others that support stdin expand it to a temporary directory first. It's not just the top-level index.json: you need each manifest, which may point to other manifests, to know which blobs to import, and then you need to import the blobs before importing the manifest into a runtime or registry upstream.

> Ideal format: an index with checksums and sizes upfront, so that various tools can build indexes and scan contents without downloading content blobs at all. The rest of the piped data should then be laid out so that the most common use case (launching a container) can, again, occur without recursive writing. An image that follows this layout can be called "well formed", and linters could be provided to check and fix images before submitting them to repos.

I'm not sure how registries and repos would do the validation. They shouldn't ever see the tar.gz OCI Layout file, only a push of manifests and blobs (the layer tar.gz should be a different thing than what we're talking about).

I think forcing the creators of a Layout tar.gz to inject all the files in the correct order is a losing fight. There's just too much out there, including the tar command itself, that won't do this today. And most tooling will opt to accept that input for compatibility. The best you'll get is offering an enhanced experience when input follows a specific pattern (like how estargz enables lazy loading of the layer contents for faster startups, and with a regular tar.gz it falls back to the slow start).

abitrolly · 2022-08-19T04:16:58Z

> If you're trying to do one pass on stdin, you either have all the files properly ordered or you need to extract/save to a temporary location.

That's exactly what I did in google/go-containerregistry#1274, and now I am trying an approach without temp files in google/go-containerregistry#1429. The problem is that /tmp is often a RAM-backed tmpfs on Linux, so it is no different than just caching to memory.

> I think forcing the creators of a Layout tar.gz to inject all the files in the correct order is a losing fight. There's just too much out there, including the tar command itself, that won't do this today.

That's why tar.gz is not the best format for the spec. Too flexible and too inflexible at the same time.

abitrolly mentioned this issue Aug 15, 2022

tarball: Streaming image parser PoC google/go-containerregistry#1429

Closed
