
max depth exceeded when trying to load images that have many apt dependencies #36

Open
juanzolotoochin opened this issue Apr 17, 2024 · 4 comments
Labels: bug


juanzolotoochin commented Apr 17, 2024

To repro:

Expected output: tests passing

Actual output: a bunch of tests failing with Error: error setting env vars: Error creating container: no such image

From my conversations with @thesayyn, the underlying error is {"errorDetail":{"message":"max depth exceeded"},"error":"max depth exceeded"}

Googling for that error, it seems to be caused by too many Docker layers; Docker's layer store rejects images whose layer chain is deeper than its limit (125 in moby). One workaround is to combine those layers into a single tar:

pkg_tar(
    name = "linux_layers.tar",
    deps = select({
        "@platforms//cpu:arm64": [
            "%s/arm64" % package
            for package in PACKAGES
        ],
        "@platforms//cpu:x86_64": [
            "%s/amd64" % package
            for package in PACKAGES
        ],
    }),
)
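
(Here PACKAGES stands for a list of package target prefixes generated by the rules_distroless apt extension; the exact labels depend on how the lockfile repo is named. A hypothetical example:

PACKAGES = [
    "@bullseye//bash",
    "@bullseye//coreutils",
    "@bullseye//libc6",
]
)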

Then use that tar in the image instead of a separate tar per Debian package.
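
With rules_oci, that might look like the following sketch (the base image and target names here are placeholders):

oci_image(
    name = "image",
    base = "@distroless_base",
    tars = [":linux_layers.tar"],
)

With everything in one layer the image stays well under Docker's depth limit; after loading, the layer count can be checked with:

$ docker image inspect --format '{{len .RootFS.Layers}}' IMAGE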

Presumably, the same should be achievable using the rules_distroless flatten rule. However, this:

flatten(
    name = "linux_layers_flatten.tar",
    tars = select({
        "@platforms//cpu:arm64": [
            "%s/arm64" % package
            for package in PACKAGES
        ],
        "@platforms//cpu:x86_64": [
            "%s/amd64" % package
            for package in PACKAGES
        ],
    }),
)

does not resolve it. There seems to be an issue with duplicate file paths in flatten:

$ bazel build //examples/apt:tarball

$ docker load < bazel-bin/examples/apt/tarball/tarball.tar
duplicates of file paths not supported
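
As an aside, listing the flattened tar and filtering for repeated paths shows exactly which files collide (the path below is a placeholder for wherever the flatten output ends up):

$ tar -tf bazel-bin/path/to/linux_layers_flatten.tar | sort | uniq -d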
thesayyn (Collaborator) commented

Thank you for filing this!

lazcamus (Contributor) commented Aug 7, 2024

WRT flatten() and "duplicates of file paths not supported" from docker:

  • It appears to be a problem only with older versions of Docker: I've seen it with Docker server version 24.0.4 (GCP COS) but not with server version 26.1.4 (Docker Desktop on macOS).
  • I'm not very seasoned with Bazel rules, but I have a workaround rule that launders your tar file. Unfortunately, you have to explode the tar temporarily to dedupe it:
load("@aspect_bazel_lib//lib:tar.bzl", "mtree_spec", "tar", "tar_lib")

def _dedupe_tar_impl(ctx):
    """
    Remove duplicate entries in a tarball

    This is kinda a hack: extract the mtree file, explode the tarball (handled
    overwriting the files), then recreate the tarball from the exploded tree.

    This has the effect of removing duplicate entries from the tarball
    """
    bsdtar = ctx.toolchains[tar_lib.toolchain_type]
    bsdtar_bin = bsdtar.template_variables.variables["BSDTAR_BIN"]
    src = ctx.file.src

    output_tar = ctx.actions.declare_file("%s.tar.gz" % ctx.label.name)

    ctx.actions.run_shell(
        outputs = [output_tar],
        inputs = [src],
        command = """
            set -e
            export TMP=$(mktemp -d || mktemp -d -t bazel-tmp)
            trap "rm -rf $TMP" EXIT
            mkdir $TMP/extracted

            {bsdtar} --format=mtree -cf - @{src} | egrep -v '^/. ' | sed -e 's|^./||' > {mtree}
            {bsdtar} -xf {src} -C $TMP/extracted
            {bsdtar} -C $TMP/extracted -caf {output} @{mtree}
        """.format(
            bsdtar = bsdtar_bin,
            src = src.path,
            output = output_tar.path,
            mtree = "$TMP/mtree.txt",
        ),
        tools = [bsdtar.default.files],
    )

    return DefaultInfo(files = depset([output_tar]))

dedupe_tar = rule(
    implementation = _dedupe_tar_impl,
    attrs = {
        "src": attr.label(mandatory = True, allow_single_file = True),
    },
    # XXX: side effect: gzipping the output 
    outputs = {"output_tar": "%{name}.tar.gz"},
    toolchains = [tar_lib.toolchain_type],
)

From the above example, I think this would probably work:

flatten(
    name = "linux_layers_flatten.tar",
    tars = select({
        "@platforms//cpu:arm64": [
            "%s/arm64" % package
            for package in PACKAGES
        ],
        "@platforms//cpu:x86_64": [
            "%s/amd64" % package
            for package in PACKAGES
        ],
    }),
)

dedupe_tar(
    name = "linux_layers_deduped.tar",
    src = "linux_layers_flatten.tar",
)
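
For completeness, the BUILD file would also need to load both rules; assuming the rule above is saved as dedupe_tar.bzl in the same package:

load("@rules_distroless//distroless:defs.bzl", "flatten")
load(":dedupe_tar.bzl", "dedupe_tar")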

bd-jczarnowski commented Oct 11, 2024

@lazcamus Thank you for the workaround! I'm hitting exactly the same issue while trying to introduce rules_distroless + rules_oci into our builds.

pkg_tar does deduplicate, but it prints a warning for each occurrence (and there are a lot...).

Aside from not being able to load so many layers into Docker, is creating a separate layer for every deb by default even efficient? (re: the "default" behavior from the example)

thesayyn (Collaborator) commented

Okay, I have a PR that deduplicates without extracting files: #119
