[FEATURE] Support conda-lock files #66

Open
itamarst opened this issue Sep 22, 2021 · 11 comments
@itamarst

  • What are you trying to do?

conda-lock (https://github.com/conda-incubator/conda-lock/) generates a fully pinned, transitive list of the packages to install in a Conda environment, to allow for reproducible builds. I am fairly certain it contains the same information as conda list.

It would be nice to be able to scan for vulnerabilities using this file, because then one wouldn't have to actually install the packages to check for vulnerabilities.

Unlike environment.yml, it should contain a complete list of packages that will be installed, so there's no worry about extra dependencies being installed and not scanned for vulnerable releases.

  • How could we solve this issue? (Not knowing is okay!)

Write a parser for conda-lock output files. Should be pretty simple:

# platform: linux-64
# env_hash: 27bd039b2991103d63cefc823705756d66514e1c6bf6f156bc6eb3bd87679676

@EXPLICIT

https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2021.5.30-ha878542_0.tar.bz2#6a777890e94194dc94a29a76d2a7e721
https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.35.1-hed1e6ac_0.tar.bz2#d0cf77c331382475133dc6c34e7461d7
https://conda.anaconda.org/conda-forge/linux-64/libgfortran5-9.3.0-he4bcb1c_17.tar.bz2#0c15349375fc3d0cb2114fcabe2f0aba
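A minimal sketch of what such a parser could look like, in Python. It assumes every package line has the shape shown above (URL ending in name-version-build plus an archive extension, followed by # and an MD5 hash), and that only the .tar.bz2 and .conda extensions occur. The names LockedPackage, parse_explicit_line and parse_explicit_lock are made up for illustration and are not part of jake or conda-lock.

```python
from typing import List, NamedTuple, Optional
from urllib.parse import urlparse


class LockedPackage(NamedTuple):
    channel: str
    subdir: str
    name: str
    version: str
    build: str
    md5: Optional[str]


# Assumption: package archives end in one of these two extensions.
ARCHIVE_EXTENSIONS = (".tar.bz2", ".conda")


def parse_explicit_line(line: str) -> Optional[LockedPackage]:
    """Parse one line; return None for blanks, comments and the @EXPLICIT marker."""
    line = line.strip()
    if not line or line.startswith(("#", "@")):
        return None
    # Everything after the first '#' is treated as the MD5 hash.
    url, _, md5 = line.partition("#")
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 3:
        raise ValueError(f"unexpected package URL: {url}")
    *channel_parts, subdir, filename = parts
    for ext in ARCHIVE_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"unknown archive extension: {filename}")
    # Package names may contain dashes, so split version and build off the right.
    fields = stem.rsplit("-", 2)
    if len(fields) != 3:
        raise ValueError(f"cannot split name/version/build: {filename}")
    name, version, build = fields
    return LockedPackage("/".join(channel_parts), subdir, name, version, build, md5 or None)


def parse_explicit_lock(text: str) -> List[LockedPackage]:
    """Return all packages found in an explicit lock file, in file order."""
    parsed = (parse_explicit_line(line) for line in text.splitlines())
    return [pkg for pkg in parsed if pkg is not None]
```

Splitting from the right matters for names such as ld_impl_linux-64, where the package name itself contains a dash.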
  • Anything else?

Thank you for writing this tool! I'm writing a blog post about it right now.

cc @bhamail / @DarthHater

@itamarst added the "enhancement" (New feature or request) label on Sep 22, 2021
@bollwyvl

@itamarst This would be great!

I am fairly certain it contains the same information as conda list.

Yes, the following are mostly the same:

  • conda-lock [--file ...] (where file can be a couple of formats)
  • conda list --explicit in an existing environment

A notable departure: conda-lock adds a few more comment lines which capture the relevant platform. It can also make use of mamba, which can be quite a bit snappier, especially for complex Windows solves.

extra dependencies being installed and not scanned for vulnerable releases.

Right: the format of the list of packages from either tool is interesting, as it's not only a set of packages, but also a topological sorting of their install order, which can be exploited for caching schemes, resolving duplicate paths, etc.

As you're calling out conda-forge... it doesn't look like jake can even handle those packages yet... this issue might be mis-filed, but has some of the thoughts we came up with, as well as this discussion issue.

And, of note, one can install jake<1 from conda-forge... we're working on jake==1 but will have some back-filling to do for various new upstreams.

@DarthHater
Member

This is a cool idea, all for it. Stoked y'all dig the tool! @allenhsieh and I wrote this a few years ago because we really like jake the snake (just kidding, or maybe?!)

madpah added a commit that referenced this issue Oct 21, 2021
fix: character encoding issues on Windows #67

Signed-off-by: Paul Horton <phorton@sonatype.com>
@madpah
Collaborator

madpah commented Oct 28, 2021

@itamarst - FYI we've added Conda support in jake when generating an SBOM:

conda list --explicit | jake sbom -t CONDA

Next, we're looking into supporting Conda and other input formats when running the ddt and iq subcommands (which currently just read what's installed in your current Python environment).

FYI: @DarthHater , @bollwyvl

@bollwyvl

Nice: I'm making some progress towards getting the conda-forge package up and running. Of note, during a self-test, I found some more exotic package names aren't very well supported:

https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2

See conda-forge/jake-feedstock#3 (comment)

@madpah
Collaborator

madpah commented Jan 13, 2022

Thanks @bollwyvl - will take a look at that package... - can you share a complete output that includes the above package from either conda list --explicit and/or conda list --json?

Thanks

@bollwyvl

Sure, here are a bunch of widely-used lockfiles that get deployed thousands of times a day:

https://github.com/jupyterhub/repo2docker/blob/main/repo2docker/buildpacks/conda/environment.py-3.9.lock

@madpah
Collaborator

madpah commented Jan 17, 2022

@bollwyvl - I've done a bit more digging on this, and specifically the example you've provided above.

FYI - the parsing of Conda lock files is actually handled by a parent library to jake - cyclonedx-python.

That project includes a parser for Conda lock files and already has a passing unit test specifically for the example you gave above: https://github.com/CycloneDX/cyclonedx-python/blob/master/tests/test_utils_conda.py#L112

Am I missing something, or have you perhaps provided the incorrect example (before I go down a rabbit hole!)?

Thanks

@bollwyvl

Yeah, as a downstream packager of these packages, I'm only just keeping up with the recent spate of package renamings and versions, and haven't evaluated whether lockfiles work in a while. Once these land, I'll have a better idea:

I've added the test case I tried in October to the latter, so we'll probably know more later this week.

@bollwyvl

Well, we've shipped jake 1.4.0 on conda-forge, and it looks like it can successfully generate an SBOM for its own environment... sorta.

Of note, there are a great number of packages in conda(-forge) that aren't Python-related, so blanket assuming that everything lives in the PyPI namespace is probably inaccurate, a la pkg:pypi/-libgcc-mutex@0.1 (or more humorously, pkg:pypi/python@3.10.2), but that's probably more akin to the even further-upstream problem.
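For comparison, a rough sketch of what a Conda-typed package URL could look like instead. This assumes the conda type from the package-url (purl) specification, with build, channel and subdir as optional qualifiers as I understand that spec; conda_purl is a made-up helper for illustration, not something jake or cyclonedx-python provides.

```python
from typing import Optional
from urllib.parse import quote


def conda_purl(
    name: str,
    version: str,
    build: Optional[str] = None,
    channel: Optional[str] = None,
    subdir: Optional[str] = None,
) -> str:
    """Build a pkg:conda/... string (hypothetical helper, not a jake API)."""
    # Assumption: these qualifier keys follow the purl spec's conda type.
    qualifiers = {"build": build, "channel": channel, "subdir": subdir}
    # Emit qualifiers in sorted key order and skip any that were not given.
    query = "&".join(
        f"{key}={quote(value, safe='')}"
        for key, value in sorted(qualifiers.items())
        if value
    )
    purl = f"pkg:conda/{quote(name, safe='')}@{quote(version, safe='')}"
    return f"{purl}?{query}" if query else purl


print(conda_purl("_libgcc_mutex", "0.1", "conda_forge", "conda-forge", "linux-64"))
# pkg:conda/_libgcc_mutex@0.1?build=conda_forge&channel=conda-forge&subdir=linux-64
```

Unlike the pypi form, this keeps the leading underscore in the name and makes no claim that the package exists on PyPI.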

Meanwhile, when a package does correspond to one on PyPI but has a different name, there is a semi-authoritative mapping. I don't see any good examples here, but it's common for things where the PyPI name is a pun for the underlying C library, e.g. msgpack -> msgpack-python.
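A tiny sketch of how such a mapping might be consulted. The table here is a hypothetical stand-in with a single entry taken from the example above (assuming msgpack is the PyPI name and msgpack-python the conda name); a real tool would load the semi-authoritative mapping instead. Returning None rather than guessing avoids the blanket "it's on PyPI" assumption.

```python
from typing import Mapping, Optional

# Hypothetical stand-in for the real mapping: conda package name -> PyPI name.
CONDA_TO_PYPI = {
    "msgpack-python": "msgpack",
}


def pypi_name_for(
    conda_name: str, mapping: Mapping[str, str] = CONDA_TO_PYPI
) -> Optional[str]:
    """Return the PyPI name for a conda package, or None if no mapping is known."""
    return mapping.get(conda_name)


for conda_name in ("msgpack-python", "_libgcc_mutex"):
    pypi_name = pypi_name_for(conda_name)
    if pypi_name is None:
        print(f"{conda_name}: no known PyPI counterpart")
    else:
        print(f"{conda_name}: pkg:pypi/{pypi_name}")
```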

@madpah
Collaborator

madpah commented Jan 19, 2022

Thanks @bollwyvl - as ever, super insightful info and feedback. I'll ponder the two key points and see if there are any options we can employ to help.

@madpah
Collaborator

madpah commented Jan 19, 2022

On the point about generating an SBOM for its own environment, can you share a little more, or can we consider #66 closed?
