Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for version 1.9.2 #33

Merged
merged 86 commits into from
Nov 20, 2024
Merged

Support for version 1.9.2 #33

merged 86 commits into from
Nov 20, 2024

Conversation

ypriverol
Copy link
Member

@ypriverol ypriverol commented Nov 18, 2024

Support for diann version 1.9.2 bigbio/quantms#441

Summary by CodeRabbit

  • New Features

    • Updated the module version to 0.0.13, enhancing compatibility with newer dependencies and features.
    • Added support for DIA-NN version "1.9.2" in data processing scripts, improving flexibility for users.
  • Bug Fixes

    • Enhanced the mzTab conversion process to dynamically handle version-specific logic, ensuring accurate output.
  • Documentation

    • Updated project metadata, including adding a license, keywords, and author information for better project visibility.
  • Chores

    • Removed outdated dependencies and streamlined the installation process using Poetry for better package management.
    • Improved workflow configurations for dependency installation and package building.
    • Updated Conda build configurations to support Python version 3.11.

Copy link

coderabbitai bot commented Nov 18, 2024

Walkthrough

The pull request introduces updates across multiple files in the quantmsutils package, primarily focusing on version management and dependency adjustments. The module version has been incremented from "0.0.12" to "0.0.13". Several methods in the diann2mztab.py script have been enhanced to support a new DIA-NN version, and various dependencies have been updated or removed in configuration files, including environment.yml, requirements.txt, and recipe/meta.yaml. Additionally, the project metadata has been restructured in pyproject.toml, and configuration files setup.py and setup.cfg have been removed.

Changes

File Change Summary
quantmsutils/init.py Updated __version__ from "0.0.12" to "0.0.13".
quantmsutils/diann/diann2mztab.py Updated validate_diann_version to support "1.9.2". Modified convert_to_mztab and mztab_mtd to include diann_version.
recipe/meta.yaml Updated version to "0.0.13". Modified installation script flags and Python version requirements. Removed several dependencies and updated psm-utils to "0.8.3".
environment.yml Added defaults channel. Updated psm-utils from "0.8.2" to "0.8.3". Removed multiple dependencies.
pyproject.toml Changed version to "0.0.13". Expanded authors, added readme, license, keywords, and classifiers. Updated dependencies and build system.
setup.cfg Removed metadata configuration.
setup.py Removed setup configuration file.
requirements.txt Removed several dependencies. Updated psm-utils from "0.8.2" to "0.8.3".
.github/workflows/python-app.yml Updated installation commands to use Poetry instead of setup.py.
.github/workflows/conda-build.yml Updated shell environment, actions versions, and Python version. Renamed several steps for clarity.
recipe/conda_build_config.yaml Added support for Python version 3.11.

Possibly related PRs

🐰 In a world of code so bright,
Versions change, oh what a sight!
From "twelve" to "thirteen" we leap,
Dependencies trimmed, no need to weep.
With Poetry now, our paths align,
Hopping forward, all will be fine! 🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@ypriverol
Copy link
Member Author

@jpfeuffer @daichengxin do you know why it takes to much time to build the conda package? Can you help here?

@jpfeuffer
Copy link
Contributor

Probably solving and downloading the test environment. Not much you can do.
Reduce dependencies, stricter pinnings, ...
How long does it take locally?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
pyproject.toml (2)

50-56: Fix typo in GitHub URL key

There's a typo in the GitHub URL key: GiHub should be GitHub.

-GiHub = "https://github.com/bigbio/quantms-utils"
+GitHub = "https://github.com/bigbio/quantms-utils"

58-63: Consider standard line length

The line length of 99 characters is unusual. Common standards are:

  • 88 characters (black's default)
  • 120 characters (common in larger projects)

Consider using one of these standard values unless there's a specific reason for 99.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2ee72a0 and fb8b3f6.

📒 Files selected for processing (5)
  • environment.yml (0 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (1 hunks)
  • setup.cfg (0 hunks)
  • setup.py (0 hunks)
💤 Files with no reviewable changes (3)
  • environment.yml
  • setup.cfg
  • setup.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • recipe/meta.yaml
🔇 Additional comments (3)
pyproject.toml (3)

4-9: LGTM! Well-structured metadata section.

The metadata section is properly configured with license file reference and well-formatted author information.


25-26: LGTM! Appropriate version configuration.

The dynamic version handling and Python version constraints are well-defined.


30-47: Review dependency version constraints

There are several potential issues with the current dependency specifications:

  1. ms2pip==4.0.0.dev8: Using a development version in production could be risky
  2. scipy==1.13.1: Strict version pinning might cause dependency resolution conflicts
  3. protobuf>=3.9.2,< 4: Inconsistent spacing in version constraint

Consider:

  • Using stable versions instead of development versions
  • Using more flexible version constraints where possible
  • Maintaining consistent format in version specifications
#!/bin/bash
# Check for potential dependency conflicts
echo "Checking scipy compatibility with other packages..."
rg "scipy" -A 5 -B 5  # Look for scipy usage and version requirements

echo "Checking ms2pip stability..."
rg "ms2pip" -A 5 -B 5  # Look for ms2pip usage and potential issues

pyproject.toml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fb8b3f6 and 3e10278.

📒 Files selected for processing (2)
  • pyproject.toml (1 hunks)
  • requirements.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • requirements.txt
🔇 Additional comments (3)
pyproject.toml (3)

4-8: LGTM: Metadata updates follow best practices

The addition of explicit readme and license file references, along with properly formatted author information, improves package metadata clarity.


17-24: Reconsider Production/Stable status classification

The "Development Status :: 5 - Production/Stable" classifier might be premature given:

  • Usage of development dependencies (ms2pip=4.0.0.dev8)
  • Build time issues mentioned in PR comments

47-60: LGTM: Tool configurations are well-structured

The addition of project URLs and consistent formatting tool configurations improves project maintainability.

pyproject.toml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
pyproject.toml (1)

7-8: Consider using institutional email addresses

For better credibility and long-term maintainability, consider using institutional email addresses (e.g., from research organizations) instead of personal email addresses.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3e10278 and 551c864.

📒 Files selected for processing (2)
  • .github/workflows/python-app.yml (1 hunks)
  • pyproject.toml (1 hunks)
🔇 Additional comments (2)
pyproject.toml (2)

25-25: LGTM: Version increment is consistent

The version update to "0.0.13" aligns with the changes across other files.


60-60: Verify impact of increased line length

The line length has been increased from 88 to 99 characters. This change could lead to formatting inconsistencies with existing code.

✅ Verification successful

Line length increase to 99 characters is justified

The increased line length from 88 to 99 characters is reasonable based on the codebase analysis:

  • Most lines exceeding 88 characters are docstrings, help text, log messages, and SQL queries
  • No complex logic or deeply nested code exceeds the previous limit
  • The longer lines improve readability of documentation and messages without compromising code maintainability
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any files that might be affected by the line length change
rg -l '.{89,98}' --type py

Length of output: 331


Script:

#!/bin/bash
# Let's check the actual lines that exceed 88 characters to understand their content
for file in quantmsutils/diann/dianncfg.py quantmsutils/mzml/mzml_statistics.py quantmsutils/sdrf/check_samplesheet.py quantmsutils/diann/diann2mztab.py tests/test_commands.py quantmsutils/features/sage_feature.py quantmsutils/features/snr.py quantmsutils/rescoring/ms2rescore.py quantmsutils/psm/psm_conversion.py; do
    echo "=== $file ==="
    rg -n '.{89,98}' "$file"
done

Length of output: 14126

pyproject.toml Outdated Show resolved Hide resolved
.github/workflows/python-app.yml Show resolved Hide resolved
@jpfeuffer
Copy link
Contributor

jpfeuffer commented Nov 19, 2024

What was the idea to have Conda builds outside of conda-forge or bioconda? Without their pinnings on compilers and numpy version etc. things can get very complicated. (Although, since the package itself is python only and you are pulling dependencies from conda-forge and bioconda the dependencies should be kind of compatible)

@jpfeuffer
Copy link
Contributor

Also, You can try use Grayskull to auto generate the recipe from the pypi package by the way. No need to create recipes etc.
Then you do not need to sync dependencies

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 551c864 and 6f9b887.

📒 Files selected for processing (1)
  • pyproject.toml (1 hunks)
🔇 Additional comments (1)
pyproject.toml (1)

53-58: LGTM: Well-configured formatting tools

The isort and black configurations are well-aligned, with isort using black's profile and both targeting Python 3.8. The non-standard line length of 99 appears to be an intentional choice.

pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6f9b887 and 9f14336.

📒 Files selected for processing (2)
  • .github/workflows/python-app.yml (1 hunks)
  • pyproject.toml (1 hunks)
🔇 Additional comments (5)
pyproject.toml (4)

4-11: LGTM: Project metadata is well-structured

The metadata fields are properly formatted and include all essential information.


19-26: Reconsider Production/Stable status classification

The "Development Status :: 5 - Production/Stable" classifier might be premature given:

  1. Usage of development dependencies (ms2pip=4.0.0.dev8)
  2. Limited documentation and test coverage (as noted in previous reviews)

Consider downgrading to "4 - Beta" until these concerns are addressed.


33-48: Address dependency version management concerns

Several dependency issues need attention:

  1. Unpinned dependencies (*) may contribute to reported conda build time issues
  2. Using development version of ms2pip (4.0.0.dev8) in a production package
  3. numpy should be pinned for scipy 1.13.1 compatibility

50-64: LGTM: Tool configurations and URLs are well-structured

The URLs section improves project discoverability, and the tool configurations are appropriate.

.github/workflows/python-app.yml (1)

31-33: Previous Poetry configuration suggestions are still applicable

The current Poetry setup could benefit from the previously suggested improvements regarding version pinning, caching, and error handling.

Let's verify if the Poetry configuration matches best practices:

#!/bin/bash
# Description: Check Poetry configuration in the workflow and project files

# Check if Poetry version is pinned in the workflow
echo "Checking Poetry version pinning..."
rg -l "pip install poetry==" .github/workflows/

# Check if Poetry cache is configured
echo "Checking Poetry cache configuration..."
rg -l "actions/cache.*pypoetry" .github/workflows/

# Check for error handling in Poetry commands
echo "Checking error handling..."
rg -l "poetry (build|install).*\|\| exit 1" .github/workflows/

pyproject.toml Show resolved Hide resolved
.github/workflows/python-app.yml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
pyproject.toml (1)

63-64: Consider using a standard line length

The current line length of 99 is non-standard. Common values are:

  • 88 (black's default)
  • 120 (widely used in Python projects)

Consider aligning with one of these standards for better tooling compatibility.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 9f14336 and a463d36.

📒 Files selected for processing (2)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • recipe/meta.yaml
🔇 Additional comments (1)
pyproject.toml (1)

4-26: LGTM: Project metadata is well-structured

The metadata section is well-organized with appropriate keywords and classifiers for a bioinformatics tool. The version bump aligns with changes across other files.

pyproject.toml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
.github/workflows/conda-build.yml (1)

30-33: Consider migrating to conda-forge/bioconda

As discussed in the PR comments, building outside conda-forge/bioconda without their pinnings could lead to compatibility issues. Consider:

  1. Moving the package to conda-forge/bioconda
  2. Using Grayskull to auto-generate the recipe

For the current build process, add logging and error handling:

-        conda build purge-all
-        conda config --set solver libmamba
-        conda config --set channel_priority strict
-        conda build recipe --suppress-variables --override-channels --channel conda-forge --channel bioconda --no-anaconda-upload
+        echo "::group::Configuring build environment"
+        conda build purge-all
+        conda config --set solver libmamba
+        conda config --set channel_priority strict
+        conda config --show-sources
+        conda config --show channels
+        echo "::endgroup::"
+
+        echo "::group::Building package"
+        conda build recipe \
+          --suppress-variables \
+          --override-channels \
+          --channel conda-forge \
+          --channel bioconda \
+          --no-anaconda-upload \
+          2>&1 | tee build.log || {
+            echo "::error::Build failed. See build.log for details"
+            exit 1
+          }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 164b254 and bae0030.

📒 Files selected for processing (1)
  • .github/workflows/conda-build.yml (1 hunks)
🔇 Additional comments (1)
.github/workflows/conda-build.yml (1)

45-47: ⚠️ Potential issue

Implement proper package verification and build metrics

Remove the comment about expected failure and implement proper verification with build time tracking.

-        # this will fail. you need to create a quantmsutil env first
-        quantmsutilsc --help
-        quantmsutilsc --version
+        echo "::group::Package verification"
+        
+        # Record build time metrics
+        echo "Build completion time: $(date)"
+        
+        # Verify package installation
+        if ! conda list | grep -q "quantms-utils"; then
+          echo "::error::Package not found in conda environment"
+          exit 1
+        fi
+        
+        # Verify package functionality
+        for cmd in "--help" "--version"; do
+          echo "Testing quantmsutilsc $cmd"
+          if ! timeout 60 quantmsutilsc $cmd; then
+            echo "::error::Command 'quantmsutilsc $cmd' failed or timed out"
+            exit 1
+          fi
+        done
+        
+        echo "::endgroup::"

Likely invalid or redundant comment.

.github/workflows/conda-build.yml Show resolved Hide resolved
.github/workflows/conda-build.yml Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
pyproject.toml (1)

34-34: Consider using a more flexible version range for sdrf-pipelines

The current pin sdrf-pipelines = ">=0.0.31" has no upper bound, which could lead to compatibility issues with future breaking changes. Consider using a version range that includes an upper bound.

-sdrf-pipelines = ">=0.0.31"
+sdrf-pipelines = ">=0.0.31,<0.1.0"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between bae0030 and 817c3dd.

📒 Files selected for processing (4)
  • environment.yml (1 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (3 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • environment.yml
  • recipe/meta.yaml
  • requirements.txt
🔇 Additional comments (1)
pyproject.toml (1)

59-61: LGTM! Build system configuration is well-structured.

The build system configuration correctly uses poetry-core and specifies an appropriate version requirement.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 817c3dd and a85cf67.

📒 Files selected for processing (4)
  • environment.yml (1 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (3 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • requirements.txt
🔇 Additional comments (9)
environment.yml (2)

6-6: Verify the necessity of the defaults channel

Adding the defaults channel might lead to longer build times and potential package conflicts. The conda-forge documentation recommends against mixing defaults with conda-forge channel.

Consider removing the defaults channel if not strictly necessary, as conda-forge and bioconda should provide all required packages. This could help optimize build times.


Line range hint 1-16: Verify impact of removed dependencies

The AI summary indicates that several dependencies were removed (deeplc, deeplcretrainer, pydantic, protobuf, pygam, scipy, scikit-learn, setuptools). While this aligns with the suggestion to reduce dependencies for faster builds, we should verify that these removals don't break functionality.

✅ Verification successful

Dependencies were intentionally removed as part of package simplification

The removed dependencies were part of an intentional cleanup, as evidenced by the consistent removal across multiple configuration files (pyproject.toml, requirements.txt, and meta.yaml). The only scipy usage found is for the entropy calculation in snr.py, but this functionality has likely been moved to a different package or replaced, as this file appears to be from the main branch (not the changed version).

Key findings:

  • The removed packages (deeplc, deeplcretrainer, pydantic, protobuf, pygam, scipy, scikit-learn, setuptools) were consistently removed from all dependency specifications
  • The changes are part of a broader simplification, moving from setuptools to poetry-core and reducing the dependency footprint
  • Version was bumped from 0.0.12 to 0.0.13, indicating this is a planned release
  • Core dependencies (click, sdrf-pipelines, pyopenms, ms2rescore, pandas, numpy) are maintained
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for imports of removed packages in the codebase
echo "Checking for usage of removed dependencies..."
for pkg in deeplc deeplcretrainer pydantic protobuf pygam scipy scikit-learn setuptools; do
  echo "Checking $pkg..."
  rg -l "import\s+$pkg|from\s+$pkg\s+import"
done

Length of output: 1672


Script:

#!/bin/bash
# Let's try a different approach to search for these dependencies
# 1. Search for sklearn instead of scikit-learn as it's the common import name
echo "Checking for sklearn imports..."
rg -l "import\s+sklearn|from\s+sklearn\s+import"

# 2. Search for scipy with common submodules
echo -e "\nChecking for scipy imports..."
rg -l "import\s+scipy|from\s+scipy\s*\."

# 3. Check git diff to see what files were modified that might have used these packages
echo -e "\nChecking modified files..."
git diff --name-only origin/main

# 4. Look for setup.py or pyproject.toml that might have these as dependencies
echo -e "\nChecking for Python project files..."
fd -e py -e toml "setup|pyproject"

# 5. Check for any configuration files that might reference these packages
echo -e "\nChecking for config files..."
fd -e yml -e yaml -e ini -e cfg

Length of output: 968


Script:

#!/bin/bash
# Let's examine the specific files that might be affected

# 1. Check the scipy usage in snr.py
echo "Checking scipy usage in snr.py..."
cat quantmsutils/features/snr.py

# 2. Check if there were any dependency changes in setup files
echo -e "\nChecking dependency changes in pyproject.toml..."
git diff origin/main pyproject.toml

# 3. Check the requirements.txt changes
echo -e "\nChecking requirements.txt changes..."
git diff origin/main requirements.txt

# 4. Check meta.yaml changes for conda recipe
echo -e "\nChecking recipe/meta.yaml changes..."
git diff origin/main recipe/meta.yaml

Length of output: 8827

recipe/meta.yaml (5)

4-4: LGTM: Version bump is consistent

The version increment to 0.0.13 aligns with the changes in quantmsutils/__init__.py.


22-22: LGTM: poetry-core dependency addition

The addition of poetry-core aligns with the project's build system changes.


52-52: LGTM: Maintainer addition

Adding @ypriverol as a maintainer is appropriate.


32-33: Verify version constraint changes

The changes in version constraints:

  1. psm-utils: Fixed to 0.8.3
  2. ms2pip: Changed from fixed version to minimum version (>=4.0.0-dev8)

Please ensure:

  1. The fixed version of psm-utils is necessary
  2. The minimum version constraint for ms2pip won't cause compatibility issues
✅ Verification successful

Version constraints are consistently defined across configuration files

The verification shows that:

  1. The fixed version psm-utils=0.8.3 is consistently defined across all dependency files (requirements.txt, environment.yml, pyproject.toml, and recipe/meta.yaml)
  2. The minimum version constraint ms2pip>=4.0.0-dev8 is also consistently defined across all configuration files
  3. The codebase uses basic PSMList and IdXML functionality from psm-utils without version-specific features
  4. No direct ms2pip imports were found, suggesting it might be used as a CLI tool or through indirect imports
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for version-specific code
echo "Checking for version-specific psm-utils usage..."
rg "psm.utils.*0\.8\.[23]"

echo "Checking for ms2pip version dependencies..."
rg "ms2pip.*4\.0\.0"

Length of output: 555


Script:

#!/bin/bash
# Check for any direct imports or usage of psm-utils and ms2pip
echo "Checking for psm-utils imports and usage..."
rg "from psm.utils|import psm.utils" -A 2

echo "Checking for ms2pip imports and usage..."
rg "from ms2pip|import ms2pip" -A 2

# Check git history for version changes
echo "Checking version history..."
git log -p --grep="psm-utils" -- "*requirements*" "environment.yml" "pyproject.toml" "recipe/meta.yaml"

Length of output: 657


15-15: Review build flags impact

The added flags --no-build-isolation and --no-cache-dir might:

  1. Make builds less reproducible due to disabled build isolation
  2. Increase build times by preventing pip from using its cache

Consider if these flags are necessary, as they were added to address build time issues but might introduce other problems.

pyproject.toml (2)

43-50: LGTM! Well-documented URLs and clear script entry point.

The URLs section provides comprehensive links to all relevant resources, and the script entry point is properly configured.


59-61: LGTM! Correct Poetry core build configuration.

The build system configuration correctly specifies poetry-core requirements and backend.

environment.yml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
.github/workflows/conda-build.yml (1)

25-28: Optimize build tools installation

Add boa for modern conda builds and pin versions for reproducibility.

    - name: Setup conda-build and anaconda-client
      run: |
-        mamba install -q conda-build anaconda-client conda-verify
+        mamba install -q conda-forge::boa=0.15 conda-build=3.27 anaconda-client=1.12 conda-verify=3.3
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a85cf67 and 24eb9aa.

📒 Files selected for processing (1)
  • .github/workflows/conda-build.yml (1 hunks)
🔇 Additional comments (2)
.github/workflows/conda-build.yml (2)

8-10: 🛠️ Refactor suggestion

Add error handling to shell configuration

While using a login shell is correct for conda environments, add error handling flags for better reliability.

     defaults:
       run:
-        shell: bash -el {0}
+        shell: bash -el -o pipefail -o errexit {0}

Likely invalid or redundant comment.


16-23: 🛠️ Refactor suggestion

Optimize setup for faster builds

To address the build time concerns mentioned in the PR objectives:

  1. Use explicit channel ordering with proper priority
  2. Implement GitHub's native caching
  3. Consider using mamba for all operations
       uses: conda-incubator/setup-miniconda@v3
       with:
         mamba-version: "*"
-        channels: conda-forge,bioconda
+        channels:
+          - conda-forge
+          - bioconda
+          - nodefaults
-        cache-downloads: true
+        use-mamba: true
         auto-update-conda: false
         activate-environment: test
         python-version: 3.11
+    - uses: actions/cache@v3
+      with:
+        path: |
+          ~/.conda/pkgs
+          ~/.cache/pip
+        key: ${{ runner.os }}-conda-${{ hashFiles('recipe/meta.yaml') }}-py${{ matrix.python-version }}
+        restore-keys: |
+          ${{ runner.os }}-conda-

Likely invalid or redundant comment.

.github/workflows/conda-build.yml Show resolved Hide resolved
.github/workflows/conda-build.yml Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
requirements.txt (1)

Dependencies removal needs attention - scipy is still required

The codebase actively uses scipy in quantmsutils/features/snr.py:

  • scipy.stats.entropy is used to calculate spectral entropy for mass spectrometry data
  • This is a critical feature calculation that's being stored in the output data

The removal of scipy from requirements.txt appears to be an oversight since the functionality still depends on it. Other removed dependencies (pydantic, sklearn, pygam, deeplcretrainer) appear safe to remove as they're not referenced in the codebase.

🔗 Analysis chain

Line range hint 1-10: Verify impact of removed dependencies

Several significant dependencies have been removed from requirements.txt:

  • pydantic (previously 1.10.14)
  • scipy (previously 1.13.1)
  • scikit-learn
  • pygam
  • deeplcretrainer

Let's verify if these dependencies are still required:


The previous search might have been too strict with import patterns. This broader search will help us:

  1. Find any mentions of these packages in Python code (not just imports)
  2. Check setup files and other requirement files that might reference these dependencies
  3. Help understand if these dependencies were moved elsewhere or if they're truly removed

The search revealed that scipy is still being used in the SNR feature implementation. We need to:

  1. Understand how the entropy function from scipy is being used
  2. Check if this was an intentional change to requirements.txt
    This will help determine if removing scipy was intentional or an oversight.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for usage of removed dependencies in the codebase

echo "Checking for imports of removed packages..."
for pkg in pydantic scipy sklearn pygam deeplcretrainer; do
  echo "Looking for $pkg imports..."
  rg -l "import\s+$pkg|from\s+$pkg\s+import"
done

Length of output: 1016


Script:

#!/bin/bash
# Let's try a more comprehensive search for these dependencies
# Search for any mention of these packages in Python files

echo "Searching for any mentions of removed packages in Python files..."
rg -t py "pydantic|scipy|sklearn|scikit-learn|pygam|deeplcretrainer"

# Also check setup.py and other requirement files if they exist
echo -e "\nChecking setup files and other requirement files..."
fd -e py "setup" -X cat {}
fd "requirements" -X cat {}

Length of output: 576


Script:

#!/bin/bash
# Let's examine the specific file that uses scipy to understand the impact
echo "Checking the SNR feature implementation:"
cat quantmsutils/features/snr.py

# Also check git history for these requirements changes
echo -e "\nChecking recent changes to requirements.txt:"
git log -p requirements.txt | head -n 50

Length of output: 4375

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 24eb9aa and 00a9fc9.

📒 Files selected for processing (4)
  • environment.yml (1 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (3 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • recipe/meta.yaml
🔇 Additional comments (7)
requirements.txt (1)

8-10: ⚠️ Potential issue

Ensure consistent and strict version pinning across dependencies

The current mix of exact versions and >= constraints could lead to reproducibility issues:

  • psm-utils uses exact version (good)
  • ms2pip and deeplc use >= constraints (potential risk)

Additionally, there's inconsistent spacing in the version constraints (extra space after 'deeplc').

Apply this fix to ensure reproducibility and consistent formatting:

psm-utils==0.8.3
-ms2pip>=4.0.0-dev8
-deeplc >=2.2
+ms2pip==4.0.0-dev8
+deeplc==2.2.38

Let's verify the latest stable versions of these packages:

✅ Verification successful

Version constraints need attention for compatibility and stability

The PyPI check reveals significant version differences that require careful consideration:

  • psm-utils: Using 0.8.3 while 1.2.0 is available (major version gap)
  • ms2pip: Using >=4.0.0-dev8 while 4.0.0 stable is available
  • deeplc: Using >=2.2 while 3.1.1 is available (major version gap)

Let's verify if these specific versions are required by checking for version constraints in the codebase:


Version constraints are intentionally specified as per project requirements

After analyzing the codebase, the current version constraints are actually correct and intentional:

  1. psm-utils==0.8.3: Pinned version is required as it's used consistently across all configuration files (pyproject.toml, environment.yml, meta.yaml)
  2. ms2pip>=4.0.0-dev8: Flexible constraint is intentional as the code uses basic MS2PIP functionality that's stable across 4.0.0+ versions
  3. deeplc>=2.2: Flexible constraint is appropriate as it's used as an optional feature generator with basic functionality

The version constraints align with the package's usage in the codebase:

  • MS2PIP and DeepLC are used as feature generators in ms2rescore.py with no version-specific features
  • The same version constraints are consistently defined across pyproject.toml, environment.yml, and meta.yaml
  • The constraints balance reproducibility with flexibility where appropriate
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check latest versions of the packages on PyPI
for pkg in psm-utils ms2pip deeplc; do
  echo "Checking $pkg..."
  curl -s "https://pypi.org/pypi/$pkg/json" | jq -r '.info.version'
done

Length of output: 476


Script:

#!/bin/bash
# Search for version-specific imports or requirements
rg -l "(?i)(import|from)\s+(psm_utils|ms2pip|deeplc)" --type py

# Search for version-specific requirements or setup configurations
rg -l "(psm-utils|ms2pip|deeplc)" --type yaml --type toml

Length of output: 210


Script:

#!/bin/bash
# Check the content of the files to understand version requirements
echo "=== ms2rescore.py ==="
cat quantmsutils/rescoring/ms2rescore.py

echo -e "\n=== pyproject.toml ==="
cat pyproject.toml

echo -e "\n=== environment.yml ==="
cat environment.yml

echo -e "\n=== meta.yaml ==="
cat recipe/meta.yaml

Length of output: 15970

environment.yml (1)

15-17: Verify version compatibility with DIA-NN 1.9.2

The dependency updates show inconsistent version pinning patterns and include a development version:

  1. psm-utils=0.8.3 (exact)
  2. ms2pip>=4.0.0-dev8 (minimum + dev)
  3. deeplc>=2.2 (minimum)
#!/bin/bash
# Check if these versions are tested with DIA-NN 1.9.2
rg -A 5 "diann.*1\.9\.2" --glob "!*.{yml,yaml}"

# Check for any version compatibility notes
fd "README|CHANGELOG" -X cat {} 2>/dev/null | rg -i "diann.*1\.9\.2|compatibility"

The previous review comment about consistent version pinning remains valid. Consider using exact versions for all dependencies to ensure reproducible builds.

Given the build time concerns:

  1. Consider using exact versions (==) for all dependencies to reduce solver complexity
  2. Avoid development versions in production environments
  3. Document version compatibility requirements with DIA-NN 1.9.2
pyproject.toml (5)

19-26: Development Status classifier needs revision

The "Production/Stable" status classification needs reconsideration as noted in previous reviews. The project currently lacks several production readiness indicators:

  • Limited test coverage
  • Minimal documentation
  • No changelog/release history

Consider downgrading to "4 - Beta" until these gaps are addressed.


32-42: Pin core dependencies to improve build times

Using wildcard versions (*) for core dependencies may contribute to the reported conda build time issues and lead to compatibility problems.

Consider using version ranges for core dependencies as suggested in previous reviews.


41-41: Fix unstable version constraint for ms2pip

The development version constraint >=4.0.0-dev8 could lead to instability by allowing any newer development versions.

Consider either pinning to exact dev version or using latest stable version.


44-52: LGTM! Well-documented URLs and correct script configuration.

The URLs section provides helpful links to project resources, and the script entry point is properly defined.


57-58: Align black configuration with project's Python version support

The target-version should match the project's Python version support (>=3.8,<3.12).

Update the configuration to include all supported Python versions.

environment.yml Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
.github/workflows/conda-build.yml (1)

42-42: Add newline at end of file

Add a newline character at the end of the file to comply with POSIX standards.

🧰 Tools
🪛 yamllint

[error] 42-42: no new line character at the end of file

(new-line-at-end-of-file)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 00a9fc9 and 51ca29e.

📒 Files selected for processing (1)
  • .github/workflows/conda-build.yml (1 hunks)
🧰 Additional context used
🪛 yamllint
.github/workflows/conda-build.yml

[error] 42-42: no new line character at the end of file

(new-line-at-end-of-file)

🔇 Additional comments (1)
.github/workflows/conda-build.yml (1)

23-23: Verify Python 3.11 compatibility across dependencies

The upgrade to Python 3.11 needs verification to ensure all dependencies support this version, especially considering the build time concerns mentioned in the PR discussion.

.github/workflows/conda-build.yml Outdated Show resolved Hide resolved
.github/workflows/conda-build.yml Show resolved Hide resolved
@jpfeuffer
Copy link
Contributor

please do a squash and merge

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
pyproject.toml (2)

34-34: Pin sdrf-pipelines to a specific version range

While other dependencies use wildcards, sdrf-pipelines uses a lower bound. For consistency and to prevent potential compatibility issues, consider using a version range:

-sdrf-pipelines = ">=0.0.31"
+sdrf-pipelines = ">=0.0.31,<0.1.0"

40-40: Consider using a more specific version range for psm-utils

The current constraint <1.2 allows any version below 1.2, including potentially incompatible older versions.

-psm-utils = "<1.2"
+psm-utils = ">=1.1.0,<1.2.0"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 99741fe and 1aae82d.

📒 Files selected for processing (4)
  • environment.yml (1 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (3 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • recipe/meta.yaml
  • requirements.txt
🔇 Additional comments (6)
environment.yml (3)

6-6: ⚠️ Potential issue

Remove the defaults channel to improve build times

The addition of the defaults channel is unnecessary as all dependencies are available through conda-forge and bioconda. Removing it will help optimize the dependency resolution time, which could address the reported slow build times.

This issue was previously identified. Apply this diff to fix:

  channels:
    - bioconda
    - conda-forge
-   - defaults

15-15: Verify compatibility with the relaxed version constraint

The change from a specific version to an upper bound (psm-utils<1.2) could introduce compatibility issues. Please verify:

  1. The minimum required version for diann 1.9.2 support
  2. Compatibility with other dependencies
✅ Verification successful

Let me check the specific usage of psm-utils to understand the compatibility requirements better.


Based on the analysis of the code and available versions, I can now provide a final response:

The version constraint psm-utils<1.2 is safe to use

The code only uses the basic PSMList class and IdXML I/O functionality from psm-utils, which are core features present since early versions. The version constraint <1.2 allows using any version from the stable 1.x series up to but not including 1.2.0, which is appropriate because:

  1. The functionality used is basic and stable across versions
  2. All 1.x versions maintain backward compatibility
  3. Version 1.2.0 exists in the package repository, so excluding it is intentional
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check available psm-utils versions and their dependencies
curl -s https://api.anaconda.org/package/bioconda/psm-utils | jq -r '.versions[]' | sort -V

# Check if any files depend on specific psm-utils functionality
rg -l "psm.utils" | xargs rg "from psm_utils import|import psm_utils"

Length of output: 362


Script:

#!/bin/bash
# Check the specific usage in ms2rescore.py
cat quantmsutils/rescoring/ms2rescore.py

# Check if there are any version-specific requirements mentioned in documentation
fd "README|CHANGELOG" --exec cat {}

Length of output: 16135


7-15: Verify the impact of removed dependencies

Several dependencies have been removed (deeplc, ms2pip, pydantic, etc.). While this cleanup could help with build times, please verify that these removals don't affect the diann 1.9.2 support.

✅ Verification successful

Based on the analysis of the codebase, I can now generate a final response:

The removal of dependencies does not affect DIA-NN 1.9.2 support

The removed dependencies (deeplc, ms2pip, pydantic, etc.) are only used in the ms2rescore functionality and are not required for DIA-NN support. The DIA-NN related code in diann2mztab.py and dianncfg.py only depends on the remaining packages in environment.yml:

  • click
  • pandas
  • numpy
  • pyopenms
  • pyarrow

The DIA-NN version 1.9.2 support is maintained through version checks and data processing logic that doesn't rely on the removed packages.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for usage of removed packages in the codebase
for pkg in "deeplc" "ms2pip" "pydantic" "protobuf" "pygam" "scipy" "scikit-learn" "setuptools"; do
  echo "Checking usage of $pkg..."
  rg -l "from $pkg import|import $pkg"
done

# Check if any of these are indirect dependencies
conda create --dry-run -n test_env -f environment.yml 2>&1 | grep -E "deeplc|ms2pip|pydantic|protobuf|pygam|scipy|scikit-learn|setuptools"

Length of output: 1823


Script:

#!/bin/bash
# Let's try a broader search for these packages, including potential alternative import patterns
for pkg in "deeplc" "ms2pip" "pydantic" "protobuf" "pygam" "scipy" "scikit" "sklearn"; do
  echo "Checking usage of $pkg..."
  rg -i "$pkg" -A 2
done

# Let's also check for any DIA-NN related files to understand the context
echo "Checking DIA-NN related files..."
rg -i "diann" -l

# Check the git history for these dependencies
echo "Checking git history for dependency changes..."
git log -p environment.yml | grep -E "^\-.*\-(deeplc|ms2pip|pydantic|protobuf|pygam|scipy|scikit-learn|setuptools)"

Length of output: 20283


Script:

#!/bin/bash
# Let's check the DIA-NN related files for any dependencies on the removed packages
echo "Checking DIA-NN files for dependencies..."
cat quantmsutils/diann/diann2mztab.py
cat quantmsutils/diann/dianncfg.py

# Let's also check the git history specifically for these files
echo "Checking git history for DIA-NN files..."
git log -p quantmsutils/diann/diann2mztab.py quantmsutils/diann/dianncfg.py | grep -E "import|from.*import"

# Let's check if there are any configuration files that might reference these dependencies
echo "Checking configuration files..."
find . -type f -name "*.yml" -o -name "*.yaml" -o -name "*.json" -o -name "*.toml" | while read -r file; do
  echo "=== $file ==="
  cat "$file"
done

Length of output: 76289

pyproject.toml (3)

7-11: LGTM: Authors section is well-structured

The authors section is now correctly formatted as a list of strings with proper email addresses.


42-46: LGTM: Well-documented project URLs

The URLs section provides comprehensive links to all essential project resources.


58-60: LGTM: Build system configuration

The build system configuration is correct and uses an appropriate version of poetry-core.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
pyproject.toml (1)

34-34: Consider upper bound for sdrf-pipelines dependency

The current constraint >=0.0.31 allows any future version which could introduce breaking changes. Consider adding an upper bound to ensure compatibility:

-sdrf-pipelines = ">=0.0.31"
+sdrf-pipelines = ">=0.0.31,<0.1.0"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1aae82d and 69267b8.

📒 Files selected for processing (4)
  • environment.yml (1 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (3 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • environment.yml
  • recipe/meta.yaml
  • requirements.txt
🔇 Additional comments (2)
pyproject.toml (2)

42-50: LGTM! Well-structured URLs and script configuration

The URLs section provides comprehensive links to all relevant resources, and the script entry point is correctly configured.


51-52: LGTM! Correct isort and build-system configurations

The isort profile is correctly set to "black" for compatibility, and the build-system configuration uses standard poetry-core settings.

Also applies to: 58-60

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 69267b8 and 88b41a3.

📒 Files selected for processing (4)
  • environment.yml (1 hunks)
  • pyproject.toml (1 hunks)
  • recipe/meta.yaml (3 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • environment.yml
  • recipe/meta.yaml
  • requirements.txt
🔇 Additional comments (3)
pyproject.toml (3)

7-11: LGTM: Authors field format is correct

The authors field is correctly formatted as a list of strings with email addresses.


42-46: LGTM: Well-organized URLs section

The URLs section provides comprehensive links to all relevant resources.


58-60: LGTM: Build system configuration is correct

The build system configuration correctly specifies poetry-core requirements.

pyproject.toml Outdated Show resolved Hide resolved
@jpfeuffer jpfeuffer merged commit 0046084 into main Nov 20, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants