Train only discs landmarks #56
Conversation
Revamped the training script to better accommodate dataset-specific requirements:
- Removed custom trainer initialization as it's no longer necessary.
- Updated nnUNet configurations for different training setups.
- Added conditional logic to adjust nnUNetPlans for dataset 103.
- Ensured dataset fingerprint extraction if missing.
- Streamlined preprocessing steps and ensured compatibility with plan configurations.
- Implemented checks to avoid redundant decompression of already processed datasets.

These changes make the script more flexible and efficient, reducing unnecessary steps and focusing on the specific needs of different datasets.
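For context, a minimal sketch of what a conditional nnUNetPlans adjustment could look like; the dataset folder name and the tweaked field are placeholders, not the PR's actual changes:

```python
import json
from pathlib import Path

# Hypothetical path; nnUNetv2 keeps plans under $nnUNet_preprocessed/<dataset>/.
plans_path = Path("nnUNet_preprocessed/Dataset103_Example/nnUNetPlans.json")

plans = json.loads(plans_path.read_text())

# Illustrative tweak only -- the thread does not show which fields are adjusted.
plans["configurations"]["3d_fullres"]["batch_size"] = 2

plans_path.write_text(json.dumps(plans, indent=4))
```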
Introduce parameter overrides for nnUNetTrainer, nnUNetPlanner, and nnUNetPlans to provide more flexibility in script configuration. Improved echo statements for clarity and corrected preprocessing checks to accurately prevent redundant operations.
Pinning nnunetv2 to version 2.4.2 due to compatibility issues with nnUNetTrainer_DASegOrd0_NoMirroring in version 2.5.1. Disabled `--verify_dataset_integrity` in the training script as it is not functioning correctly in version 2.4.2.
Adjusted vertebrae numbering and label mappings to ensure consistency across the dataset and facilitate better alignment during the segmentation process. This change also simplifies the regions_class_order, removing unused labels. These updates should improve the accuracy of segmentation in the pipeline.
Introduce a '--tries' argument to specify the number of retry attempts when copying files, addressing potential issues with unreliable file system operations. The default is set to 10 attempts. This change enhances robustness by managing transient errors and ensuring more reliable file transfers.
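A minimal sketch of the retry idea, assuming a helper named `copy_with_retries` (the name and the fixed backoff are illustrative; the actual diff excerpt appears later in the thread):

```python
import shutil
import time

def copy_with_retries(src_path, dst_path, num_tries=10, delay=1.0):
    """Retry the copy to ride out transient file system errors (e.g. on network drives)."""
    for attempt in range(num_tries):
        try:
            shutil.copy2(src_path, dst_path)
            return
        except OSError:
            if attempt == num_tries - 1:
                raise  # give up after the last attempt
            time.sleep(delay)
```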
Corrected several typos in argument help descriptions to improve clarity and professionalism. Enhanced file copy logic by checking if the destination file exists before accessing its size, preventing potential errors and improving reliability.
…0921 Aligned dataset paths and version numbers in pyproject.toml and README.md to new release. Refactored scripts and JSON mappings for label consistency and clarity. Removed obsolete nnUNetTrainer additions to streamline inference. Updated augmentation examples and usage in scripts to reflect new labeling scheme. These adjustments enhance maintainability and ensure correct data processing workflows.
It should be done manually for nnUNetPlannerResEncL
Introduced a new argument for providing a mapping of input labels to output segmentation labels. This facilitates custom label conversions, such as mapping the input C1 label to a specified output label. Updated related functions and modules to handle and apply this mapping throughout the segmentation process. This enhancement increases flexibility and accuracy in label assignments.
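A hedged sketch of how such a mapping could be applied to a segmentation array (the helper name and the example label values are illustrative, not the script's actual API):

```python
import numpy as np

def apply_label_mapping(seg: np.ndarray, mapping: dict) -> np.ndarray:
    """Map selected input labels to output labels, leaving other voxels unchanged."""
    out = seg.copy()
    for src, dst in mapping.items():
        out[seg == src] = dst
    return out

# Example: map an input C1 label (assumed here to be 11) to output label 2.
seg = np.array([[0, 11], [11, 3]])
print(apply_label_mapping(seg, {11: 2}))
```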
Enhanced landmark mapping logic to ensure landmarks are only mapped if they cover the majority of the associated voxels. This adjustment prevents improper landmark assignments and improves segmentation accuracy.
Added a filter to verify landmarks in segmentation data, ensuring accurate landmark mapping. This improves the reliability of landmark detection, addressing an issue where incorrect mapping could occur if landmarks were not properly isolated.
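One plausible reading of the majority check, as a sketch (names assumed): a landmark is accepted only if more than half of its voxels fall inside the structure it is being mapped to:

```python
import numpy as np

def landmark_covers_majority(landmark_mask: np.ndarray, structure_mask: np.ndarray) -> bool:
    """Accept the landmark only if most of its voxels overlap the associated structure."""
    total = np.count_nonzero(landmark_mask)
    overlap = np.count_nonzero(landmark_mask & structure_mask)
    return total > 0 and overlap / total > 0.5
```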
Enhanced the `_get_canal_centerline_indices` function to raise a `ValueError` when no canal labels are found in the segmentation data. This ensures that subsequent operations do not proceed with invalid or incomplete data, improving overall robustness and reliability.
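A sketch of the guard described (the canal label values and axis convention are assumptions; only the function name comes from the commit):

```python
import numpy as np

CANAL_LABELS = (1, 2)  # assumed canal label values

def _get_canal_centerline_indices(seg: np.ndarray, canal_labels=CANAL_LABELS):
    """Return the superior-inferior indices containing canal voxels."""
    canal_mask = np.isin(seg, canal_labels)
    if not canal_mask.any():
        raise ValueError("No canal labels found in the segmentation.")
    return np.unique(np.nonzero(canal_mask)[2])  # assumes axis 2 is superior-inferior
```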
Enhanced the segmentation to image transformation utility by introducing an interpolation option. This allows using 'nearest', 'linear', or 'label' interpolation methods, with the chosen method influencing the resampling process. Specifically, the 'label' option includes padding and center of mass calculations to preserve label integrity during transformation. This flexibility improves accuracy and application suitability for different types of segmentation data processing.
Refactor padding and dilation calculations to better account for voxel size discrepancies between image and segmentation. This ensures labels at the edge are appropriately managed and reduces interpolation artifacts during the transformation process. Fixes potential misalignment issues and improves accuracy.
Simplified and unified calculation of `pad_width` and `dilation_size` by converting them directly to integers. This reduces type casting and potential errors. Addresses readability and maintains functional integrity.
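For illustration, a sketch of deriving integer `pad_width` and `dilation_size` from the voxel-size ratio; the ceil-based formula is an assumption, since the exact calculation is not shown in this thread:

```python
import numpy as np

def pad_and_dilation_sizes(img_spacing, seg_spacing):
    """Scale padding/dilation by how much coarser the image grid is than the segmentation grid."""
    ratio = np.asarray(img_spacing, dtype=float) / np.asarray(seg_spacing, dtype=float)
    pad_width = int(np.ceil(ratio.max()))
    dilation_size = int(np.ceil(ratio.max()))
    return pad_width, dilation_size

print(pad_and_dilation_sizes([1.0, 1.0, 3.0], [0.8, 0.8, 0.8]))  # -> (4, 4)
```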
Adjusted description of segmentation steps to clarify class breakdown and process. Added details on manual label generation and usage of additional public datasets for sacrum segmentation. Updated illustration and dataset references for accuracy.
Updated logic to correctly handle labels near padding boundaries in segmentation images. Previously, labels were derived from the original segmentation data, missing those added during padding. Now, labels are correctly identified from the padded segmentation data, and additional steps ensure labels extending beyond padding are mapped correctly. This resolves inaccuracies in center of mass calculations for boundary labels.
Simplified the logic for setting labels at center of mass by:
- Rounding and clipping indices directly within the loop.
- Eliminating redundant steps for handling labels outside of padding.
- Streamlining padding removal.

This improves readability and maintainability while ensuring label indices stay within valid bounds.
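A sketch of the simplified loop (a minimal illustration assuming a padded label array and a target shape; the names are not from the actual code):

```python
import numpy as np
from scipy import ndimage

def labels_at_center_of_mass(seg_pad: np.ndarray, shape: tuple) -> np.ndarray:
    """Place each label at its center of mass, rounded and clipped to valid indices."""
    out = np.zeros(shape, dtype=seg_pad.dtype)
    upper = np.array(shape) - 1
    for label in np.unique(seg_pad)[1:]:  # skip background (0)
        com = ndimage.center_of_mass(seg_pad == label)
        idx = np.clip(np.round(com).astype(int), 0, upper)  # round and clip in the loop
        out[tuple(idx)] = label
    return out
```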
Simplified comprehensions for flattening nested lists used in label extraction across multiple scripts. Addressed minor typos and improved documentation for `iterative_label.py` to clarify the matching process of vertebrae and discs using localizer segmentations. This increases code readability and maintainability.
Introduced a new argument `--region-default-sizes` to specify default disc/vertebrae counts for various spine regions. This addition enhances configurability, allowing users to define both maximum and default sizes separately. Updated relevant function signatures and documentation to incorporate this new parameter. This change addresses the need for finer control over segmentation parameters.
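A sketch of how such an argument might be declared (the flag name is from the commit; the region names, counts, and `nargs` layout are assumptions):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--region-default-sizes", type=int, nargs=3,
    default=[7, 12, 5], metavar=("CERVICAL", "THORACIC", "LUMBAR"),
    help="Default number of discs/vertebrae per spine region.",
)
args = parser.parse_args([])  # use the defaults for this example
print(dict(zip(("cervical", "thoracic", "lumbar"), args.region_default_sizes)))
```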
I personally think that we should not expect the user to know whether they're using the right models. But these recent changes are also really particular because we are changing the outputs of the networks, which should not happen too frequently. However, I still think that we should point to a specific release even inside the main code. This is already done when we download the weights, so let's do what I suggested:
In the pyproject.toml:
Concerning the old weights, I think that we can keep them.
Yes!! You're definitely right! We should put the release name in the file name.
Why can't we keep the release inside the name of the folder after extraction? Also, we should not remove anything and keep all the previous weights.
Update --> I removed the weights from the folder and got the same bad results.
Did you run
Updated the dataset URLs to include version-specific filenames for step1 and step2 datasets. This change enhances clarity and ensures that users download the correct dataset versions, reducing potential confusion with future releases.
Refactored logic to download pretrained model by fetching the download URL from package metadata in `pyproject.toml` instead of relying on local zip files. This ensures more reliable and up-to-date access to pretrained models, improving the robustness of the model installation process. If the zip file is not present locally, it is now downloaded directly using the metadata URL. Resolves potential issues where pretrained models might be missing.
Enhanced the retrieval of dataset URLs by parsing `pyproject.toml` directly instead of using the `importlib.metadata` package, improving flexibility. Simplified installation steps for pretrained models by combining URL and dataset handling to reduce redundancy. These changes aim to make the code more maintainable and user-friendly, particularly when managing multiple datasets or custom URLs.
Corrected the directory check logic to properly validate the presence of both step1 and step2 models. This ensures that the inference will correctly switch to the release subfolder if either model is missing. This addresses initialization issues where the inference might wrongly assume models are available.
Replaced manual parsing of 'pyproject.toml' with 'importlib.metadata' to streamline and improve URL extraction for datasets. This reduces complexity and eliminates the need for regular expression parsing, leading to more maintainable code.
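A sketch of URL extraction via `importlib.metadata`, assuming the dataset URLs are declared under `[project.urls]` in `pyproject.toml` (which surfaces as `Project-URL` metadata); the package and key names are illustrative:

```python
from importlib.metadata import metadata

meta = metadata("totalspineseg")  # assumed package name
# Each Project-URL entry is a "label, url" string.
urls = dict(entry.split(", ", 1) for entry in (meta.get_all("Project-URL") or []))
step1_url = urls.get("step1")  # assumed key
```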
Cleaned up the `inference.py` module by removing an unused import to improve code readability and maintainability. Reduces potential confusion and slight overhead associated with unnecessary imports. No functional changes introduced.
Added a note in the README advising users to reinstall the package after pulling new versions from GitHub. This ensures that the latest updates are applied correctly and helps avoid potential issues from outdated dependencies.
```python
# Try copying up to num_tries times
for attempt in range(num_tries):
    shutil.copy2(src_path, dst_path)
```
If it doesn't work the first time, is it possible that it will work after?
Yes, on network drives like Compute Canada.
Co-authored-by: Nathan Molinier <nathan.molinier@gmail.com>
Removed the merging of vertebrae labels when no disc is found in between, and implemented new logic to correctly handle cases where two discs with no vertebra between them are found, or where the last vertebra's disc is missing. This ensures more accurate label assignments in these scenarios, particularly where only spinous processes are segmented.
…alspineseg into train-only-discs-landmarks
The problem is related to the way the algorithm sorts the labels: it projects the center of mass of each label onto the canal centerline, then sorts by the z index. While this strategy usually works, sometimes, like in this case, it does not. The ultimate way to solve this is to straighten the segmentation along the canal, like SCT does for the cord, and then project the centers of mass and sort along the z axis.
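A hedged sketch of the current sort-by-projection step described above (array layout, axis convention, and names are assumptions; the real implementation may differ):

```python
import numpy as np
from scipy import ndimage

def sort_labels_along_canal(seg: np.ndarray, centerline: np.ndarray) -> list:
    """Sort labels by the z index of the centerline point closest to each label's center of mass.

    `centerline` is an (N, 3) array of voxel coordinates along the canal.
    """
    order = []
    for label in np.unique(seg)[1:]:  # skip background
        com = np.array(ndimage.center_of_mass(seg == label))
        nearest = centerline[np.argmin(np.linalg.norm(centerline - com, axis=1))]
        order.append((nearest[2], label))  # assumes axis 2 is superior-inferior
    return [label for _, label in sorted(order)]
```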
Seems to be working fine for the rest. The labeling issue has to be fixed, see #67 (comment).
Enhanced logic for correcting labeling errors where discs and vertebrae are misaligned. Added checks and swaps for scenarios where discs are at the top or bottom of the label sequence with adjacent vertebrae. This improves segmentation accuracy, especially in edge cases with partially segmented vertebrae.
Refreshed URLs for the 'Localizer' and 'Model Output Preview' images in the README to provide clearer visual examples. This ensures that users see up-to-date and accurate representations, improving the documentation's utility.
To fix errors in the vertebrae segmentation, we train the model using only disc landmarks:
We also update the model description with this illustration:
Related to #50