Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2725 Add CellMark CLM-CL component #2772

Merged
merged 5 commits into from
Nov 28, 2024
Merged

2725 Add CellMark CLM-CL component #2772

merged 5 commits into from
Nov 28, 2024

Conversation

gouttegd
Copy link
Collaborator

This PR supersedes #2763. It adds the CLM-CL component, but does so while avoiding to overwrite CL’s customised QC workflow.

CL has a customised QC workflow, in which we additionally check for
taxon constraint violations. Whenever the repository is updated (e.g. to
reflect changes in cl-odk.yml after adding a new component), our QC
workflow file should _not_ be overwritten by the ODK-generated one.

When upgrading to a newer major version of the ODK, if the never version
brings changes to the standard QC workflow, it will be up to a CL
"pipeline engineer" to port those changes if necessary.
We add a new component obtained from the Cell Markers Ontology
(https://github.com/Cellular-Semantics/CellMark).

For now, it is refreshed unconditionally as part of all workflows, as
for the other similar components (the HRA subset and the CellxGene
subset). Later we will need to uncouple the refresh from at least the QC
workflow.
@gouttegd
Copy link
Collaborator Author

Draft PR as in its current state, the CLM-CL component would cause a build failure because of duplicated labels in CLM-CL (Cellular-Semantics/CellMark#9).

(This is exactly why it is important to uncouple the refreshing of external resources from the QC/release workflows: when external resources are unconditionally refreshed as part of all workflows, as is currently the case, we can end up with build failures at any time because of changes that have happened elsewhere and are totally out of CL’s control. #2644)

This commit make sure that external resources (e.g. mapping sets and
components that are maintained elsewhere, and simply used as they are in
CL) are only refreshed when the MIR Make variable is set to true.

Importantly, when running the QC workflows MIR is false, which means
that those resources will then _not_ be refreshed as part of the QC.
This is on purpose. Those external resources could at any time introduce
changes that could break CL. QC should test changes that happen entirely
within CL.

With that changes, it is up to CL's editors/maintainers to refresh the
external resources _when they want to do so_. For that, all they need to
do is to make sure that MIR is set to true. For example, this is done
automatically when calling `make refresh-imports`.

closes #2644
@gouttegd gouttegd self-assigned this Nov 18, 2024
@gouttegd gouttegd marked this pull request as ready for review November 18, 2024 15:46
@gouttegd
Copy link
Collaborator Author

PR updated to only refresh the CLM-CL components (as well as all other components) when MIR is true (e.g. when imports are updated with make refresh-imports).

@gouttegd gouttegd merged commit 1ca7174 into master Nov 28, 2024
1 check passed
@gouttegd gouttegd deleted the 2725-clm-cl-component branch November 28, 2024 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants