Switching mappings from cross-reference annotations to SSSOM #3004

Open
gouttegd opened this issue Aug 7, 2023 · 6 comments

gouttegd commented Aug 7, 2023

Currently, mappings between Uberon (and CL) taxon-neutral terms and the corresponding terms from foreign taxon-specific ontologies are maintained as cross-reference annotations (formally, oboInOwl:hasDbXref) on the Uberon (and CL) terms, as in this example:

[Term]
id: UBERON:0000020
name: sense organ
[...]
xref: AEO:0000094
xref: BSA:0000121
xref: BTO:0000202
xref: CALOHA:TS-2043
xref: EHDAA2:0001824
xref: EHDAA:500
xref: EMAPA:35955
[...]

We (the tech support group) would like to switch to maintaining these mappings in an external file in the SSSOM TSV format. This would bring several benefits. Among other things, it would allow us to:

  • use a precise mapping predicate that can finely indicate the relation between the two mapped terms (whereas cross-references can only tell that there is a mapping, the meaning of which has to be inferred);
  • provide many bits of potentially useful information about each mapping itself (e.g. who asserted it, on what basis, with which confidence, etc.);
  • modernise the bridge generation workflow;
  • publish the mappings as a kind of “first-class release product”, instead of something that consumers need to forcibly extract from the ontology.

One foreseeable downside is that the mappings would no longer be editable directly from within Protégé, and there is (at least for now) no specialised tool for editing a SSSOM file. But the SSSOM TSV format has been expressly designed to be easily editable with standard spreadsheet software or even a decent text editor, so it shouldn’t be too much of a hassle.

If Uberon (and CL) editors agree with this change, the plan is to:

  1. Extract all existing cross-references to foreign ontologies from the Uberon-edit file.
  2. Generate an initial SSSOM TSV file from those references.
  3. Delete the cross-references from the edit file.
  4. Generate a small component from the SSSOM TSV file that will contain “old-style” cross-references, to be imported back into the -edit file.

Step 4) is intended both for backwards compatibility (in case some applications downstream of Uberon are dependent on those cross-references and are not ready to consume SSSOM files yet) and for the convenience of editors (it will make the mappings “visible” – though not editable – from within Protégé, so that editors can know which foreign term a given term is mapped to without having to look it up in the external SSSOM file).
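Steps 1 and 2 could look roughly like the following. This is only a minimal sketch, not the actual pipeline: the function names `extract_xrefs` and `to_sssom_tsv` are hypothetical, the OBO parsing is deliberately naive, and the default predicate (`oboInOwl:hasDbXref`) and justification (`semapv:UnspecifiedMatching`) are placeholder assumptions, since a bare cross-reference does not say which relation was intended. A real SSSOM TSV file would normally also carry a commented metadata block, which is omitted here.

```python
import csv
import io


def extract_xrefs(obo_text):
    """Yield (term_id, xref_id) pairs from the [Term] stanzas of an OBO document."""
    term_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            term_id = None
        elif in_term and line.startswith("id: "):
            term_id = line[4:].strip()
        elif in_term and term_id and line.startswith("xref: "):
            # Keep only the CURIE; drop any description or modifier after it.
            yield term_id, line[6:].split()[0]


def to_sssom_tsv(pairs,
                 predicate="oboInOwl:hasDbXref",              # assumption
                 justification="semapv:UnspecifiedMatching"):  # assumption
    """Serialise (subject, object) pairs as a minimal SSSOM-style TSV table."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["subject_id", "predicate_id", "object_id",
                     "mapping_justification"])
    for subject, obj in pairs:
        writer.writerow([subject, predicate, obj, justification])
    return out.getvalue()
```

Calling `to_sssom_tsv(extract_xrefs(text))` on the contents of the edit file would give one row per cross-reference; choosing a more precise predicate per foreign ontology would be a separate step.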

@gouttegd gouttegd changed the title Switching mappings to from cross-reference annotations to SSSOM Switching mappings from cross-reference annotations to SSSOM Aug 7, 2023
@gouttegd gouttegd self-assigned this Aug 7, 2023

gouttegd commented Aug 7, 2023


Feedback from Uberon editors during the 21/08/2023 call:

  • No strong objection to the plan.
  • Concerns about possible confusion in editors’ minds. Thorough documentation is needed, and possibly a tutorial on “how to maintain mappings in the SSSOM format”.


gouttegd commented Aug 23, 2023

In order both to minimise the disruption for editors and to make sure the SSSOM-based bridge generation plumbing is working fine, I plan to do the transition in two phases.

Phase 1

  • Mappings remain maintained as cross-reference annotations in uberon-edit.obo (as they currently are).
  • During the release, we extract those cross-references and convert them into a SSSOM mapping set “on the fly”.
  • Similarly, we extract cross-references from some foreign ontologies (at the very least CL, plus a couple of taxon-specific ontologies for which the source of truth is the foreign ontology rather than Uberon or CL).
  • For some foreign ontologies that are already providing their mappings in the form of a SSSOM mapping set (currently, only FBbt), we simply fetch the mapping sets as they are.
  • We combine all those mapping sets (the one extracted from uberon-edit, the ones extracted from foreign ontologies, and the ones we got pre-made from foreign ontologies) into a single set from which we generate the bridge files.

In that phase, nothing changes for editors.
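To illustrate the combining step of phase 1, here is a minimal sketch. It assumes each mapping set is available as SSSOM TSV text; the helper names `read_sssom_tsv` and `merge_mapping_sets` are hypothetical, and treating two rows as duplicates when they share the same subject, predicate and object is an assumption rather than the pipeline's actual rule.

```python
import csv


def read_sssom_tsv(text):
    """Parse SSSOM TSV text into a list of dict rows, skipping '#' metadata lines."""
    lines = [line for line in text.splitlines()
             if line.strip() and not line.startswith("#")]
    return list(csv.DictReader(lines, delimiter="\t"))


def merge_mapping_sets(*mapping_sets):
    """Combine several mapping sets, keeping the first copy of each triple."""
    seen = set()
    merged = []
    for mapping_set in mapping_sets:
        for row in mapping_set:
            key = (row["subject_id"], row["predicate_id"], row["object_id"])
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged
```

Because the first copy of a triple wins, the order in which the sets are passed matters when the same mapping carries different metadata in different sources.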

Once we are satisfied with the pipeline and the new SSSOM-derived bridge files, we can switch to phase 2.

Phase 2

  • We extract the cross-references from uberon-edit.obo one last time, but this time the generated SSSOM file becomes the source of truth. It is added to the repository and becomes the file that editors need to edit in order to modify the mappings. The original cross-reference annotations in uberon-edit.obo are removed.
  • From the editable SSSOM file, we derive a small component containing cross-reference annotations that is imported back into uberon-edit.obo. This is both a convenience for editors (so that mappings are at least visible in Protégé even if they are not editable) and for backwards compatibility, in case any downstream consumer of Uberon is relying on the presence of the cross-references.
  • The rest of the bridge pipeline remains as in phase 1, the only difference being that the Uberon mapping set is no longer dynamically generated.
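The derived cross-reference component mentioned in phase 2 could be produced along these lines. Again, this is a sketch under assumptions: `xref_component` is a hypothetical helper, rows are plain dicts keyed by SSSOM column names, and the output is a bare list of OBO stanzas rather than a complete OBO file with a header.

```python
from collections import defaultdict


def xref_component(rows, subject_prefix="UBERON:"):
    """Render mapping rows as OBO [Term] stanzas containing only xref lines."""
    by_subject = defaultdict(list)
    for row in rows:
        # Ignore mappings whose subject is not in the target ontology.
        if row["subject_id"].startswith(subject_prefix):
            by_subject[row["subject_id"]].append(row["object_id"])

    stanzas = []
    for subject in sorted(by_subject):
        lines = ["[Term]", f"id: {subject}"]
        # Sort and de-duplicate so that the generated file is stable across runs.
        lines += [f"xref: {obj}" for obj in sorted(set(by_subject[subject]))]
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"
```

Note that the mapping predicate is dropped in this direction: the generated cross-references are as imprecise as the original ones, which is acceptable only because they are a read-only convenience.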


This issue has not seen any activity in the past 6 months; it will be closed automatically one year from now if no action is taken.

@github-actions github-actions bot added the Stale label Feb 21, 2024

Phase 1 was completed with #3061.

Phase 2 is still planned, but as it is not merely a technical issue (it changes the way editors will add / remove / modify mappings), it requires more thought and preparation.

At the very least, we need to:

1. carefully define which SSSOM slots editors are expected to fill;
2. write detailed documentation on how to maintain mappings in SSSOM format (which is probably something that could or maybe should be done at the mapping-commons level, since it could benefit other ontologies beyond Uberon).
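Once the expected slots are agreed on, a simple check could flag incomplete rows before they reach the pipeline. The sketch below is purely illustrative: `missing_slots` is a hypothetical helper, and `REQUIRED_SLOTS` is a placeholder listing only the slots that SSSOM itself requires for every mapping, not the list that Uberon editors will eventually be asked to fill.

```python
# Placeholder list (assumption): the real list for editors is still to be defined.
REQUIRED_SLOTS = ("subject_id", "predicate_id", "object_id",
                  "mapping_justification")


def missing_slots(row, required=REQUIRED_SLOTS):
    """Return the required slots that are absent or empty in a mapping row."""
    return [slot for slot in required if not (row.get(slot) or "").strip()]
```

A row for which `missing_slots` returns an empty list has a value in every required slot; whether those values are valid is a separate question.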

