-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Counterintuitive/wrong? offset from genomic to transcript mapping #332
Labels
Comments
@jsstevenson by the way, I am making quite a few changes in the mapper class. So not sure if you want to tackle this or I can. |
@jarbesfeld says this expected output from #352 is correct: {
"gene": "ELN",
"genomic_ac": "NC_000007.14",
"tx_ac": "NM_000501.4",
"seg_start": {
"exon_ord": 0,
"offset": 1,
"genomic_location": {
"type": "SequenceLocation",
"sequenceReference": {
"type": "SequenceReference",
"refgetAccession": "SQ.F-LrLMe1SRpfUZHkQmvkVKFEGaoDeHul"
},
"start": 74028173
}
}
} |
korikuzma
added a commit
that referenced
this issue
Aug 21, 2024
Close #345 and #332 * Update and fix bugs in`ExonGenomicCoordsMapper` * Change output for public methods in `ExonGenomicCoordsMapper` (leverage VRS Sequence Location and improve structure for transcript segment data). Renamed `warnings` to `errors`. * Resolve offset / genomic location bugs (#345 and #332) * Remove `mane_transcript` instance variable and use `mane_transcript_mappings` instead. * Refactor code that was unnecessary or extra. * Rename arguments in `genomic_to_tx_segment`: `alt_ac` -> `genomic_ac`, `genomic_start` -> ` seg_start_genomic`, `genomic_end` -> `seg_end_genomic` * pin `ga4gh.vrs` to `2.0.0a10` * Remove `get_genes_and_alt_acs` from `UtaDatabase`. Moved this to `ExonGenomicCoordsMapper` and renamed to `_get_genomic_ac_gene`. Will return single gene since genomic accessions are not needed anymore. --------- Co-authored-by: Jeremy Arbesfeld <jarbesfeld@gmail.com>
korikuzma
added a commit
that referenced
this issue
Aug 21, 2024
Close #345 and #332 * Update and fix bugs in`ExonGenomicCoordsMapper` * Change output for public methods in `ExonGenomicCoordsMapper` (leverage VRS Sequence Location and improve structure for transcript segment data). Renamed `warnings` to `errors`. * Resolve offset / genomic location bugs (#345 and #332) * Remove `mane_transcript` instance variable and use `mane_transcript_mappings` instead. * Refactor code that was unnecessary or extra. * Rename arguments in `genomic_to_tx_segment`: `alt_ac` -> `genomic_ac`, `genomic_start` -> ` seg_start_genomic`, `genomic_end` -> `seg_end_genomic` * pin `ga4gh.vrs` to `2.0.0a10` * Remove `get_genes_and_alt_acs` from `UtaDatabase`. Moved this to `ExonGenomicCoordsMapper` and renamed to `_get_genomic_ac_gene`. Will return single gene since genomic accessions are not needed anymore. --------- Co-authored-by: Jeremy Arbesfeld <jarbesfeld@gmail.com>
korikuzma
added a commit
that referenced
this issue
Aug 21, 2024
Close #345 and #332 * Update and fix bugs in`ExonGenomicCoordsMapper` * Change output for public methods in `ExonGenomicCoordsMapper` (leverage VRS Sequence Location and improve structure for transcript segment data). Renamed `warnings` to `errors`. * Resolve offset / genomic location bugs (#345 and #332) * Remove `mane_transcript` instance variable and use `mane_transcript_mappings` instead. * Refactor code that was unnecessary or extra. * Rename arguments in `genomic_to_tx_segment`: `alt_ac` -> `genomic_ac`, `genomic_start` -> ` seg_start_genomic`, `genomic_end` -> `seg_end_genomic` * pin `ga4gh.vrs` to `2.0.0a10` * Remove `get_genes_and_alt_acs` from `UtaDatabase`. Moved this to `ExonGenomicCoordsMapper` and renamed to `_get_genomic_ac_gene`. Will return single gene since genomic accessions are not needed anymore. --------- Co-authored-by: Jeremy Arbesfeld <jarbesfeld@gmail.com>
This was completed in #352 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
Mapping the beginning genomic coordinate of exon 1 on ELN on a GRCh37 accession produces a genomic data description with an exon offset of -97. This is counterintuitive (wrong even?) -- I would expect it to map to the beginning of exon 1 on a GRCh38 accession, ie it should map to exon 1 with offset 0.
Steps to reproduce
Expected behavior
Should be an exon offset of ~0
Current behavior
Gives an offset of
-97
Possible reason(s)
_set_exon_offset
looks fishy imhoSuggested fix
No response
Branch, commit, and/or version
efa56e1
Screenshots
No response
Environment details
mac
Additional details
I have yet to find another example of a gene on GRCh37 that maps to a transcript at all, so this might be specific to this gene
Contribution
Yes, I can create a PR for this fix.
The text was updated successfully, but these errors were encountered: