more docs
jsstevenson committed Aug 22, 2023
1 parent 420b94a commit b327b2c
Showing 5 changed files with 26 additions and 12 deletions.
19 changes: 19 additions & 0 deletions cool_seq_tool/app.py
@@ -249,6 +249,25 @@ async def genomic_to_transcript_exon_coordinates(
iff `transcript` is not supplied. `gene` must be supplied in order to retrieve
MANE Transcript data. Liftovers genomic coordinates to GRCh38. TODO!!
For example:
.. code-block:: python
>>> from cool_seq_tool import CoolSeqTool
>>> cst = CoolSeqTool()
>>> result = await cst.genomic_to_transcript_exon_coordinates(
... chromosome=7,
... start=140730665,
... end=140924800,
... gene="BRAF"
... )
>>> result.genomic_data.exon_start
18
>>> result.genomic_data.exon_start_offset
1
>>> result.genomic_data.transcript
'NM_004333.6'
:param chromosome: Chromosome. Must either give chromosome number (i.e. ``1``)
or accession (i.e. ``"NC_000001.11"``).
:param start: Start genomic position
4 changes: 2 additions & 2 deletions cool_seq_tool/data_sources/residue_mode.py
@@ -14,8 +14,8 @@ def get_inter_residue_pos(
"""Return inter-residue position
:param start_pos: Start position
- :param residue_mode: ``inter`-residue` if start/end are 0 based coords.
- ``residue`` if start/end are 1 based coords
+ :param residue_mode: ``"inter-residue"`` if start/end are 0 based coords.
+ ``"residue"`` if start/end are 1 based coords
:param end_pos: End position. If ``None`` assumes both ``start`` and ``end`` have
same values.
:return: Inter-residue coordinates, warning
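As a reading aid for the two ``residue_mode`` values in this docstring, here is a minimal sketch of the usual conversion from 1-based residue coordinates to 0-based inter-residue coordinates. The helper name `to_inter_residue` is hypothetical and is not the library's `get_inter_residue_pos`. It raises an exception rather than returning a warning, and the library may treat the end position differently.

```python
from typing import Optional, Tuple


def to_inter_residue(
    start_pos: int, residue_mode: str, end_pos: Optional[int] = None
) -> Tuple[int, int]:
    """Hypothetical helper: convert a position pair to inter-residue coordinates.

    Assumes the common convention that a 1-based inclusive range [s, e]
    corresponds to the 0-based inter-residue range (s - 1, e).
    """
    # Mirror the docstring: a missing end position defaults to the start value.
    if end_pos is None:
        end_pos = start_pos

    mode = residue_mode.lower()
    if mode == "residue":
        # 1-based: shift the start left by one; the end boundary is unchanged.
        return start_pos - 1, end_pos
    if mode == "inter-residue":
        # Already 0-based inter-residue: pass through unchanged.
        return start_pos, end_pos
    raise ValueError(f"unrecognized residue_mode: {residue_mode!r}")


print(to_inter_residue(10, "residue", 12))
print(to_inter_residue(10, "inter-residue"))
```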
1 change: 0 additions & 1 deletion docs/source/api/cool_seq_tool.rst
@@ -5,5 +5,4 @@ CoolSeqTool
:members:
:undoc-members:
:special-members: __init__
- :private-members:

10 changes: 4 additions & 6 deletions docs/source/index.rst
@@ -1,17 +1,15 @@
cool-seq-tool |version|
=======================

``cool-seq-tool`` (**C**\ ommon **O**\ perations **o**\ n **L**\ ots of **Seq**\ uences **Tool**) provides:

- Transcript alignment data from the `Universal Transcript Archive <https://github.com/biocommons/uta>`_
- Fast access to sequence data using `SeqRepo <https://github.com/biocommons/biocommons.seqrepo>`_
- Liftover between assemblies (GRCh38 <--> GRCh37) from `PyLiftover <https://github.com/konstantint/pyliftover>`_
- Lifting over to a preferred `MANE <https://www.ncbi.nlm.nih.gov/refseq/MANE/>`_ compatible transcript.
``cool-seq-tool`` (**C**\ ommon **O**\ perations **o**\ n **L**\ ots of **Seq**\ uences **Tool**) provides utilities for sequence conversion and retrieval. Pooling data from sources like `the Universal Transcript Archive <#>`_, `SeqRepo <#>`_, `the Gene Normalizer <#>`_, and `others <#>`_, it enables consistent and unambiguous conversions between genomic coordinates and high-quality transcript annotations from the `MANE project <https://www.ncbi.nlm.nih.gov/refseq/MANE/>`_.

.. code-block:: pycon
>>> from cool_seq_tool import CoolSeqTool
>>> cst = CoolSeqTool()
>>> result = await cst.transcript_to_genomic_coordinates(transcript="NM_002529.3", exon_start=1)
>>> (result.genomic_data.gene, result.genomic_data.chr, result.genomic_data.start, result.genomic_data.strand)
('NTRK1', 'NC_000001.11', 156861146, 1)
``cool-seq-tool`` is a library created to support efforts to `normalize variation descriptions <https://github.com/cancervariants/variation-normalization/>`_ and `model gene fusions <https://cancervariants.org/projects/fusions>`_ under the mantle of the `Variation Interpretation for Cancer Consortium (VICC) <https://cancervariants.org>`_. It is developed primarily by the `Wagner Lab <https://www.nationwidechildrens.org/specialties/institute-for-genomic-medicine/research-labs/wagner-lab>`_. Full source code is available on `GitHub <https://github.com/GenomicMedLab/cool-seq-tool>`_.

4 changes: 1 addition & 3 deletions docs/source/usage.rst
@@ -66,7 +66,7 @@ Transcript to genomic coordinates

.. TODO is this a correct description of why the `gene` arg can be provided?
- Given a transcript and starting and/or ending exons (and offsets), retrieve the corresponding genomic location. The `transcript` argument is required, and should be a RefSeq transcript identifier (e.g. ``"NM_002529.3"``). In addition, at least one of ``exon_start`` and ``exon_end`` should be given as integers, referring to the `i`\ th exon from that transcript accession (1-indexed, i.e. ``0`` is not a legal ``exon_start`` value). Additionally, genomic coordinate offsets can be passed for both start and end positions. Finally, if known, a gene symbol can be given to ensure that the most accurate transcript equivalence is selected.
+ Given a transcript and starting and/or ending exons (and offsets), retrieve the corresponding genomic location. The ``transcript`` argument is required, and should be a RefSeq transcript identifier (e.g. ``"NM_002529.3"``). In addition, at least one of ``exon_start`` and ``exon_end`` should be given as integers, referring to the `i`\ th exon from that transcript accession (1-indexed, i.e. ``0`` is not a legal ``exon_start`` value). Additionally, genomic coordinate offsets can be passed for both start and end positions. Finally, if known, a gene symbol can be given to ensure that the most accurate transcript equivalence is used.
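The argument rules in this paragraph can be sketched as a small pre-flight check. This is only an illustration of the constraints as described: the function name `check_transcript_args` is hypothetical, and the library's own validation and error reporting may differ.

```python
from typing import Optional


def check_transcript_args(
    transcript: str,
    exon_start: Optional[int] = None,
    exon_end: Optional[int] = None,
) -> None:
    """Hypothetical check of the documented argument rules (not library code)."""
    # The transcript accession is always required.
    if not transcript:
        raise ValueError("transcript is required")

    # At least one of the two exon arguments must be supplied.
    if exon_start is None and exon_end is None:
        raise ValueError("provide at least one of exon_start and exon_end")

    # Exons are 1-indexed, so 0 (or anything lower) is not a legal value.
    for name, value in (("exon_start", exon_start), ("exon_end", exon_end)):
        if value is not None and value < 1:
            raise ValueError(f"{name} is 1-indexed; got {value}")


# Passes silently: a transcript plus a legal starting exon.
check_transcript_args("NM_002529.3", exon_start=1)
```

A caller could run such a check before awaiting the real method, so that malformed input fails fast without a database round trip.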

For example, the following chunk retrieves genomic coordinates bounding exons 1 through 5 on the transcript ``NM_004333.4`` (from the BRAF gene):

@@ -112,8 +112,6 @@ See the :py:meth:`transcript_to_genomic_coordinates <cool_seq_tool.app.CoolSeqTo
Genomic to transcript coordinates
---------------------------------

- .. TODO is this accurate
``cool-seq-tool`` can also perform conversions in the other direction, retrieving a preferred transcript and exon coordinates given genomic location data. The required ``chromosome`` argument accepts either an integer chromosome number (using ``23`` and ``24`` for the X and Y chromosomes, respectively) or a complete RefSeq identifier (e.g. ``NC_000024.10``). A starting and/or an ending genomic position is also required. Finally, either a gene symbol and/or a transcript accession identifier must be provided. When only given a gene, the most preferred transcript (i.e. MANE transcript, if available) will be fetched, per the :ref:`transcript-policy`; if a transcript is given, then exon coordinates matching that transcript are returned, regardless of policy preference. In the process, liftover is performed to convert provided genomic coordinates to GRCh38, if necessary.
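The input rules described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the helper name `check_genomic_args`, the ``1`` to ``24`` integer range, and the ``NC_`` prefix test are all assumptions made for the example.

```python
from typing import Optional, Union


def check_genomic_args(
    chromosome: Union[int, str],
    start: Optional[int] = None,
    end: Optional[int] = None,
    gene: Optional[str] = None,
    transcript: Optional[str] = None,
) -> str:
    """Hypothetical check of the documented rules; returns a chromosome label."""
    # A starting and/or an ending genomic position is required.
    if start is None and end is None:
        raise ValueError("provide at least one of start and end")

    # Either a gene symbol or a transcript accession (or both) is required.
    if gene is None and transcript is None:
        raise ValueError("provide at least one of gene and transcript")

    if isinstance(chromosome, int):
        # Integers 23 and 24 stand for the X and Y chromosomes.
        if not 1 <= chromosome <= 24:
            raise ValueError(f"chromosome number out of range: {chromosome}")
        return {23: "X", 24: "Y"}.get(chromosome, str(chromosome))

    # Otherwise expect a complete RefSeq identifier such as NC_000024.10.
    if isinstance(chromosome, str) and chromosome.startswith("NC_"):
        return chromosome
    raise ValueError(f"unrecognized chromosome: {chromosome!r}")


print(check_genomic_args(7, start=140730665, end=140924800, gene="BRAF"))
print(check_genomic_args(24, start=1, transcript="NM_004333.6"))
```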

For example, the following chunk fetches the MANE Select transcript and corresponding exon coordinates for genomic position (140730665, 140924800) on the BRAF gene: