This repository has been archived by the owner on Sep 20, 2024. It is now read-only.

UC 1a. Develop a more accurate pipeline to detect de novo mutations in family trios by utilizing the consistent calls #2

Open
NoopDog opened this issue Jun 29, 2021 · 0 comments
Labels
SYS INTEROP System interoperability use case

Comments

Collaborator

NoopDog commented Jun 29, 2021

Interop Contact: Allison Heath
Active in 2021: [Inactive]
Researchers: Bruce Gelb (Mount Sinai), i)
Platforms: NHLBI BioData Catalyst + Kids First DRC
Analysis Question: The Pediatric Cardiovascular Genetics Consortium (PCGC) is committed to defining the molecular mechanisms for Congenital Heart Disease. They have developed a novel method to identify de novo mutations in clinical probands by post-processing the family genotypes posited by the GATK whole-genome sequencing (WGS) pipeline.

This method has a precision of 95% for de novo SNVs and short INDELs (validated by Sanger sequencing of the putative calls). Seven Bridges Genomics, Inc. (SBG) has recently described Pan-genome Graph References for improved WGS analyses and presented the use of personalized genome graphs for more consistent variant calling in family trios (ASHG 2019).

This collaboration aims to develop a more accurate pipeline to detect de novo mutations in family trios by utilizing the consistent calls and other graph-related information produced by the SBG graph tools in the PCGC pipeline.
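For readers less familiar with trio-based calling, the basic notion of a de novo candidate can be sketched as below. This is a naive illustration only, not the PCGC method or the SBG graph tooling: the function name `is_de_novo_candidate` and the genotype encoding (a pair of allele indices, with 0 as the reference allele) are assumptions made for this example, and real pipelines also weigh genotype quality, read depth, and other evidence.

```python
from typing import Tuple

# Hypothetical encoding: a diploid genotype as two allele indices, 0 = reference.
Genotype = Tuple[int, int]


def is_de_novo_candidate(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """Return True if the child carries a non-reference allele seen in neither parent."""
    parental_alleles = set(father) | set(mother)
    return any(allele != 0 and allele not in parental_alleles for allele in child)


# Child is heterozygous, both parents are homozygous reference: a candidate.
print(is_de_novo_candidate((0, 1), (0, 0), (0, 0)))
# The alternate allele is present in the father: inherited, not a candidate.
print(is_de_novo_candidate((0, 1), (0, 1), (0, 0)))
```

Inconsistent genotype calls within a trio are one source of false candidates under a rule like this, which is why more consistent trio calls are of interest here.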

Analysis Plan:

  1. Obtain confirmation from appropriate NIH Data Access Committees (e.g., NHLBI & Kids First) that the datasets and data uses are allowable and can be used/combined in this manner.
  2. Identify a subset of trios with validated de novo variants from PCGC to use as a “gold standard” for the new graph-based methods.
  3. Refine and improve the methods on Cavatica using the validated data.
    (Ideally) Through a single sign-on event, authenticate the user and authorize appropriate access through RAS integration.
  4. Expand to run in Cavatica and BDCatalyst across the entire PCGC cohort.
    Provide the new PCGC callset to approved researchers for analysis and further community validation / potential method refinement.
    If the new callset is an improvement, run across Kids First and TOPMed studies of interest to provide callsets to the community.
    1. Potentially any trio-based AnVIL datasets, e.g. CMGs.
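
As a rough sketch of how a new callset might be compared against the “gold standard” trios in step 2, precision and recall could be computed as below. The variant keys (chromosome, position, ref, alt), the function names, and the example values are all hypothetical and made up for illustration; no project data is shown. The sketch also assumes that any call missing from the validated set counts as a false positive, which only holds if the gold-standard trios were validated exhaustively.

```python
from typing import Set, Tuple

# Hypothetical variant key: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def precision(called: Set[Variant], validated: Set[Variant]) -> float:
    """Fraction of called de novo variants that are in the validated set."""
    if not called:
        raise ValueError("no calls to evaluate")
    return len(called & validated) / len(called)


def recall(called: Set[Variant], validated: Set[Variant]) -> float:
    """Fraction of validated de novo variants that were recovered by the calls."""
    if not validated:
        raise ValueError("no validated variants to compare against")
    return len(called & validated) / len(validated)


# Made-up example values, for illustration only.
gold = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "GA"),
        ("chr4", 400, "T", "C"), ("chr5", 500, "AT", "A")}
calls = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "GA"),
         ("chr9", 900, "G", "A")}

print(f"precision={precision(calls, gold):.2f} recall={recall(calls, gold):.2f}")
```

Running the same comparison for the existing pipeline and the graph-based pipeline on identical trios would give a like-for-like view of whether the new method is an improvement.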
@NoopDog NoopDog added the Epic label Jun 29, 2021
@NoopDog NoopDog changed the title UC 1a. NHLBI BioData Catalyst + Kids First DRC UC 1a. (Manning) NHLBI BioData Catalyst + Kids First DRC Jun 29, 2021
@NoopDog NoopDog changed the title UC 1a. (Manning) NHLBI BioData Catalyst + Kids First DRC UC 1a. NHLBI BioData Catalyst + Kids First DRC Jun 29, 2021
@jackDiGi jackDiGi added the inactive This use case is not being worked on. label Sep 3, 2021
@NoopDog NoopDog removed the Epic label Sep 23, 2021
@jackDiGi jackDiGi added the SYS INTEROP System interoperability use case label Nov 16, 2021
@NoopDog NoopDog moved this to On Hold in NCPI Use Case Tracker Dec 3, 2021
@linikujp linikujp removed the inactive This use case is not being worked on. label Jan 27, 2022
@NoopDog NoopDog changed the title UC 1a. NHLBI BioData Catalyst + Kids First DRC UC 1a. Develop a more accurate pipeline to detect de novo mutations in family trios by utilizing the consistent calls Feb 4, 2022
Projects
Status: On Hold