[Bug] Clear variant missing in TNscope T/N WGS case #1458

Open
mathiasbio opened this issue Jun 26, 2024 · 3 comments
Labels
Bug Something isn't working

Comments

@mathiasbio
Collaborator

Description

In ticket https://clinical-scilifelab.supportsystem.com/scp/tickets.php?id=70333#reply a clinician contacted us about a variant that was called by DNAscope but not by our somatic caller TNscope. It is clearly not a germline variant, as it does not appear at all in the matched normal sample. We need to investigate why this variant was missed and how to prevent this from happening again in the future.

How to reproduce

No response

Expected behaviour

No response

Anything else?

No response

Pipeline version

15.0.0

@mathiasbio mathiasbio added the Bug Something isn't working label Jun 26, 2024
@github-project-automation github-project-automation bot moved this to Todo in BALSAMIC Jun 26, 2024
@mathiasbio
Collaborator Author

mathiasbio commented Jun 26, 2024

I have investigated this a bit:

  • I cannot see the variant in the most raw output of TNscope
  • I have tried lowering the TLOD score to 3 but it still did not show up
  • I tried running the analysis on the latest version of Sentieon but it still did not show up
  • I tried running without BQSR tables; still no variant.
  • When I removed the normal sample, the variant showed up, which is strange as the normal doesn't have the variant at all.
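
The first check in the list above (is the variant present in the raw caller output?) can be scripted. This is a minimal sketch and not part of BALSAMIC: `find_variant` is a hypothetical helper that scans uncompressed, tab-separated VCF lines. The coordinates are those of the record Sentieon quotes later in this thread; the demo lines themselves are made up for illustration.

```python
def find_variant(lines, chrom, pos, ref, alt):
    """Return the VCF data lines (split into columns) that match the given site and alleles."""
    hits = []
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue
        # ALT may hold several comma-separated alleles.
        if cols[0] == chrom and int(cols[1]) == pos and cols[3] == ref and alt in cols[4].split(","):
            hits.append(cols)
    return hits


# Illustrative lines only (not real caller output).
DEMO_LINES = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "5\t1295000\t.\tC\tT\t30\tPASS\t.\n",
    "5\t1295228\t.\tG\tA\t50.18\tPASS\t.\n",
]

if __name__ == "__main__":
    hits = find_variant(DEMO_LINES, "5", 1295228, "G", "A")
    print("variant present" if hits else "variant missing")
```

For a real run, the lines would come from the opened raw TNscope VCF (via `gzip.open(path, "rt")` if it is compressed).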

I have notified Sentieon of all of this and am waiting for a response to see if they have any ideas on what's happening and how to proceed

@mathiasbio
Collaborator Author

I've added a google-sheet on the drive to track my investigations into this: https://docs.google.com/spreadsheets/d/1Ekz_GX7QX1uL4_DzassxdYj6e2Vlij98Y-H3CJf93oA/edit?usp=sharing

So far I've managed to make the variant appear in two ways: by running without the normal sample, and by forcing TNscope to call the variant with the --given option. That at least gave me the metrics of the variant, so I could compare the call made with the normal sample against the call made without it and see whether any metric shifted towards the values usually seen for variants called as PASS.

I identified 2 potential metrics that may be responsible:

| Metric | Explanation | Comparison |
| --- | --- | --- |
| BaseQRankSumPS | Z-score from Wilcoxon rank sum test of Alt vs. Ref base qualities per sample | Went from 1.428 to -1.059 when removing normal sample |
| SOR | Symmetric Odds Ratio of 2x2 contingency table to detect strand bias | Went from 0.77 to 0.91 when removing normal sample |
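
The comparison behind these two rows can be sketched as a small diff over per-call metrics. This is an illustration only: `diff_metrics` is a hypothetical helper, and the two dictionaries contain just the values quoted in the rows above, not the full set of TNscope annotations.

```python
def diff_metrics(with_normal, without_normal):
    """List (name, before, after, delta) for shared metrics, largest absolute change first."""
    rows = []
    for name in with_normal.keys() & without_normal.keys():
        before = with_normal[name]
        after = without_normal[name]
        rows.append((name, before, after, after - before))
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


# Values taken from the two rows above.
WITH_NORMAL = {"BaseQRankSumPS": 1.428, "SOR": 0.77}
WITHOUT_NORMAL = {"BaseQRankSumPS": -1.059, "SOR": 0.91}

if __name__ == "__main__":
    for name, before, after, delta in diff_metrics(WITH_NORMAL, WITHOUT_NORMAL):
        print(f"{name}: {before} -> {after} (change {delta:+.3f})")
```

Sorting by absolute change puts BaseQRankSumPS first, since it also flips sign, while SOR moves only slightly.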

I next want to dig into these metrics and see if I can adjust some thresholds for them in TNscope, but no such options are listed when running --help, so I have contacted Sentieon to ask for their "hidden options..."

@mathiasbio
Collaborator Author

mathiasbio commented Jul 1, 2024

I received a reply from Don Freed at Sentieon regarding this missing variant. In short, he hasn't been able to work out why it wasn't called either, but he managed to make it appear by setting --min_tumor_allele_frac 0.01, which also doesn't make sense given that the tumor allele frequency is 0.5.

Reply in full:

Hi Mathias,

Thank you for all of the information and the bam snippets. Thank you for your patience here as well; this week has been quite busy.

I have reviewed the missed variant call. The variant can be called with the argument --min_tumor_allele_frac 0.01 (or lower). Here is the variant record after --min_tumor_allele_frac 0.01 is added to the command:
5     1295228     .     G     A     50.18 PASS  ECNT=1;FS=4.724;HCNT=1;MAX_ED=.;MIN_ED=.;NLOD=6.02;NLODF=5.39;PV=0.0009;PV2=0.0012;SOR=0.774;TLOD=150.45    GT:AD:AF:ALTHC:ALT_F1R2:ALT_F2R1:BaseQRankSumPS:ClippingRankSumPS:DP:DPHC:FOXOG:MQRankSumPS:NBQPS:QSS:REF_F1R2:REF_F2R1:ReadPosEndDistPS:ReadPosRankSumPS   0/1:50,43:0.462366:41:28:15:-1.059:0.000:93:91:0.651:0.000:37.953:1968,1632:26:24:32.882:-0.940 0/0:20,0:0:0:0:0:.:.:20:20:.:.:.:740,0:14:6:26.850:.
Variants with an allele fraction less than --min_tumor_allele_frac are essentially ignored by the caller. This site has a relatively high AF, and I am still looking into why it was not called with the default --min_tumor_allele_frac setting (0.03) as the alt allele AF is much higher than 0.03.
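
As a sanity check on the numbers in the record above, the allele fraction can be recomputed from the AD field and compared with the default threshold. This is a minimal sketch: `parse_record` and `allele_fraction` are hypothetical helpers, and it assumes the first sample column is the tumor and the second is the normal (inferred from the 0/1 and 0/0 genotypes).

```python
# The record quoted above, re-joined with single spaces.
RECORD = (
    "5 1295228 . G A 50.18 PASS "
    "ECNT=1;FS=4.724;HCNT=1;MAX_ED=.;MIN_ED=.;NLOD=6.02;NLODF=5.39;"
    "PV=0.0009;PV2=0.0012;SOR=0.774;TLOD=150.45 "
    "GT:AD:AF:ALTHC:ALT_F1R2:ALT_F2R1:BaseQRankSumPS:ClippingRankSumPS:DP:DPHC:"
    "FOXOG:MQRankSumPS:NBQPS:QSS:REF_F1R2:REF_F2R1:ReadPosEndDistPS:ReadPosRankSumPS "
    "0/1:50,43:0.462366:41:28:15:-1.059:0.000:93:91:0.651:0.000:37.953:1968,1632:26:24:32.882:-0.940 "
    "0/0:20,0:0:0:0:0:.:.:20:20:.:.:.:740,0:14:6:26.850:."
)

DEFAULT_MIN_TUMOR_ALLELE_FRAC = 0.03  # default stated in the reply


def parse_record(line):
    """Split a VCF data line into its INFO dict and one FORMAT dict per sample."""
    cols = line.split()
    info = dict(item.split("=", 1) for item in cols[7].split(";"))
    keys = cols[8].split(":")
    samples = [dict(zip(keys, col.split(":"))) for col in cols[9:]]
    return info, samples


def allele_fraction(sample):
    """Alt read count divided by ref + alt read count, taken from AD."""
    ref_count, alt_count = (int(x) for x in sample["AD"].split(","))
    total = ref_count + alt_count
    return alt_count / total if total else 0.0


info, (tumor, normal) = parse_record(RECORD)  # sample order is an assumption
tumor_af = allele_fraction(tumor)
normal_af = allele_fraction(normal)

if __name__ == "__main__":
    print(f"tumor AF {tumor_af:.6f}, normal AF {normal_af:.6f}, TLOD {info['TLOD']}")
    print("above default threshold:", tumor_af > DEFAULT_MIN_TUMOR_ALLELE_FRAC)
```

The recomputed tumor fraction, 43 / (50 + 43), agrees with the AF value in the record and is more than ten times the 0.03 default, which is why the threshold alone does not explain the missed call.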

"I next wanted to see if there was any parameter in TNscope that I could adjust for these parameters to see if I could make the variant emerge when running the tool with the matched normal, but there are none that I can see when running --help so I wonder if there's any in the hidden parameters. Could you send me a list of those parameters or some method by which I can make them appear in commandline?" -- Mathias

There are some hidden parameters that can be tuned to adjust TNscope's variant calling. Here are the settings (including some hidden parameters) that I have found to be the most useful:

Support for the alt allele in the tumor sample

--min_init_tumor_lod - Minimum tumorLOD for candidate selection (default: 4.0)
--min_tumor_lod - Minimum tumorLOD (default: 6.3)
--min_tumor_allele_frac - Minimum tumor allele frac (default: 0.03)

Support for the alt allele in the normal sample

--min_init_normal_lod - Minimum normalLOD for candidate selection (default: 0.5)
--min_normal_lod - Minimum normalLOD (default: 2.2)
--min_dbsnp_normal_lod - Minimum normalLOD at dbSNP site (default: 5.5)
--max_normal_alt_cnt - Maximum number of normal alt allele reads
--max_normal_alt_qsum - Maximum normal alt allele qual score sum
--max_normal_alt_frac - Maximum normal alt allele fraction
--max_fisher_pv_active - Maximum p-value from a Fisher's exact test of support for the alt allele in the tumor sample

Other useful settings

--prune_factor - Pruning factor in the kmer graph (default: 2) - Setting --prune_factor 0 enables dynamic pruning which can be helpful for high coverage samples
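
When experimenting with these settings it can help to keep the overrides in one place and generate the option strings from them. The sketch below is an assumption about how one might organise that, not Sentieon's or BALSAMIC's tooling: `KNOWN_OPTIONS` holds only the options and defaults listed above (options with no stated default are stored as `None`), and `tnscope_overrides` is a hypothetical helper that returns the extra arguments to append to the TNscope part of a command.

```python
# Options and defaults as listed in the reply above; None means no default was stated.
KNOWN_OPTIONS = {
    "min_init_tumor_lod": 4.0,
    "min_tumor_lod": 6.3,
    "min_tumor_allele_frac": 0.03,
    "min_init_normal_lod": 0.5,
    "min_normal_lod": 2.2,
    "min_dbsnp_normal_lod": 5.5,
    "max_normal_alt_cnt": None,
    "max_normal_alt_qsum": None,
    "max_normal_alt_frac": None,
    "max_fisher_pv_active": None,
    "prune_factor": 2,
}


def tnscope_overrides(overrides):
    """Turn {option: value} into a flat argument list, skipping values equal to the listed default."""
    args = []
    for name, value in overrides.items():
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"unknown TNscope option: {name}")
        if value == KNOWN_OPTIONS[name]:
            continue  # same as the default, nothing to pass
        args += [f"--{name}", str(value)]
    return args


# The two changes discussed in this thread: a lower allele-fraction cutoff and dynamic pruning.
EXAMPLE_ARGS = tnscope_overrides({"min_tumor_allele_frac": 0.01, "prune_factor": 0})

if __name__ == "__main__":
    print(" ".join(EXAMPLE_ARGS))
```

Rejecting unknown names guards against typos in hidden options, which would otherwise be hard to spot because they do not appear in --help.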

I hope this is helpful. Please reach out if you have any questions.

Best regards,
Don
