Filtering and sorting in WTS module #4989

Open
Jakob37 opened this issue Nov 1, 2024 · 2 comments

Comments

@Jakob37
Collaborator

Jakob37 commented Nov 1, 2024

Is your feature request related to a problem in the current program or to newly available technology or software? Please describe and add links/citations if appropriate.

Our CLGs have started using Tomte and the WTS module in Scout to interpret RNA-seq samples. It seems to work well for them, but some additions could make it even more useful!

Describe the solution you'd like

In particular, when a list contains more than a handful of outliers, it would help to be able to:

  • Sort the rows on "P-value" or on "Value"
  • Filter the rows on "P-value" and (simultaneously) on "Value"
  • Filter to show only expression or splicing at a time

This is particularly useful if you load data with a more liberal cutoff, but still want to be able to focus on what is the most relevant.
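
The requested behaviour could be sketched roughly as below. This is only an illustration and not Scout's implementation: the `Outlier` record, its field names, and both helper functions are hypothetical, and filtering on the absolute "Value" is an assumption about how a threshold would be interpreted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Outlier:
    """Hypothetical row in the outlier table (not Scout's actual model)."""
    gene: str
    kind: str       # assumed to be "expression" or "splicing"
    p_value: float
    value: float    # the "Value" column; its exact meaning is assumed here


def filter_outliers(
    rows: Iterable[Outlier],
    max_p: Optional[float] = None,
    min_abs_value: Optional[float] = None,
    kind: Optional[str] = None,
) -> List[Outlier]:
    """Keep rows passing every filter that is set; filters combine with AND."""
    kept = []
    for row in rows:
        if max_p is not None and row.p_value > max_p:
            continue
        if min_abs_value is not None and abs(row.value) < min_abs_value:
            continue
        if kind is not None and row.kind != kind:
            continue
        kept.append(row)
    return kept


def sort_outliers(
    rows: Iterable[Outlier], key: str = "p_value", descending: bool = False
) -> List[Outlier]:
    """Sort rows on either the p-value or the value column."""
    if key not in ("p_value", "value"):
        raise ValueError(f"unsupported sort key: {key}")
    return sorted(rows, key=lambda r: getattr(r, key), reverse=descending)
```

With helpers like these, data loaded with a liberal cutoff could still be narrowed at view time, for example `sort_outliers(filter_outliers(rows, max_p=0.05, kind="splicing"))`.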

A related question is - would it make sense to optionally also show the FDR? In omics-datasets I usually find FDR more informative than the p-value. (This is not something requested by the CLGs here, just an open question from me).

Additional context

A screenshot from the demo to have something to look at:

[Image: wts_screenshot]

@dnil
Collaborator

dnil commented Nov 1, 2024

I think we could manage that. Especially filter on type 😉
Screenshot 2024-11-01 at 11 43 57.

We have not seen a need for these, as the variants within panels have so far been very few. But I suppose that is decided with cutoffs on the pipeline side. Sorting is always a bit contentious, but we have no rank here so why not! 🎉

As for FDR adjusted, I am in principle all for any kind of multiple testing correction, but not quite sure how to apply it here. 😊 I was impressed and slightly confused at Benjamini&Hochberg and have not really tried to read up on Benjamini&Yekutieli.
How do you think we should apply it? For all genes, for the default panel, or for all clinical panels? I suppose we will not be able to feed any hypothesis about candidate genes to the pipeline, which would otherwise have been cool. We do parse both padjust and p_adjust_gene, so it should be simple enough to show either, perhaps just instead of the unadjusted P-value. 🤷

@Jakob37
Collaborator Author

Jakob37 commented Nov 1, 2024

I think we could manage that. Especially filter on type 😉

Aha, nice, didn't spot that 😅

We have not seen a need for these, as the variants within panels have so far been very few. But I suppose that is decided with cutoffs on the pipeline side. Sorting is always a bit contentious, but we have no rank here so why not! 🎉

Sounds good, thanks! Yes, when running with a more explorative loose cutoff we sometimes ended up with a larger bunch of hits, also within a panel. Among those, lower p-value tended to seem more "real" (as one would hope for). Having some false positives in the list was deemed OK here by the CLG as she anyway would verify things with her own eyes.

As for FDR adjusted, I am in principle all for any kind of multiple testing correction, but not quite sure how to apply it here. 😊 I was impressed and slightly confused at Benjamini&Hochberg and have not really tried to read up on Benjamini&Yekutieli.
How do you think we should apply it? For all genes, for the default panel, or for all clinical panels? I suppose we will not be able to feed any hypothesis about candidate genes to the pipeline, which would otherwise have been cool. We do parse both padjust and p_adjust_gene, so it should be simple enough to show either, perhaps just instead of the unadjusted P-value. 🤷

Hmm. My understanding is that the point of the BH correction is to go from p-values ("how unlikely is it that this is a fluke, if only looking at this transcript") to FDR-adjusted values ("if looking across this dataset at FDR < 0.1, at most 10% of what I am looking at should be noise"). That is, if the p-values are nicely distributed, which they seemed to be for OUTRIDER but not FRASER when I looked ...

Anyway, I was just thinking to allow the user to check the padjust value sent through from FRASER and OUTRIDER, which seems to be the FDR-corrected p-value. Not sure if it is preferable to see instead of the regular P-value though, maybe Scout-users are more used to the meaning of regular p-values 🤔
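
For reference, the BH adjustment discussed above can be sketched in a few lines. This is a minimal, generic illustration of the Benjamini-Hochberg step-up procedure; the function name is hypothetical, and it is not necessarily how OUTRIDER or FRASER compute the `padjust` values they report.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity (and a cap of 1.0) with a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

The adjusted values depend on which set of tests goes in: correcting over all genes and correcting over only the genes in a panel give different results, which is the choice raised in the question about panels above.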
