New bulk 5'-RACE supported protocol, non-overlapping reads rescue #343
base: dev
Release 4.1
…kflow for this new 5'-RACE support
…parameter in nextflow user custom conf files for PRESTO_ASSEMBLEPAIRS_UMI process via args.
…MI always output failed reads
Hi,
Why was `specific_5p_race_umi` removed from `nextflow_schema.json`?
I can't find any comments on my pull request. What should I do next, please?
Thank you very much.
Hi @JustBioinfo, sorry for the late review; I realized I never made my previous comments public. I have one comment for now. To review the newly introduced `specific_5p_race_umi` protocol, it would be great to have more details about the library design of this protocol. In what way does it differ from the `dt_5p_race_umi` protocol? Maybe a quick diagram of the library construct would be helpful here.
Additionally, we provide test sequencing data for each of the implemented protocols in the nf-core/test-datasets repository, and this is what the respective test config profiles use for testing, so we would also need to create a test profile with test data for this new protocol.
@@ -0,0 +1,40 @@
process PRESTO_MASKPRIMERS_ALIGN_TRIM {
This process is almost the same as `presto_maskprimers_align`, so that one can be reused by just setting the `--primer_mask_mode` parameter to `trim`.
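For readers less familiar with the primer-handling modes discussed here, the following is a minimal Python sketch of how the `cut`, `mask`, `trim`, and `tag` modes of `MaskPrimers.py` differ in what they leave of a read. It is a toy model only: the function name `apply_primer_mode` is hypothetical, and it uses exact string matching where the real tool aligns primers with an error tolerance.

```python
from typing import Optional


def apply_primer_mode(read: str, primer: str, mode: str = "trim") -> Optional[str]:
    """Toy model of primer handling modes (hypothetical helper, not pRESTO code).

    Assumption: the primer is located by exact match; the real tool aligns it
    and tolerates mismatches. Returns None when the primer is not found.
    """
    start = read.find(primer)
    if start == -1:
        return None  # the read would be reported as failed
    end = start + len(primer)

    if mode == "cut":
        # remove the primer and everything preceding it
        return read[end:]
    if mode == "mask":
        # replace the primer with Ns and remove everything preceding it
        return "N" * len(primer) + read[end:]
    if mode == "trim":
        # remove only the region preceding the primer, keep the primer itself
        return read[start:]
    if mode == "tag":
        # leave the read unmodified
        return read
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    toy_read = "AAACCGGTT"  # made-up read: 3 upstream bases, a 4-base primer, 2 downstream bases
    for m in ("cut", "mask", "trim", "tag"):
        print(m, apply_primer_mode(toy_read, "CCGG", m))
```

Seen this way, the only difference between the existing process and the proposed one is the mode value, which is why a single parameter may be enough.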
I think I was just trying to solve some merge conflicts and might have accidentally removed your new profile from the config. Let's see whether all other tests are passing!
Hi @JustBioinfo, thanks for your PR. There are a couple of changes that would improve this code, by providing only a new parameter rather than a new protocol. This way people can trim sequences preceding the UMI in the …

The new parameters should then be documented in the parameter schema, which can be done with the …

Regarding the non-overlapping read rescue, I'm hesitant to add the …
This PR adds:
- A new `library_generation_method` to allow the analysis of 5' RACE libraries where R1 reads do not start directly with the UMI, by adding a new process `PRESTO_MASKPRIMERS_ALIGN_TRIM` that launches `MaskPrimers.py align` in trim mode before the `PRESTO_MASKPRIMERS_UMI` process.
- `--assemblepairs_join` to allow non-overlapping reads to be rescued using `assemblepairs join` on failed reads from `assemblepairs align`. In our libraries a large proportion of reads do not overlap, yet they turn out to be detected as productive sequences at the end of the pipeline.

This PR doesn't add it at the moment, but would it be possible to have an option for IgBlast's 19-column mode?
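The rescue idea described above can be sketched in a few lines of Python: try to merge a read pair by overlap first, and fall back to end-to-end joining only when no overlap is found. This is an illustration under stated assumptions, not the pipeline's implementation: the function names, the exact-match overlap search, and the default `min_overlap` and `gap` values are all hypothetical, whereas the real tools score alignments and tolerate mismatches.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def align_pair(head: str, tail_rc: str, min_overlap: int) -> Optional[str]:
    """Merge on the longest exact suffix/prefix overlap (toy stand-in for alignment)."""
    for k in range(min(len(head), len(tail_rc)), min_overlap - 1, -1):
        if head[-k:] == tail_rc[:k]:
            return head + tail_rc[k:]
    return None  # no overlap of at least min_overlap bases


def assemble_with_rescue(head: str, tail: str,
                         min_overlap: int = 4, gap: int = 3) -> Tuple[str, str]:
    """Try overlap assembly; rescue failed pairs by joining with a gap of Ns.

    Assumptions: the tail read is reverse-complemented before assembly, and
    min_overlap >= 1. Returns the assembled sequence and the method used.
    """
    tail_rc = reverse_complement(tail)
    merged = align_pair(head, tail_rc, min_overlap)
    if merged is not None:
        return merged, "align"
    # Rescue step: concatenate the two ends, separated by `gap` N characters.
    return head + "N" * gap + tail_rc, "join"


if __name__ == "__main__":
    # Overlapping pair: the last 5 bases of head match the start of the tail.
    print(assemble_with_rescue("AAACCCGGG", reverse_complement("CCGGGTTTT")))
    # Non-overlapping pair: rescued by joining.
    print(assemble_with_rescue("AAAAAAAA", reverse_complement("CCCCCCCC")))
```

Tagging each result with the method used keeps joined sequences distinguishable downstream, since the true insert length of a joined pair is unknown.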
enhancement #342
PR checklist
Do I need to add tests, and if so, can you tell me how?
So far I've been testing with real data sets from my research lab. Should I add a test data set to the nf-core/airrflow branch of the nf-core/test-datasets repository? If so, I'll check with my team to see what I can provide.
- `nf-core lint`: `LookupError: Failed to clone from the remote: https://github.com/nf-core/modules.git`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `nextflow run . -profile debug,test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated. I'm new to analyzing this type of data, so I'm not familiar with the various AIRR library generation methods. Can you help me name the newly supported protocol? So far I've called it `specific_5p_race_umi`, but I'm not sure that is the right name.
- Output Documentation in `docs/output.md` is updated. There are two new output folders:
  - `presto/trim_upstream_umi_linker`, to store R1 reads where the sequence upstream of the UMI was trimmed.
  - `presto/08-assemble-pairs-join`, if the new `--assemblepairs_join` option is enabled.

  Are they named well enough to be added to docs/output.md?
- `CHANGELOG.md` is updated.