We will need to perform quality control on sequencing reads, BAM files generated by aligners and alignment statistics output by aligners.
Use the small BAM file provided at the following link to test out each of the quality control tools in your container locally.
You need to understand what inputs are required by each quality control tool, and the expected outputs.
FastQC
Requires sequencing reads as input.
Raw sequencing reads
Trimmed sequencing reads
Alignment log files
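As a rough sketch of the inputs and expected outputs for FastQC: the file names and the `fastqc_out` directory below are hypothetical, and the helper `fastqc_report_name` is only an illustration of how FastQC names its HTML report. FastQC is only invoked if it is installed and the inputs exist.

```shell
#!/bin/sh
set -eu

# Hypothetical inputs: one paired-end sample and an output directory.
r1="sample_R1.fastq.gz"
r2="sample_R2.fastq.gz"
outdir="fastqc_out"

# Name of the HTML report FastQC writes for a (gzipped) FASTQ file:
# the .fastq/.fq(.gz) suffix is replaced with _fastqc.html.
fastqc_report_name() {
  name="$(basename "$1")"
  name="${name%.gz}"
  name="${name%.fastq}"
  name="${name%.fq}"
  echo "${name}_fastqc.html"
}

echo "expected: $outdir/$(fastqc_report_name "$r1")"
echo "expected: $outdir/$(fastqc_report_name "$r2")"

# Run FastQC only when it is installed and both inputs are present.
if command -v fastqc >/dev/null 2>&1 && [ -f "$r1" ] && [ -f "$r2" ]; then
  mkdir -p "$outdir"   # FastQC does not create the output directory itself
  fastqc --outdir "$outdir" "$r1" "$r2"
fi
```

Each input should also produce a matching `_fastqc.zip` next to the HTML report, which is the file MultiQC picks up later.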
Samtools
Requires as input a BAM file generated by aligners. BAM file must be sorted, and have an index file. Refer to samtools --help for documentation regarding the stats we want to include in our report:
depth
flagstat
idxstats
stats
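A minimal sketch of how the four subcommands above could be run on a sorted, indexed BAM file. The input name `sample.bam` and the `.txt` output naming are assumptions for illustration; the helper `samtools_report_cmds` only prints the commands, and nothing is executed unless samtools and the BAM file are present.

```shell
#!/bin/sh
set -eu

# Hypothetical input name; substitute the test BAM file.
bam="sample.bam"
sorted="${bam%.bam}.sorted.bam"

# Print one report command per samtools subcommand listed above.
samtools_report_cmds() {
  for sub in depth flagstat idxstats stats; do
    echo "samtools $sub $1 > ${1%.bam}.$sub.txt"
  done
}

samtools_report_cmds "$sorted"

# Execute only when samtools is installed and the BAM file exists.
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
  samtools sort -o "$sorted" "$bam"   # coordinate sort
  samtools index "$sorted"            # writes the .bai index next to the BAM
  for sub in depth flagstat idxstats stats; do
    samtools "$sub" "$sorted" > "${sorted%.bam}.$sub.txt"
  done
fi
```

Note that `samtools depth` writes one line per covered position, so its output can get large on anything bigger than the small test file.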
QualiMap
Requires as input a BAM file generated by aligners. The BAM file must be sorted and have an index file. Refer to qualimap --help for documentation regarding bamqc and rnaseq.
The rnaseq module requires additional input files such as a GTF file. You can use the following GTF file here which corresponds to the test BAM file.
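The two QualiMap modules could be invoked roughly as follows. The BAM path, GTF path and output directory names are hypothetical, and `qualimap_cmd` is an illustrative helper that only prints the command line; check `qualimap bamqc --help` and `qualimap rnaseq --help` for the options your version supports.

```shell
#!/bin/sh
set -eu

# Hypothetical paths; substitute the test BAM and its matching GTF.
bam="sample.sorted.bam"
gtf="genes.gtf"

# Print the QualiMap command for a module (bamqc or rnaseq).
# Arguments: MODULE BAM OUTDIR [GTF]; rnaseq additionally needs the GTF.
qualimap_cmd() {
  case "$1" in
    bamqc)  echo "qualimap bamqc -bam $2 -outdir $3" ;;
    rnaseq) echo "qualimap rnaseq -bam $2 -gtf $4 -outdir $3" ;;
    *)      echo "unknown module: $1" >&2; return 1 ;;
  esac
}

qualimap_cmd bamqc  "$bam" qualimap_bamqc
qualimap_cmd rnaseq "$bam" qualimap_rnaseq "$gtf"

# Execute only when QualiMap is installed and both inputs exist.
if command -v qualimap >/dev/null 2>&1 && [ -f "$bam" ] && [ -f "$gtf" ]; then
  qualimap bamqc  -bam "$bam" -outdir qualimap_bamqc
  qualimap rnaseq -bam "$bam" -gtf "$gtf" -outdir qualimap_rnaseq
fi
```

Keeping one output directory per module makes it easier to see which files each module produced.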
Remove the first line of the genes.gtf file; its chromosome name is nonsensical.
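One possible way to drop that first line is sketched below. Writing to a separate file (`genes.fixed.gtf`, a hypothetical name) instead of editing in place is an assumption made here so the original stays available; rename the result to `genes.gtf` if later commands expect that name.

```shell
#!/bin/sh
set -eu

# Write a copy of a file without its first line.
# The original file is left untouched.
drop_first_line() {
  tail -n +2 "$1" > "$2"
}

# Only act if the annotation file is present in the working directory.
if [ -f genes.gtf ]; then
  head -n 1 genes.gtf                        # inspect the line being removed
  drop_first_line genes.gtf genes.fixed.gtf  # genes.fixed.gtf is a hypothetical name
fi
```

Printing the first line before removing it is a cheap check that the line really is the offending record and not a valid annotation.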
RSeQC requires annotations in BED format for the -r flag, so convert the GTF file to a BED file. I had serious issues using gtf2bed; as a workaround, use gffread.
Quality control is arguably the most difficult step, as it requires intimate knowledge of the entire workflow.
Work in collaboration with groups focusing on mapping RNA-Seq reads - they will be able to tell you the output files generated by aligners.
MultiQC goes at the very end of the workflow, collecting all of the files produced by fastqc, rseqc, qualimap and samtools. This will create a very comprehensive HTML report at the end of our workflow 😎
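A sketch of that final step, assuming every QC tool wrote its output somewhere under a single `results` directory (a hypothetical layout). `multiqc_cmd` is an illustrative helper that prints the command; MultiQC itself is only run if it is installed and the directory exists.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: all fastqc, rseqc, qualimap and samtools output under results/.
results_dir="results"
report_dir="multiqc_report"

# Print the MultiQC command for a results directory and a report directory.
multiqc_cmd() {
  echo "multiqc $1 --outdir $2"
}

multiqc_cmd "$results_dir" "$report_dir"

# Execute only when MultiQC is installed and there is something to scan.
if command -v multiqc >/dev/null 2>&1 && [ -d "$results_dir" ]; then
  multiqc "$results_dir" --outdir "$report_dir"
fi
```

MultiQC searches the given directory recursively, so the per-tool subdirectories do not need to be listed individually.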
RSeQC
Convert the GTF file to a BED file with gffread:
gffread -F --keep-exon-attrs genes.gtf --bed > genes.bed
Scripts to run, each tagged with its output category:
infer_experiment.py (A)
bam_stat.py (A)
inner_distance.py (PE only, otherwise empty files) (B)
read_distribution.py (A)
read_duplication.py (B)
junction_annotation.py (C)
junction_saturation.py (A)
The meaning behind A/B/C:
A: python script does not have the -o flag, redirect stdout to a .txt file using >:
infer_experiment.py -i RAP1_UNINDUCED_REP2.Aligned.out_sorted.bam -r genes.bed > RAP1_UNINDUCED_REP2.Aligned.out_infer_experiment.txt
B: Has the -o flag, pass the file baseName to the arg.
C: Has the -o flag, but must redirect output to a .txt file using 2> for multiqc compatibility:
junction_annotation.py -i RAP1_UNINDUCED_REP2.Aligned.out_sorted.bam -r genes.bed -o RAP1_UNINDUCED_REP2.Aligned.out 2> RAP1_UNINDUCED_REP2.Aligned.out_junctions.txt
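The three categories can be captured in a small dry-run helper, sketched below. `rseqc_cmd` is a hypothetical function that only prints the command for a given category; the category B line for inner_distance.py is an assumption built from the description above (the issue gives no example command for B). Not every RSeQC script necessarily accepts -r, so check each script's --help before running the printed command.

```shell
#!/bin/sh
set -eu

# File names taken from the examples above.
bam="RAP1_UNINDUCED_REP2.Aligned.out_sorted.bam"
bed="genes.bed"
prefix="RAP1_UNINDUCED_REP2.Aligned.out"

# Print the command line for one RSeQC script.
# Arguments: CATEGORY SCRIPT LABEL
#   A: no -o flag, stdout is captured in <prefix>_<label>.txt
#   B: has -o flag, the output prefix is passed to it
#   C: has -o flag, stderr is also captured in <prefix>_<label>.txt
rseqc_cmd() {
  case "$1" in
    A) echo "$2 -i $bam -r $bed > ${prefix}_$3.txt" ;;
    B) echo "$2 -i $bam -r $bed -o $prefix" ;;
    C) echo "$2 -i $bam -r $bed -o $prefix 2> ${prefix}_$3.txt" ;;
    *) echo "unknown category: $1" >&2; return 1 ;;
  esac
}

rseqc_cmd A infer_experiment.py infer_experiment
rseqc_cmd B inner_distance.py inner_distance     # assumed form, see note above
rseqc_cmd C junction_annotation.py junctions
```

Printing the commands first makes it easy to compare them against the two examples given above before anything is actually run.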