For each processed sample, the pipeline stores results in a folder named after the sample identifier. These folders are created in the directory specified by the params.results parameter.
Result files for this workshop can be found in the results folder within the current folder. There you should see a directory called ENCSR000COQ/ containing the following files:
- Variant calls
  - final.vcf
    This file contains all somatic variants (SNVs) called from RNAseq data. It includes variants that pass all filters, marked with the PASS keyword in the 7th field of the VCF file (filter status), as well as those that did not pass one or more filters.
  - commonSNPs.diff.sites_in_files
    Tab-separated file comparing the variants obtained from RNAseq with the "known" variants from DNA. The file is sorted by genomic position and contains 8 fields:
    1. CHROM: chromosome name;
    2. POS1: position of the SNV in file #1 (RNAseq data);
    3. POS2: position of the SNV in file #2 (DNA "known" variants);
    4. IN_FILE: flag indicating whether the SNV is present in file #1 ('1'), in file #2 ('2'), or in both files ('B');
    5. REF1: reference sequence in file #1;
    6. REF2: reference sequence in file #2;
    7. ALT1: alternative sequence in file #1;
    8. ALT2: alternative sequence in file #2.
  - known_snps.vcf
    Variants that are common to RNAseq and "known" variants from DNA.
- Allele specific expression quantification
  - ASE.tsv
    Tab-separated file with allele counts at common SNV positions (only SNVs from the file known_snps.vcf). The file is sorted by coordinates and contains 13 fields:
    1. contig: contig, scaffold or chromosome name of the variant
    2. position: position of the variant
    3. variant ID: variant ID in the dbSNP
    4. refAllele: reference allele sequence
    5. altAllele: alternate allele sequence
    6. refCount: number of reads that support the reference allele
    7. altCount: number of reads that support the alternate allele
    8. totalCount: total number of reads at the site that support both reference and alternate allele and any other alleles present at the site
    9. lowMAPQDepth: number of reads that have low mapping quality
    10. lowBaseQDepth: number of reads that have low base quality
    11. rawDepth: total number of reads at the site that support both reference and alternate allele and any other alleles present at the site
    12. otherBases: number of reads that support bases other than reference and alternate bases
    13. improperPairs: number of reads that have malformed pairs
- Allele frequency histogram
  - AF.histogram.pdf
    This file contains a histogram of allele frequencies for SNVs common to RNA-seq and "known" variants from DNA.
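
To get a feel for these files, the following Python sketch reads each tabular output and summarises it. It is only an illustration under several assumptions: the helper names (split_by_filter, tally_in_file, alt_fractions, bin_counts) are hypothetical, the two tab-separated files are assumed to start with a header row whose column names match the field names listed above, and all records are made up. The allele frequency is assumed here to be altCount / (refCount + altCount); how the pipeline itself computes the values plotted in AF.histogram.pdf is not stated in this section.

```python
import csv
import io
from collections import Counter


def split_by_filter(vcf_text):
    """Split VCF data lines into (passed, failed) using the 7th field, FILTER."""
    passed, failed = [], []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):  # skip blank and header lines
            continue
        fields = line.split("\t")
        (passed if fields[6] == "PASS" else failed).append(line)
    return passed, failed


def tally_in_file(diff_text):
    """Count IN_FILE flags: '1' (file #1 only), '2' (file #2 only), 'B' (both)."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    return Counter(row["IN_FILE"] for row in reader)


def alt_fractions(ase_text):
    """Return altCount / (refCount + altCount) for each site that has such reads."""
    reader = csv.DictReader(io.StringIO(ase_text), delimiter="\t")
    fractions = []
    for row in reader:
        ref, alt = int(row["refCount"]), int(row["altCount"])
        if ref + alt > 0:  # avoid dividing by zero at uncovered sites
            fractions.append(alt / (ref + alt))
    return fractions


def bin_counts(values, n_bins=4):
    """Count values in [0, 1] into equal-width bins; 1.0 falls in the last bin."""
    counts = [0] * n_bins
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value outside [0, 1]: {value}")
        counts[min(int(value * n_bins), n_bins - 1)] += 1
    return counts


# Made-up records, for illustration only.
example_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]),
    "\t".join(["chr22", "100", ".", "A", "G", "50", "PASS", "."]),
    "\t".join(["chr22", "200", ".", "C", "T", "12", "FS", "."]),
])
example_diff = "\n".join([
    "\t".join(["CHROM", "POS1", "POS2", "IN_FILE", "REF1", "REF2", "ALT1", "ALT2"]),
    "\t".join(["chr22", "100", "100", "B", "A", "A", "G", "G"]),
    "\t".join(["chr22", "200", ".", "1", "C", ".", "T", "."]),
])
ase_header = ["contig", "position", "variantID", "refAllele", "altAllele",
              "refCount", "altCount", "totalCount", "lowMAPQDepth",
              "lowBaseQDepth", "rawDepth", "otherBases", "improperPairs"]
example_ase = "\n".join([
    "\t".join(ase_header),
    "\t".join(["chr22", "100", "rs1", "A", "G", "30", "10", "40", "0", "0", "40", "0", "0"]),
    "\t".join(["chr22", "200", "rs2", "C", "T", "5", "15", "20", "0", "0", "20", "0", "0"]),
    "\t".join(["chr22", "300", "rs3", "G", "A", "0", "0", "0", "0", "0", "0", "0", "0"]),
])

passed, failed = split_by_filter(example_vcf)
print("PASS:", len(passed), "filtered:", len(failed))
print("IN_FILE flags:", dict(tally_in_file(example_diff)))
fractions = alt_fractions(example_ase)
print("alt allele fractions:", fractions)
print("bin counts:", bin_counts(fractions))
```

To try this on the real outputs, read each file's contents from results/ENCSR000COQ/ and pass the text to the matching function. If a file turns out to have no header row, csv.reader with positional indexes would be the safer choice.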