Results overview

For each processed sample, the pipeline stores its results in a folder named after the sample identifier. These folders are created in the directory specified by the params.results parameter.

Result files for this workshop can be found in the folder results within the current folder. There you should see a directory called ENCSR000COQ/ containing the following files:

Variant calls

final.vcf

This file contains all somatic variants (SNVs) called from RNAseq data. It includes both variants that pass all filters, marked with the PASS keyword in the 7th field of the VCF file (filter status), and variants that failed one or more filters.
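
To see how many variants passed or failed filtering, the 7th field can be tallied directly. The following is a minimal sketch that treats the VCF as plain tab-separated text; the function name `count_filter_status` and the example records are made up for illustration and are not part of the pipeline.

```python
from collections import Counter


def count_filter_status(vcf_lines):
    """Tally the values found in the 7th field (filter status) of VCF records."""
    counts = Counter()
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip empty lines and header lines, which start with '#'
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        counts[fields[6]] += 1  # 7th field, 0-based index 6
    return counts


# Made-up records, for illustration only
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr22\t100\t.\tA\tG\t50\tPASS\t.",
    "chr22\t200\t.\tC\tT\t12\tFS\t.",
    "chr22\t300\t.\tG\tA\t60\tPASS\t.",
]
counts = count_filter_status(example)
print(counts["PASS"])  # 2
```

On real data, the same function can be given an open file handle, for example `open("results/ENCSR000COQ/final.vcf")`, assuming the file is uncompressed.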

commonSNPs.diff.sites_in_files

Tab-separated file comparing the variants obtained from RNAseq with the "known" variants from DNA.

The file is sorted by genomic position and contains 8 fields:

1. CHROM: chromosome name;
2. POS1: position of the SNV in file #1 (RNAseq data);
3. POS2: position of the SNV in file #2 (DNA "known" variants);
4. IN_FILE: flag indicating whether the SNV is present in file #1 ('1'), in file #2 ('2'), or in both files ('B');
5. REF1: reference sequence in file #1;
6. REF2: reference sequence in file #2;
7. ALT1: alternative sequence in file #1;
8. ALT2: alternative sequence in file #2.
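
The IN_FILE flag makes it easy to summarise how many SNVs are specific to one source or shared by both. Below is a minimal sketch, assuming the file starts with a header line that uses the field names listed above; the function name `tally_in_file` and the example rows are hypothetical.

```python
import csv
import io
from collections import Counter


def tally_in_file(handle):
    """Count records per IN_FILE flag in a tab-separated comparison file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["IN_FILE"] for row in reader)


# Made-up rows, for illustration only (assumes a header line is present)
example = io.StringIO(
    "CHROM\tPOS1\tPOS2\tIN_FILE\tREF1\tREF2\tALT1\tALT2\n"
    "chr22\t100\t100\tB\tA\tA\tG\tG\n"
    "chr22\t200\t.\t1\tC\t.\tT\t.\n"
    "chr22\t.\t300\t2\t.\tG\t.\tA\n"
    "chr22\t400\t400\tB\tT\tT\tC\tC\n"
)
flag_counts = tally_in_file(example)
print(flag_counts["B"])  # 2
```

For the workshop data, the handle could be opened with `open("results/ENCSR000COQ/commonSNPs.diff.sites_in_files", newline="")`.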

known_snps.vcf

Variants that are common to RNAseq and "known" variants from DNA.

Allele specific expression quantification

ASE.tsv

Tab-separated file with allele counts at common SNV positions (only SNVs from the file known_snps.vcf).

The file is sorted by coordinates and contains 13 fields:

1. contig: contig, scaffold or chromosome name of the variant
2. position: position of the variant
3. variant ID: variant ID in dbSNP
4. refAllele: reference allele sequence
5. altAllele: alternate allele sequence
6. refCount: number of reads that support the reference allele
7. altCount: number of reads that support the alternate allele
8. totalCount: total number of reads at the site that support both reference and alternate allele and any other alleles present at the site
9. lowMAPQDepth: number of reads that have low mapping quality
10. lowBaseQDepth: number of reads that have low base quality
11. rawDepth: total number of reads at the site that support both reference and alternate allele and any other alleles present at the site
12. otherBases: number of reads that support bases other than reference and alternate bases
13. improperPairs: number of reads that have malformed pairs
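
The refCount and altCount fields are enough to estimate the allelic balance at each site. The sketch below computes the fraction of reads supporting the reference allele, refCount / (refCount + altCount). It assumes the file has a header line with the field names listed above (with the third column spelled `variantID`); the function name `ref_allele_ratios` and the example rows are hypothetical.

```python
import csv
import io


def ref_allele_ratios(handle):
    """Return (contig, position, ref fraction) for each site with ref or alt reads."""
    reader = csv.DictReader(handle, delimiter="\t")
    ratios = []
    for row in reader:
        ref = int(row["refCount"])
        alt = int(row["altCount"])
        if ref + alt == 0:
            continue  # no informative reads at this site
        ratios.append((row["contig"], int(row["position"]), ref / (ref + alt)))
    return ratios


# Made-up rows, for illustration only (header spelling is an assumption)
header = ("contig\tposition\tvariantID\trefAllele\taltAllele\trefCount\taltCount\t"
          "totalCount\tlowMAPQDepth\tlowBaseQDepth\trawDepth\totherBases\timproperPairs\n")
example = io.StringIO(
    header
    + "chr22\t100\trs1\tA\tG\t30\t10\t40\t0\t0\t40\t0\t0\n"
    + "chr22\t200\trs2\tC\tT\t5\t15\t20\t0\t0\t20\t0\t0\n"
    + "chr22\t300\trs3\tG\tA\t0\t0\t0\t0\t0\t0\t0\t0\n"
)
ratios = ref_allele_ratios(example)
for contig, position, fraction in ratios:
    print(contig, position, fraction)
```

A fraction near 0.5 suggests balanced expression of both alleles, while values close to 0 or 1 point to expression dominated by one allele. Sites with few reads give noisy estimates, so a minimum depth threshold is usually worth adding.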

Allele frequency histogram

AF.histogram.pdf

This file contains a histogram plot of allele frequency for SNVs common to RNA-seq and "known" variants from DNA.
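
As a quick check that a run produced every file described in this section, the sample folder can be compared against the expected file names. This is a minimal sketch; the helper `missing_result_files` is hypothetical, and the path assumes the workshop layout described above.

```python
from pathlib import Path

# File names as described in this section
EXPECTED_FILES = [
    "final.vcf",
    "commonSNPs.diff.sites_in_files",
    "known_snps.vcf",
    "ASE.tsv",
    "AF.histogram.pdf",
]


def missing_result_files(sample_dir):
    """Return the expected result files that are absent from sample_dir."""
    sample_dir = Path(sample_dir)
    return [name for name in EXPECTED_FILES if not (sample_dir / name).is_file()]


# Assumed location of the workshop results for sample ENCSR000COQ
missing = missing_result_files("results/ENCSR000COQ")
print("missing files:", missing if missing else "none")
```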