Variant calls from capture-GBS data for chromosome 5 of a tetraploid (4x) bi-parental cross of Solanum tuberosum. The variant calling pipeline was kept simple, with minimal QC and filtering, to produce a rough example VCF for testing IO.
Additionally, an example mixed-ploidy VCF has been generated by (incorrectly) treating 10 samples as diploid instead of tetraploid.
Note that variant calling was performed with freebayes, which encodes null genotype calls as a single `.` regardless of ploidy.
According to the VCF specification, a single `.` should be interpreted as a haploid null genotype, and hence some tools may (correctly) interpret samples containing null calls as having variable ploidy.
Due to this issue, filtered VCFs have been included in which all null calls have been removed.
See freebayes issue #172.
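
The ploidy ambiguity described above can be illustrated with a minimal Python sketch. It is not part of the pipeline; the helper names `genotype_ploidy` and `is_null` and the example GT strings are hypothetical. It infers ploidy by counting the alleles in each GT string, which is roughly how a ploidy-aware reader might behave.

```python
def genotype_ploidy(gt: str) -> int:
    """Return the number of alleles encoded in a VCF GT string."""
    # Alleles are separated by "/" (unphased) or "|" (phased).
    return len(gt.replace("|", "/").split("/"))


def is_null(gt: str) -> bool:
    """True if every allele in the GT string is missing (".")."""
    return all(a == "." for a in gt.replace("|", "/").split("/"))


# Hypothetical GT values for one tetraploid sample across four variants.
# The third call is a null genotype written as a single "." (freebayes style).
calls = ["0/0/1/1", "0/1/1/1", ".", "0/0/0/1"]

ploidies = {genotype_ploidy(gt) for gt in calls}
print(sorted(ploidies))  # [1, 4]
```

Because the single `.` counts as one allele, the sample appears to switch between ploidy 1 and ploidy 4. A null call written as `./././.` would keep the apparent ploidy at 4.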
Thanks to:
- Samantha Baldwin
- Katrina Monaghan
- Susan Thomson
for producing and publishing the sequencing data.
- NCBI BioProject: PRJNA414303
- NCBI SRR sample list: SRR_Acc_List.txt
- NCBI SRA sample data: SraRunTable.csv
- Reference Genome: PGSC v4.03
- Note: Reference genome includes unanchored regions (`ST4.03ch00`)
Software versions:
sratoolkit/2.8.2.1
samtools/1.3.1
bwa/0.7.17
freebayes/1.3.4
bcftools/1.12
tabix/0.2.6
- Download data: `01_download.sh`
- Align reads: `02_alignment.sh`
- Call variants: `03_calling.sh`
- Filtering: `04_filter.sh`
Tetraploid calls:
- Raw variant calls: `PRJNA414303.CHR5.vcf.gz`
- Filtered to depth >= 10 in at least one sample: `PRJNA414303.CHR5.filterDP10.vcf.gz`
- Filtered to exclude null calls: `PRJNA414303.CHR5.filterNullGT.vcf.gz`
Mixed-ploidy calls:
- Raw variant calls: `PRJNA414303.CHR5.mixed.vcf.gz`
- Filtered to depth >= 10 in at least one sample: `PRJNA414303.CHR5.mixed.filterDP10.vcf.gz`
- Filtered to exclude null calls: `PRJNA414303.CHR5.mixed.filterNullGT.vcf.gz`
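
The two filter criteria listed above can be sketched as plain Python predicates over a single VCF data line. This is only an illustration of the logic, not the contents of `04_filter.sh`. The function names and the two example records are hypothetical, and the sketch assumes that a "null call" means any genotype containing a missing allele and that per-sample depth is stored in the `DP` FORMAT field.

```python
def sample_fields(line: str) -> list:
    """Split one VCF data line into per-sample {FORMAT key: value} dicts."""
    cols = line.rstrip("\n").split("\t")
    keys = cols[8].split(":")
    # Trailing FORMAT values may be omitted, so zip stops at the shorter list.
    return [dict(zip(keys, s.split(":"))) for s in cols[9:]]


def passes_dp10(line: str, min_dp: int = 10) -> bool:
    """Keep the record if at least one sample has DP >= min_dp."""
    for fields in sample_fields(line):
        dp = fields.get("DP", ".")
        if dp != "." and int(dp) >= min_dp:
            return True
    return False


def has_null_gt(line: str) -> bool:
    """True if any sample's genotype contains a missing allele."""
    for fields in sample_fields(line):
        alleles = fields.get("GT", ".").replace("|", "/").split("/")
        if "." in alleles:
            return True
    return False


# Two made-up records with two samples each (not taken from the real files).
rec_a = "ST4.03ch05\t1000\t.\tA\tG\t50\t.\t.\tGT:DP\t0/0/1/1:12\t.:."
rec_b = "ST4.03ch05\t2000\t.\tC\tT\t50\t.\t.\tGT:DP\t0/0/0/1:4\t0/1/1/1:7"

for rec in (rec_a, rec_b):
    print(passes_dp10(rec), has_null_gt(rec))
```

Under these assumptions, `rec_a` would be kept by the depth filter but dropped by the null-call filter, while `rec_b` would be dropped by the depth filter and kept by the null-call filter.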