Module: Sequencing
Niema Moshiri edited this page Jul 9, 2018 · 34 revisions
The Sequencing module simulates sequencing imperfections, such as the following:
- Sequence subsampling per individual
- Sequencing error
- Post-processing
- Consensus (ambiguity, etc.)
See the source code for what the abstract class defines.
- Uses ART to simulate realistic Roche 454 reads (amplicon sequencing)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_454_path`: The path to your `art_454` executable (or simply `"art_454"` if it is in your `PATH` variable)
  - `art_454_options`: The command-line arguments with which to run `art_454` (excluding `<-A|-B>`, `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, and `<#_READS/#_READ_PAIRS_PER_AMPLICON>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_454_amplicon_mode`: The desired mode of amplicon sequencing
    - Specify `"single"` for single-end amplicon sequencing
    - Specify `"paired"` for paired-end amplicon sequencing
  - `art_454_reads_pairs_per_amplicon`: Number of reads (single-end) or read pairs (paired-end) per amplicon
  - `out_dir`: The simulation's output directory
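As a rough illustration of how these parameters might fit together, the sketch below turns a config dictionary into an `art_454` argument list. The `build_art_454_amplicon_cmd` helper, the input file name, and the output prefix are hypothetical. The argument order simply follows the placeholder order listed under `art_454_options`, and mapping `"single"` to `-A` and `"paired"` to `-B` is an assumption about ART's amplicon flags, not a copy of the module's actual implementation.

```python
import shlex

# Assumed mapping from art_454_amplicon_mode to ART's amplicon-mode flag.
MODE_FLAG = {"single": "-A", "paired": "-B"}


def build_art_454_amplicon_cmd(config, input_fasta, output_prefix):
    """Hypothetical helper: turn the config values into an argument list."""
    mode = config["art_454_amplicon_mode"]
    if mode not in MODE_FLAG:
        raise ValueError('art_454_amplicon_mode must be "single" or "paired"')
    count = int(config["art_454_reads_pairs_per_amplicon"])
    if count < 1:
        raise ValueError("art_454_reads_pairs_per_amplicon must be at least 1")
    # shlex.split("") returns [], so the empty string adds no extra arguments.
    extra = shlex.split(config["art_454_options"])
    return [config["art_454_path"], *extra, MODE_FLAG[mode],
            input_fasta, output_prefix, str(count)]


# Example values; the file names are placeholders, not real simulation outputs.
example_config = {
    "art_454_path": "art_454",
    "art_454_options": "",
    "art_454_amplicon_mode": "paired",
    "art_454_reads_pairs_per_amplicon": 10,
    "out_dir": "favites_output",
}
cmd = build_art_454_amplicon_cmd(example_config, "individual.fasta", "individual_reads")
print(" ".join(cmd))  # art_454 -B individual.fasta individual_reads 10
```

Building a list rather than a single shell string avoids quoting problems if the list is later handed to something like `subprocess.run`.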
- Uses ART to simulate realistic Roche 454 reads (paired-end)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_454_path`: The path to your `art_454` executable (or simply `"art_454"` if it is in your `PATH` variable)
  - `art_454_options`: The command-line arguments with which to run `art_454` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<FOLD_COVERAGE>`, `<MEAN_FRAG_LEN>`, and `<STD_DEV>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_454_fold_coverage`: The desired fold of read coverage
  - `art_454_mean_frag_len`: The average DNA fragment size for paired-end read simulation
  - `art_454_std_dev`: The standard deviation of the DNA fragment size for paired-end read simulation
  - `out_dir`: The simulation's output directory
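Since the three numeric parameters are handed to `art_454` as positional arguments, a quick sanity check before a long simulation can catch obvious mistakes. The following is a minimal sketch: the `check_art_454_paired_end` helper is hypothetical, and the constraints it enforces (positive coverage, positive mean fragment length, non-negative standard deviation) are common-sense assumptions rather than documented rules.

```python
def check_art_454_paired_end(config):
    """Hypothetical helper: return a list of problems in the numeric settings."""
    problems = []
    if float(config["art_454_fold_coverage"]) <= 0:
        problems.append("art_454_fold_coverage must be positive")
    if float(config["art_454_mean_frag_len"]) <= 0:
        problems.append("art_454_mean_frag_len must be positive")
    if float(config["art_454_std_dev"]) < 0:
        problems.append("art_454_std_dev must not be negative")
    return problems


# Example values chosen only for illustration.
good = {"art_454_fold_coverage": 10, "art_454_mean_frag_len": 1500, "art_454_std_dev": 20}
bad = {"art_454_fold_coverage": 0, "art_454_mean_frag_len": 1500, "art_454_std_dev": -1}
print(check_art_454_paired_end(good))
print(check_art_454_paired_end(bad))
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in one pass.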
- Uses ART to simulate realistic Roche 454 reads (single-end)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_454_path`: The path to your `art_454` executable (or simply `"art_454"` if it is in your `PATH` variable)
  - `art_454_options`: The command-line arguments with which to run `art_454` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, and `<FOLD_COVERAGE>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_454_fold_coverage`: The desired fold of read coverage
  - `out_dir`: The simulation's output directory
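For intuition about `art_454_fold_coverage`: under the usual definition of fold coverage (total sequenced bases divided by the length of the sequence being sequenced), the number of reads a given coverage implies can be estimated with a little arithmetic. The sequence length and mean read length below are illustrative assumptions, and ART's own read count may differ from this back-of-the-envelope estimate.

```python
import math


def expected_read_count(fold_coverage, sequence_length, mean_read_length):
    """Estimate reads such that reads * read_length / sequence_length >= coverage."""
    return math.ceil(fold_coverage * sequence_length / mean_read_length)


# Illustrative numbers: a 9,000 bp sequence and a 400 bp mean read length.
print(expected_read_count(20, 9000, 400))  # 20 * 9000 / 400 = 450
```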
- Uses ART to simulate realistic Illumina NGS sequence data from the true sequences
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_illumina_path`: The path to your `art_illumina` executable (or simply `"art_illumina"` if it is in your `PATH` variable)
  - `art_illumina_options`: The command-line arguments with which to run `art_illumina` (excluding `-i` and `-o`)
  - `out_dir`: The simulation's output directory
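Because `-i` and `-o` must be left out of `art_illumina_options`, it can help to reject an options string that includes them before building the command. The sketch below shows one way to do that; the `build_art_illumina_cmd` helper and the file names are hypothetical, treating the long forms `--in` and `--out` as reserved too is an assumption, and the example option values are arbitrary.

```python
import shlex

# Flags assumed to be supplied by the module itself (short and long forms).
RESERVED_FLAGS = {"-i", "--in", "-o", "--out"}


def build_art_illumina_cmd(config, input_fasta, output_prefix):
    """Hypothetical helper: append -i and -o to the user-supplied options."""
    extra = shlex.split(config["art_illumina_options"])
    clash = sorted(RESERVED_FLAGS.intersection(extra))
    if clash:
        raise ValueError("remove from art_illumina_options: " + ", ".join(clash))
    return [config["art_illumina_path"], *extra,
            "-i", input_fasta, "-o", output_prefix]


# Example values; the option string and file names are placeholders.
example_config = {
    "art_illumina_path": "art_illumina",
    "art_illumina_options": "-ss HS25 -l 150 -f 10",
    "out_dir": "favites_output",
}
cmd = build_art_illumina_cmd(example_config, "individual.fasta", "individual_reads")
print(" ".join(cmd))
```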
- Uses ART to simulate realistic SOLiD reads (amplicon mate-pair, F3-R3)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_SOLiD_path`: The path to your `art_SOLiD` executable (or simply `"art_SOLiD"` if it is in your `PATH` variable)
  - `art_SOLiD_options`: The command-line arguments with which to run `art_SOLiD` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<LEN_READ>`, and `<READ_PAIRS_PER_AMPLICON>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_SOLiD_len_read`: The desired length of F3/R3 reads (max 75)
  - `art_SOLiD_read_pairs_per_amplicon`: The desired number of read pairs per amplicon
  - `out_dir`: The simulation's output directory
- Uses ART to simulate realistic SOLiD reads (amplicon paired-end, F3-F5)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_SOLiD_path`: The path to your `art_SOLiD` executable (or simply `"art_SOLiD"` if it is in your `PATH` variable)
  - `art_SOLiD_options`: The command-line arguments with which to run `art_SOLiD` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<LEN_READ_F3>`, `<LEN_READ_F5>`, and `<READ_PAIRS_PER_AMPLICON>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_SOLiD_len_read_F3`: The desired length of F3 reads (max 75)
  - `art_SOLiD_len_read_F5`: The desired length of F5 reads (max 75)
  - `art_SOLiD_read_pairs_per_amplicon`: The desired number of read pairs per amplicon
  - `out_dir`: The simulation's output directory
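The SOLiD read-length parameters all share the same upper bound of 75, so a single check can cover them. This is a minimal sketch: the `check_solid_read_length` helper is hypothetical, and the lower bound of 1 is an assumption (only the maximum of 75 is stated in the parameter descriptions).

```python
MAX_SOLID_READ_LEN = 75  # upper bound quoted in the parameter descriptions


def check_solid_read_length(config, key):
    """Hypothetical helper: return the read length if it is within bounds."""
    length = int(config[key])
    if not 1 <= length <= MAX_SOLID_READ_LEN:
        raise ValueError(
            f"{key} must be between 1 and {MAX_SOLID_READ_LEN}, got {length}"
        )
    return length


# Example values chosen only for illustration.
example_config = {"art_SOLiD_len_read_F3": 50, "art_SOLiD_len_read_F5": 35}
for key in ("art_SOLiD_len_read_F3", "art_SOLiD_len_read_F5"):
    print(key, check_solid_read_length(example_config, key))
```

The same function would apply unchanged to `art_SOLiD_len_read` in the single-end and mate-pair variants.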
- Uses ART to simulate realistic SOLiD reads (amplicon single-end, F3)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_SOLiD_path`: The path to your `art_SOLiD` executable (or simply `"art_SOLiD"` if it is in your `PATH` variable)
  - `art_SOLiD_options`: The command-line arguments with which to run `art_SOLiD` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<LEN_READ>`, and `<READS_PER_AMPLICON>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_SOLiD_len_read`: The desired length of F3 reads (max 75)
  - `art_SOLiD_reads_per_amplicon`: The desired number of reads per amplicon
  - `out_dir`: The simulation's output directory
- Uses ART to simulate realistic SOLiD reads (mate-pair, F3-R3)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_SOLiD_path`: The path to your `art_SOLiD` executable (or simply `"art_SOLiD"` if it is in your `PATH` variable)
  - `art_SOLiD_options`: The command-line arguments with which to run `art_SOLiD` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<LEN_READ>`, and `<FOLD_COVERAGE>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_SOLiD_len_read`: The desired length of F3/R3 reads (max 75)
  - `art_SOLiD_fold_coverage`: The desired fold of read coverage
  - `art_SOLiD_mean_frag_len`: The mean fragment size for mate-pair read simulation
  - `art_SOLiD_std_dev`: The standard deviation of the fragment size for mate-pair simulation
  - `out_dir`: The simulation's output directory
- Uses ART to simulate realistic SOLiD reads (paired-end, F3-F5)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_SOLiD_path`: The path to your `art_SOLiD` executable (or simply `"art_SOLiD"` if it is in your `PATH` variable)
  - `art_SOLiD_options`: The command-line arguments with which to run `art_SOLiD` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<LEN_READ_F3>`, `<LEN_READ_F5>`, `<FOLD_COVERAGE>`, `<MEAN_FRAG_LEN>`, and `<STD_DEV>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_SOLiD_len_read_F3`: The desired length of F3 reads (max 75)
  - `art_SOLiD_len_read_F5`: The desired length of F5 reads (max 75)
  - `art_SOLiD_fold_coverage`: The desired fold of read coverage
  - `art_SOLiD_mean_frag_len`: The mean fragment size for paired-end read simulation
  - `art_SOLiD_std_dev`: The standard deviation of the fragment size for paired-end simulation
  - `out_dir`: The simulation's output directory
- Uses ART to simulate realistic SOLiD reads (single-end, F3)
- Generates one sequencing run per sampled individual
- Requirements:
  - ART
- Config Parameters:
  - `art_SOLiD_path`: The path to your `art_SOLiD` executable (or simply `"art_SOLiD"` if it is in your `PATH` variable)
  - `art_SOLiD_options`: The command-line arguments with which to run `art_SOLiD` (excluding `<INPUT_SEQ_FILE>`, `<OUTPUT_FILE_PREFIX>`, `<LEN_READ>`, and `<FOLD_COVERAGE>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `art_SOLiD_len_read`: The desired length of F3 reads (max 75)
  - `art_SOLiD_fold_coverage`: The desired fold of read coverage
  - `out_dir`: The simulation's output directory
- Uses DWGSIM to simulate realistic NGS sequence data from the true sequences
- Generates one sequencing run per sampled individual
- Requirements:
  - DWGSIM
- Config Parameters:
  - `dwgsim_path`: The path to your DWGSIM executable (or simply `"dwgsim"` if it is in your `PATH` variable)
  - `dwgsim_options`: The command-line options with which to run DWGSIM (just the options, not `<in.ref.fa>` or `<out.prefix>`)
    - To use default settings, simply use the empty string (i.e., `""`)
  - `out_dir`: The simulation's output directory
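Each `*_path` parameter accepts either a full path or a bare executable name that is looked up on `PATH`. One way to handle both cases uniformly is Python's `shutil.which`, sketched below. The `resolve_executable` helper is hypothetical, and the running Python interpreter stands in for `dwgsim` so that the example works on a machine without DWGSIM installed.

```python
import shutil
import sys


def resolve_executable(path_or_name):
    """Hypothetical helper: locate an executable given a path or a bare name."""
    # shutil.which accepts an explicit path as well as a name searched on PATH.
    found = shutil.which(path_or_name)
    if found is None:
        raise FileNotFoundError(f"could not find executable: {path_or_name}")
    return found


# Stand-in for a value such as dwgsim_path: the current Python interpreter.
print(resolve_executable(sys.executable))
```

Failing early with a clear message is usually friendlier than letting the simulation stop partway through when the external tool cannot be launched.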
- Uses Grinder to simulate realistic Sanger sequence data from the true sequences
- Generates one sequencing run per sampled individual
- Requirements:
  - Grinder
- Config Parameters:
  - `grinder_path`: The path to your Grinder executable (or simply `"grinder"` if it is in your `PATH` variable)
  - `out_dir`: The simulation's output directory
- Do not output any sequences
- Requirements:
  - None
- Config Parameters:
  - None
- Returns full error-free sequences for all viruses
- Requirements:
  - None
- Config Parameters:
  - `out_dir`: The simulation's output directory
Niema Moshiri & Siavash Mirarab 2016