HIV Volatility project:
- Divide a fasta file into multiple fasta files based on identifiers
- Divide a csv file into multiple csv/fasta files based on identifiers
- Calculate the volatility
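
A minimal sketch of the FASTA-splitting step, not the project's actual script. It assumes the identifier is the part of each header before the first "." (for example the subtype in a header such as "B.KR.01"); the function names and the default key are hypothetical and should be adapted to the real header layout.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_identifier(text, key=lambda header: header.split(".")[0]):
    """Group FASTA records by a key derived from each header.

    The default key (text before the first '.') is an assumption.
    """
    groups = defaultdict(list)
    for header, seq in parse_fasta(text):
        groups[key(header)].append((header, seq))
    return dict(groups)


def to_fasta(records):
    """Serialise (header, sequence) pairs back to FASTA text."""
    return "".join(f">{h}\n{s}\n" for h, s in records)
```

Each group returned by split_by_identifier can then be written to its own file, for example by passing to_fasta(records) to a file opened under a name built from the group key.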
HIV AA Frequency project:
- Amino Acids Distribution
- In-host Amino Acids Distribution
- Delta of Amino Acids Distribution
- Euclidean Distance
- Automate Log Conversion Process
- Distribution Percentage Cutoff
- Translate an Excel/CSV row into a Python list
- Filter out accession numbers from B.KR in a .NWK format file
- Convert AA sequence from csv to fasta
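
The distribution, delta, and Euclidean distance steps above could be sketched as follows. This is an illustration under stated assumptions, not the project's code: sequences are assumed to be aligned and of equal length, positions are 0-based, only the 20 standard amino acids are counted, and all function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumption)


def aa_distribution(sequences, position):
    """Return the fraction of each amino acid at a 0-based alignment position."""
    counts = Counter(
        seq[position] for seq in sequences if seq[position] in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def delta(dist_a, dist_b):
    """Per-residue difference between two distributions."""
    return {aa: dist_a[aa] - dist_b[aa] for aa in AMINO_ACIDS}


def euclidean_distance(dist_a, dist_b):
    """Euclidean distance between two distributions over the same residues."""
    return math.sqrt(sum(d * d for d in delta(dist_a, dist_b).values()))
```

For an in-host comparison, the same functions could be applied to the sequences of a single patient instead of a whole population; how the groups are formed is left to the caller.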
SARS-CoV-2 project:
- Extract sequences
- Clean up sequences
- CSV conversion
Flu project:
- NEW APPROACH: clean data by searching for non-ACTG characters and removing the corresponding sequences directly from a nucleotide file
- Calculate FD and stdev with multiple selection options (e.g. Position, seasons, group, country, ...)
- Calculate the difference between different FD profiles
- Match sequence with all attributes
- Match sequence with groups (grouped by newick tree)
- Assign Hydropathy Value
- Combine several fasta format files into one fasta file
- Remove the 8 characters and the dash preceding each dash in a newick format file
- Group Amino Acids
- Equally distribute a stdev_output into 2 groups
- Extract position pairs based on a given range of p_value in a co-volatility matrix
- Compare two groups of position pairs and extract overlapping pairs
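
Two of the steps above, the non-ACTG cleaning and the overlap of position pairs, can be sketched as below. The sketch assumes sequences are already loaded as (header, sequence) tuples, that any character other than A, C, T, or G (including gaps and N) disqualifies a sequence, and that a position pair is unordered; the function names are hypothetical.

```python
import re

NON_ACTG = re.compile(r"[^ACTG]")


def drop_ambiguous(records):
    """Keep only (header, sequence) records whose sequence is pure A/C/T/G.

    The check is case-insensitive; the original sequence text is preserved.
    """
    return [(h, s) for h, s in records if not NON_ACTG.search(s.upper())]


def overlapping_pairs(pairs_a, pairs_b):
    """Return position pairs present in both groups, ignoring order within a pair."""
    set_a = {tuple(sorted(p)) for p in pairs_a}
    set_b = {tuple(sorted(p)) for p in pairs_b}
    return sorted(set_a & set_b)
```

If gaps should be tolerated in aligned nucleotide files, the pattern could be relaxed to r"[^ACTG-]"; which behaviour the original script uses is not stated here.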
HCV project:
- (New version) Find Accession Numbers that contain special characters (e.g. #, $)
- Remove Accession Numbers that contain special characters
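
A minimal sketch of the two HCV steps, assuming accession numbers are available as a list of strings and that "special" means any character other than letters, digits, underscores, and periods. That definition and the function names are assumptions for illustration only.

```python
import re

# Anything outside letters, digits, '_' and '.' counts as special (assumption).
SPECIAL = re.compile(r"[^A-Za-z0-9_.]")


def find_special(accessions):
    """Return the accession numbers that contain a special character."""
    return [acc for acc in accessions if SPECIAL.search(acc)]


def remove_special(accessions):
    """Return the accession numbers that contain no special character."""
    return [acc for acc in accessions if not SPECIAL.search(acc)]
```

Running find_special first makes it possible to review the flagged entries before dropping them with remove_special.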