This is the GitHub repository for Madelyne Xiao's 2018-2019 BridgeUP:STEM internship group.
This is the home of #coralcrew's internship code and projects. By the end of the internship, we will have:
- written data cleaning, analysis, and visualization scripts for mitogenome data
- gained a deeper understanding of black coral and anemone phylogenies
- built some phylogenetic trees of our own!
Jot down some of your own goals here:
Below, you'll find a week-by-week breakdown of our work.
Week 18: Beginning Data Visualization
- reviewing Python pandas DataFrames
- getting our feet wet with some sunspot data (graphing and binning data points)!
- AT and GC skew analysis for leading/lagging strand identification
- purine and pyrimidine skew analysis for heavy/light strand identification
- building our own DNA walk visualizers using Python's tkinter package
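
To make the skew analysis above concrete, here is a minimal sketch of a windowed GC skew calculation. The function names, the window size, and the choice to report 0.0 for windows without any G or C are our own assumptions for illustration, not the exact scripts we wrote in session.

```python
def gc_skew(seq, window=100):
    """Return GC skew, (G - C) / (G + C), for each non-overlapping window."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Assumption: a window with no G or C has undefined skew, so report 0.0.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative(values):
    """Return the running total of a list of skew values."""
    total = 0.0
    running = []
    for value in values:
        total += value
        running.append(total)
    return running
```

Swapping `"G"` and `"C"` for `"A"` and `"T"` gives the AT skew, and plotting `cumulative(gc_skew(seq))` is one way to look for points where the skew changes sign.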
Week 15: Phylogenetics in the Wild
- finishing translations of our data files
- exploring some well-known phylogenetics concepts, as found in the museum!
- a talk with Dr. Nathalie Goodkin, a marine chemist in Earth and Planetary Sciences
Week 14: Genetic Codes and Translation
- learning about different types of genetic codes
- translating our raw nucleotide data files using Python dictionaries
- reviewing the pipeline we've built over the past three months, from raw data to tree
- starting an exploratory data analysis of our raw FASTA files
- learning about the variety of tree-drawing methods available
- using PHYLIP to build some phylogenetic trees from our distance matrix input files
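
As a reminder of how dictionary-based translation works, here is a small sketch. It builds the standard genetic code (NCBI translation table 1) as a Python dictionary, plus a variant with TGA read as tryptophan, which is the one difference in NCBI table 4 (the mold, protozoan, and coelenterate mitochondrial code). The names `STANDARD_CODE`, `COELENTERATE_MITO_CODE`, and `translate` are made up for this example, and which table fits a given data file is something to check rather than assume.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the standard code, in TTT, TTC, TTA, TTG, TCT, ... order.
STANDARD_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), STANDARD_AMINO_ACIDS)
}
# Same table, except TGA codes for tryptophan (W) instead of stop.
COELENTERATE_MITO_CODE = dict(STANDARD_CODE, TGA="W")


def translate(seq, code=STANDARD_CODE):
    """Translate a nucleotide string codon by codon; '*' marks a stop codon."""
    seq = seq.upper()
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Codons with ambiguous bases (e.g. N) are reported as X.
        protein.append(code.get(codon, "X"))
    return "".join(protein)
```

For example, `translate("ATGGCCTGA")` ends in a stop under the standard code, while passing `code=COELENTERATE_MITO_CODE` reads the final codon as W.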
Week 11: Presentations and break
- presentations -- great job, coral crew!
Week 10: More Alignment and Distance Matrices
- creating input files for tree construction!
- preparing for next week's mid-year presentation
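
One possible shape for a tree-construction input file is the classic PHYLIP distance matrix layout: the number of sequences on the first line, then one row per sequence with the name in a ten-character field followed by its distances. The sketch below assumes that strict ten-character layout; the function name `format_phylip_matrix` and the four-decimal formatting are our own choices for illustration.

```python
def format_phylip_matrix(names, matrix):
    """Return a square distance matrix as PHYLIP-style text."""
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, matrix):
        # Assumption: names are cut or padded to exactly ten characters.
        label = name[:10].ljust(10)
        values = " ".join(f"{distance:.4f}" for distance in row)
        lines.append(f"{label} {values}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file (for example with `open("infile", "w")`) gives something you can inspect by eye before handing it to a tree-building program.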
Week 9: More Alignment and Distance Matrices
- building distance matrices from alignment files (and, from distance matrices, building trees!)
- start thinking about Dec. 20 mid-year presentations
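
To illustrate the step from alignment to distance matrix, here is a minimal sketch that uses the p-distance (the fraction of compared positions that differ). This is one simple distance measure picked for the example, not necessarily the one we used in session; skipping gap positions and the function names are also assumptions.

```python
def p_distance(a, b):
    """Fraction of aligned, non-gap positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        # Assumption: positions with a gap in either sequence are skipped.
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    return differences / compared if compared else 0.0


def distance_matrix(sequences):
    """Return a square list-of-lists matrix of pairwise p-distances."""
    return [[p_distance(a, b) for b in sequences] for a in sequences]
```

The result is symmetric with zeros on the diagonal, which is a quick sanity check before building a tree from it.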
Week 8: Finishing Assembly, More Alignment
- we'll present our solutions to the assembly algorithm (at long last!)
- more work on BLAST with Biopython, intro to distance matrices
- short week -- on Tuesday, we spoke to Mercer about his research and got a tour of the wet collections, where we saw a giant squid and a giant aquatic isopod!
Week 6: Assembly cont'd, beginning Alignment
- made our own de Bruijn graphs with texts of our choice!
- learned about the bridges of Königsberg problem and Eulerian walks
- began working with BLAST for sequence lookups
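
For anyone who wants to rebuild a de Bruijn graph from a text, here is a small sketch. It works on characters, with each k-mer becoming an edge from its first k-1 characters to its last k-1 characters; the function name `de_bruijn` and the default `k=3` are assumptions for the example.

```python
from collections import defaultdict


def de_bruijn(text, k=3):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in text."""
    graph = defaultdict(list)
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        # Each k-mer is an edge: prefix node -> suffix node.
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

An Eulerian walk through this graph uses every edge exactly once, which is how the graph can spell the original text back out.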
Week 5: Assembly Algorithms (short week)
- on Thursday, worked on writing our own brute-force algorithms to align two nucleotide contigs.
- also discussed big-O notation, algorithmic efficiency, and some common algorithmic problems (e.g., sorting, traveling salesman)
- discussed the process of genome assembly through a hands-on 'assembly' of familiar and less-familiar sets of song lyrics (context is important!)
- tried our hand at building trees with a candy phylogeny (happy Halloween!)
- began writing our own functions to solve the noisy data problems we encountered last week
- learned about some common features of noisy biological data
- intro to a new Python package, Biopython!
- played a collaborative story-writing game called Exquisite Corpse to practice git pulling, pushing, and text-editing on the command line
- learned some nifty new Unix commands and shortcuts!
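
The brute-force alignment of two nucleotide contigs mentioned in the list above can be sketched roughly as follows. This version only looks for an exact match between the end of one contig and the start of the other, trying the longest overlap first; the names `best_overlap` and `merge` and the `min_len` cutoff are hypothetical, and real reads would also need some tolerance for mismatches.

```python
def best_overlap(left, right, min_len=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    longest_possible = min(len(left), len(right))
    # Brute force: try every overlap length from longest to shortest.
    for size in range(longest_possible, min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge(left, right, min_len=3):
    """Join two contigs, collapsing their overlap if one is found."""
    size = best_overlap(left, right, min_len)
    return left + right[size:]
```

Each candidate length costs a string comparison, so the work grows roughly quadratically with contig length in the worst case, which ties back to our big-O discussion.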