Write abstract #6
I tested grid-search-based parameter optimization with the Python tool ParOpt. I wrote a Python wrapper function that runs ABySS and returns the single value we were interested in optimizing, N50. First, I optimized k, the k-mer length. I then extended the method to simultaneously optimize four parameters: k, s, n, and l. Afterwards, to assess the quality of the genome assemblies, I generated QUAST reports for all optimization runs, using the reference genome to evaluate the completeness of each assembly.
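A grid search of this kind can be sketched as follows. Note that `run_abyss_n50` is a hypothetical stand-in for the ABySS wrapper, with a synthetic objective (peaking at k=28, s=200) so the sketch runs without an assembler:

```python
from itertools import product

def run_abyss_n50(k, s, n, l):
    """Hypothetical stand-in for a wrapper that runs ABySS and parses
    N50 from its output; a synthetic surrogate is used here so the
    sketch is runnable without an assembler."""
    return 20000 - (k - 28) ** 2 - abs(s - 200)

def grid_search(grid):
    """Evaluate every parameter combination and return the best one."""
    best_params, best_n50 = None, float("-inf")
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        n50 = run_abyss_n50(**params)
        if n50 > best_n50:
            best_params, best_n50 = params, n50
    return best_params, best_n50

grid = {"k": range(20, 41, 4), "s": [100, 200, 500],
        "n": [5, 10], "l": [40, 50]}
best, score = grid_search(grid)
```

Grid search is exhaustive over the chosen grid, which is why the later comments explore smarter optimizers that need far fewer assembly runs.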
I wrote an R script to plot a heat map of the Spearmint optimization output.
I used optimx, an R library that wraps several (14) optimizers. To use R, I wrote an R wrapper function to call ABySS. Of all the methods, nmkb performed best. That was not surprising, as nmkb is a variation of Nelder-Mead for discrete parameter optimization.
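The R code for optimx/nmkb isn't shown in the thread; as a sketch of the same idea in Python, SciPy's Nelder-Mead can be combined with rounding inside the objective, the trick nmkb-style methods use to handle integer parameters. The surrogate objective (with an assumed peak at k = 28) stands in for an ABySS run:

```python
from scipy.optimize import minimize

def n50_surrogate(k):
    """Synthetic stand-in for an ABySS-calling wrapper: returns an
    N50-like score peaking at k = 28 (an assumption for illustration)."""
    return 20000 - (k - 28) ** 2

def neg_n50_rounded(x):
    # Round the continuous proposal to the nearest integer k (the
    # discrete-parameter trick), and negate because minimize()
    # minimizes while we want to maximize N50.
    k = int(round(x[0]))
    return -n50_surrogate(k)

res = minimize(neg_n50_rounded, x0=[40.0], method="Nelder-Mead")
best_k = int(round(res.x[0]))
```

Because the rounded objective is piecewise constant, simplex methods can stall on plateaus, which is one reason purpose-built variants like nmkb exist.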
I used Jupyter notebook widgets with matplotlib to visualize how each iteration of the optimization changes the parameters to maximize the objective. An example .ipynb file is available.
Genetic algorithms (GAs) are a family of adaptive algorithms that have been applied to many optimization problems, including the identification of Pareto frontiers (the multi-metric optimal solution space of an evaluated function). GAs are also used in optimization problems because they are tractable for global optimization of multiple parameters simultaneously, in the presence of many local optima.
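As a minimal illustration of the GA machinery described above — the fitness surrogate, operators, and population settings are all invented for this sketch, not taken from any hackathon code:

```python
import random

random.seed(0)

def fitness(k, s):
    """Synthetic surrogate for an assembly metric such as N50
    (the peak at k = 28, s = 200 is an assumption for illustration)."""
    return 20000 - (k - 28) ** 2 - abs(s - 200)

def mutate(ind):
    # Small random perturbations of each parameter.
    k, s = ind
    return (k + random.choice([-2, -1, 0, 1, 2]),
            s + random.choice([-50, 0, 50]))

def crossover(a, b):
    # Take k from one parent and s from the other.
    return (a[0], b[1])

def ga(generations=40, pop_size=20):
    pop = [(random.randint(16, 64), random.choice([100, 200, 500]))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection + elitism
        children = [mutate(crossover(random.choice(elite),
                                     random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=lambda ind: fitness(*ind))

best = ga()
```

Because selection, crossover, and mutation act on a whole population, GAs can escape local optima that trap single-point methods, at the cost of many more objective evaluations — expensive when each evaluation is a full assembly.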
Apologies for making it so long; I combined my introduction with some of the interesting observations. Cheers guys, it was great working with all of you!
I tested a given data set (500k.fq?) via Bayesian optimization using Spearmint and MongoDB. I collaborated with Shaun Jackman on this subpart, and we extended abyss.py to optimize two parameters, k and s. After 20 iterations, a maximum N50 of 19356 was found at k=28, s=200.
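A Spearmint experiment supplies its objective in a Python file; the sketch below follows the `main(job_id, params)` entry-point style used in Spearmint's examples (check your Spearmint version), with a surrogate standing in for actually running abyss.py:

```python
def main(job_id, params):
    """Spearmint-style entry point. `params` maps each variable name
    (declared in the experiment config) to an array of proposed values.
    Spearmint minimizes the return value, so we negate the N50 score."""
    k = int(params["k"][0])
    s = int(params["s"][0])
    # Surrogate standing in for running abyss.py and parsing N50;
    # the peak at k=28, s=200 with N50=19356 mirrors the values
    # reported in this comment, purely for illustration.
    n50 = 19356 - (k - 28) ** 2 - abs(s - 200)
    return -n50
```

Spearmint fits a Gaussian-process model to the observed (parameters, score) pairs and proposes the next point to evaluate, which is why it can find a good (k, s) in only ~20 assembly runs.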
I've implemented Python functions that call ABySS and launch assembly runs for OPAL (OPtimization ALgorithm), a Python/Cython framework built on the NOMAD (Nonlinear Optimization by Mesh Adaptive Direct Search) solver, which is written in C++ and treats the target program as a black box. This allowed me to optimize the k value against the N50 metric (to be maximized) on the 200k dataset. Although the NOMAD algorithm behind OPAL supports multiple continuous and discrete variables, I have yet to try them. For reporting results, I still need a way to record and display all inputs versus outputs across runs, since the program currently writes only the final optimized value to a single file, which is overwritten on each run. I also need to add some conditional checks so the program neither hangs indefinitely on a failed ABySS make caused by invalid inputs, nor skips files already created by previous runs.
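One simple way to keep the inputs-versus-outputs record from being overwritten is to append each run to a CSV log instead of rewriting a single file. A minimal sketch — the file name and `log_run` helper are hypothetical, not part of OPAL:

```python
import csv
import os

LOG_PATH = "opal_runs.csv"  # hypothetical log file name

def log_run(params, n50, path=LOG_PATH):
    """Append one optimization run (its input parameters and the
    resulting N50) to a CSV log, writing a header on first use so
    earlier runs are never overwritten."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as fh:
        writer = csv.writer(fh)
        if new_file:
            writer.writerow(list(params) + ["N50"])
        writer.writerow(list(params.values()) + [n50])

# Example usage with invented values:
log_run({"k": 32}, 15000)
log_run({"k": 40}, 12000)
```

Appending per run also makes it easy to plot the full optimization trajectory afterwards, as the other comments do with heat maps and notebook widgets.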
Hi, all. We have to write our scientific and lay abstract. See hackseq/October_2016#76
Please write a few sentences about your work here. I'll put them all together. Thanks!