Roadmap / todos #1

Open
vadimnazarov opened this issue Mar 12, 2015 · 0 comments
vadimnazarov commented Mar 12, 2015

2.0 version

MAAG

  • Implement shifts in event probabilities.
    • In Clonotypes.
    • In MAAGBuilder.
  • Replace the generation probability computation with a forward-algorithm-like procedure.
  • Add MarkovChain with errors.
    • Tests
  • Implement errors in MAAGBuilder.
    • V.
    • D.
    • J.
  • Implement errors in MAAGForward-backward.
    • VJ (waiting for 100K test)
    • VDJ
  • Implement errors in alignments.
  • Fix replacement of MAAG event probe in MAAGBuilder.
  • Add move assignment operator to MAAG.
  • Decide which value to initialise the error probability with.

PAM

  • Implement a PAM + inference algorithm with errors in alignments.
    • VJ
    • VDJ
  • Fix segfault
    "There are four common mistakes that lead to segmentation faults: dereferencing NULL, dereferencing an uninitialized pointer, dereferencing a pointer that has been freed (or deleted, in C++) or that has gone out of scope (in the case of arrays declared in functions), and writing off the end of an array.
    A fifth way of causing a segfault is a recursive function that uses all of the stack space. On some systems, this will cause a "stack overflow" report, and on others, it will merely appear as another type of segmentation fault."

IO

  • Fix Python converter (V / D / J alignments column instead of starts/ends columns)
  • Fix writer
  • Refactor parser.
  • Refactor parser with the new aligner with virtual functions instead of templates.
  • Implement a separate class for aligning all genes to clonotype sequences. The user can optionally pass it as an object to Parser.
    • Implement SW local aligner for Variable genes.
    • Implement SW local aligner for Joining genes.
  • Add translation subroutine.
  • Add aligner parameters for alignment - thresholds for length / score, etc.
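
As a rough illustration of the SW (Smith-Waterman) local aligner items above, here is a minimal score-only sketch with a linear gap penalty. `SWParams`, its default values, and `smithWatermanScore` are hypothetical names and placeholders, not the project's aligner API.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scoring parameters; the default values are placeholders.
struct SWParams {
    int match = 2;
    int mismatch = -1;
    int gap = -2;
};

// Best local alignment score between a and b (Smith-Waterman with a
// linear gap penalty). Only two rows of the DP matrix are kept, so
// memory use is O(|b|); no traceback is performed.
int smithWatermanScore(const std::string& a, const std::string& b,
                       const SWParams& p = SWParams{}) {
    std::vector<int> prev(b.size() + 1, 0);
    std::vector<int> cur(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? p.match : p.mismatch);
            const int up = prev[j] + p.gap;
            const int left = cur[j - 1] + p.gap;
            cur[j] = std::max({0, diag, up, left});  // local alignment: never below zero
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);
    }
    return best;
}
```

A score threshold of the kind mentioned in the last item could then be applied to the returned value before a gene segment is accepted as aligned.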

2.1 version

MAAG

  • Add MarkovChain to MAAG (for amino acids).
    • VJ
    • VDJ
  • Implement MAAGaa
    • VJ
    • VDJ
  • Implement amino acid sequence MAAG builder.
    • Tests.

IO

  • Implement amino acid aligner.
    • VJ
      • Tests.
    • VDJ
      • Tests.

2.2 version

PAM

  • Data diversity measure.
  • Implement and test new secret EM algorithm.
    • Save #iter for each parameter, not globally.

2.3 version

Optimisations

2.4 version

Docs

  • Add support for high precision numbers or decide to work only with long doubles.
  • Write API documentation using Doxygen.
  • Write general / usage documentation using MkDocs.
  • Publish all documentation on GitHub pages.

2.5 version

IO

  • MAAG serialization.
    • Binary representation.
      • Tests.
    • Reading.
      • Tests.
    • Writing.
      • Tests.
  • ??? Memory mapped MAAG repertoire in case of very large files (align -> save to disk -> read from the memory mapped file).

Far Future

MAAG

  • Add checks for zero or error gene segments and other events in MAAG builder.

AAPAG

  • Implement AAPAG (Amino Acid Pattern Assembly Graph).
  • Implement fast generation of neighbour amino acid sequences.

Optimisations

  • Play with SIMD https://github.com/p12tic/libsimdpp
    • markov chains, probs in forward-backward
    • computing of full probabilities
  • Rewrite everything using templates, so that the code has no unnecessary "ifs". Provide basic scripts (compute, inference and generate) for each possible recombination.
  • Do return value optimisation everywhere when possible.
  • Check if lazy evaluation can be added anywhere.
  • Decide whether to refactor MarkovChain in MAAGBuilder.
  • Branching (if - statements) optimisations.
    • Try to always build the event indices MMC, but do not include it in the resulting MAAG.
    • Hoist the if (full_build) checks out of the loops in MAAGBuilder, so that each branch gets its own single loop.
    • Use ?: instead of if-else in MAAGBuilder deletions and insertions.
  • Compare the speed of ClonotypeBuilder procedures that return void with those that return ClonotypeBuilder&.
  • Use fixed-size matrices in some cases like VJ deletions because all VJ gene segments sequences are pretty similar in size. (???)
  • Rewrite ModelParameterVector with plain arrays.
  • Optimise sequence class (currently std::string, need speed and memory improvements using bit vectors).
  • Add a compilation option that removes all verbose output for speed.
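
The template and branching items above can be combined in one pattern: turn the runtime flag into a template parameter so the check happens once, outside the loop. The sketch below is only an illustration. `BuildResult`, `buildImpl` and `build` are hypothetical names, and the loop body is a stand-in for the real MAAGBuilder work.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: probabilities plus optional event indices.
struct BuildResult {
    std::vector<double> probs;
    std::vector<std::size_t> events;
};

// FullBuild is a compile-time constant, so the compiler can drop the
// branch inside the loop in each instantiation.
template <bool FullBuild>
BuildResult buildImpl(const std::vector<double>& input) {
    BuildResult result;
    result.probs.reserve(input.size());
    if (FullBuild) {
        result.events.reserve(input.size());
    }
    for (std::size_t i = 0; i < input.size(); ++i) {
        result.probs.push_back(input[i]);
        if (FullBuild) {
            result.events.push_back(i);  // only present in the full build
        }
    }
    return result;
}

// The runtime flag is tested exactly once, outside the loop.
BuildResult build(const std::vector<double>& input, bool full_build) {
    return full_build ? buildImpl<true>(input) : buildImpl<false>(input);
}
```

Whether this is faster than a well-predicted runtime branch would need to be measured on real repertoires.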

Refactoring

  • Replace all raw pointers with std::unique_ptr.
  • Replace my own tests with Google Test.
  • Shared ptr for VDJRecombinationGenes.
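
A minimal sketch of what the raw-pointer replacement might look like for an owning class follows. `Model` and `ModelHolder` are hypothetical stand-ins, not classes from the codebase.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object previously held by a raw pointer.
struct Model {
    std::vector<double> params;
};

// With std::unique_ptr the destructor, move constructor and move
// assignment operator are generated correctly, and accidental copies
// are rejected at compile time.
class ModelHolder {
public:
    explicit ModelHolder(std::size_t n)
        : model_(new Model{std::vector<double>(n, 0.0)}) {}

    bool valid() const { return static_cast<bool>(model_); }
    std::size_t size() const { return model_ ? model_->params.size() : 0; }

private:
    std::unique_ptr<Model> model_;  // sole owner; no manual delete needed
};
```

After `ModelHolder b = std::move(a);` the object `b` owns the model and `a.valid()` returns false, which is guaranteed for a moved-from `std::unique_ptr`.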

Other
