This work extends the original Enhancer GAN. LA-based-enhancer-GAN aims to solve the problems faced by the 1D-Conv-based Enhancer GAN.
- Not every enhancer produced is present in the human genome
- More training time
- Unstable training
- A pretrained AE to learn the continuous representation of AEs
- Less training data
- Less training time
- More accurate learning of Enhancer regions
- 43011 experimentally defined enhancers from the human genome
- Information related to the version of the libraries can be found in the requirements.txt file.
- Install the libraries using requirements.txt
- Run train_ae.py
- Get the pretrained AE model
- Run test_ae.py
- Run train_gan.py
- Get the results
- Run BLAST on the generated sequences
- Perform biological analyses
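
The scripted steps above (installing the requirements, then running train_ae.py, test_ae.py and train_gan.py) could be chained with a small driver like the one below. This is only a sketch: it assumes the scripts are run from the repository root and take no command-line arguments, which this README does not state, and the helper names `pipeline_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script names come from the steps above. Running them without arguments
# is an assumption, since their options are not documented here.
STEPS = [
    ["-m", "pip", "install", "-r", "requirements.txt"],
    ["train_ae.py"],   # train the autoencoder (yields the pretrained AE model)
    ["test_ae.py"],    # test the pretrained autoencoder
    ["train_gan.py"],  # train the GAN
]


def pipeline_commands(python=sys.executable):
    """Return the full command line for each step, in order."""
    return [[python, *args] for args in STEPS]


def run_pipeline(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Uncomment to execute from the repository root:
# run_pipeline(pipeline_commands())
```

BLAST and the biological analyses are left out of the sketch because they depend on tools outside the listed scripts.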
- The work successfully generates enhancers of similar size; in our case, it produces enhancers of 131 nucleotides.
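
Generated sequences can be sanity-checked against the reported length of 131 nucleotides and collected in FASTA format, a common input format for sequence search tools. The sketch below is illustrative only: it assumes the generator's output is available as plain strings, and the helper names, the `generated_enhancer` header prefix, and the example sequences are all hypothetical.

```python
EXPECTED_LENGTH = 131  # enhancer length reported above
VALID_BASES = set("ACGT")


def is_valid_enhancer(seq, length=EXPECTED_LENGTH):
    """True if seq has the expected length and contains only A, C, G, T."""
    seq = seq.upper()
    return len(seq) == length and set(seq) <= VALID_BASES


def to_fasta(sequences, prefix="generated_enhancer"):
    """Format the valid sequences as FASTA text, skipping invalid ones."""
    records = []
    for i, seq in enumerate(sequences, start=1):
        if is_valid_enhancer(seq):
            records.append(f">{prefix}_{i}\n{seq.upper()}")
    return "\n".join(records) + ("\n" if records else "")


# Placeholder sequences for illustration only, not real model output.
example = ["ACGT" * 32 + "ACG", "ACGTN", "acgt" * 32 + "acg"]
print(to_fasta(example), end="")
```

Only the first and third placeholder sequences pass the check; the second is too short and contains an `N`, so it is skipped.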
The proposed mechanism can be used to generate sequences. To check their reliability, BLAST can be run on the generated sequences to measure their similarity to the human genome. Below are the MSA results obtained from BLAST for one sequence:
The paper will be uploaded to arXiv soon.