Review DNA BERT #52

Open
Shuyib opened this issue Jun 9, 2023 · 2 comments
Labels: help wanted (Extra attention is needed)

Comments

Owner

Shuyib commented Jun 9, 2023

BERT models are encoders, which are well suited to natural language understanding, according to the documentation in the HuggingFace NLP course. We need to assess this for DNA BERT. You'll need to find out how to move from raw text -> tokenized text -> model -> logits -> prediction in order to find motifs, a capability that is indicated in their README.
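
The raw text -> tokens -> model -> logits -> prediction chain can be sketched in plain Python to make each hand-off concrete. This is only a minimal sketch under stated assumptions: `kmer_tokenize` follows the overlapping k-mer scheme that DNA BERT describes, while `toy_model` is a hypothetical stand-in that returns made-up logits and is not the real encoder or its API. All function names here are illustrative.

```python
import math


def kmer_tokenize(sequence, k=6):
    """Split a DNA string into overlapping k-mers with stride 1."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def toy_model(tokens):
    """Hypothetical stand-in for the encoder plus classification head.

    Returns two made-up logits, [no_motif, motif], where the motif logit
    is simply the number of tokens containing "TATA". A real model would
    compute logits from learned weights instead.
    """
    hits = sum("TATA" in token for token in tokens)
    return [1.0, float(hits)]


def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(sequence, k=6):
    """Run the full chain: raw text -> tokens -> logits -> probabilities -> label."""
    tokens = kmer_tokenize(sequence, k)
    logits = toy_model(tokens)
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return tokens, logits, probs, label


if __name__ == "__main__":
    tokens, logits, probs, label = predict("GGTATAAAGG")
    print(tokens)
    print(logits, probs, label)
```

Swapping `toy_model` for the actual DNA BERT forward pass is the part that still needs to be worked out from their README and source code.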

MEME is still a black box and very computationally intensive, but since the source code of BERT is available, we can assess it.

Link to DNA BERT

Shuyib added the help wanted (Extra attention is needed) label Jun 9, 2023
Owner Author

Shuyib commented Jun 12, 2023

🤔 Reviewing the tokenizer. My assumption is that it should have start and stop codons, and that promoter regions should be marked.
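
One way to check this assumption is to scan a sequence for start and stop codons and compare their positions with what the tokenizer emits. The snippet below is a rough sketch, not DNA BERT code: `annotate_codons` is a hypothetical helper, and it assumes the standard genetic code (start `ATG`; stops `TAA`, `TAG`, `TGA`). Whether DNA BERT's tokenizer marks these at all is exactly what is under review.

```python
START_CODONS = {"ATG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def annotate_codons(sequence):
    """Return (position, codon, kind) for every start/stop triplet found.

    Every offset is scanned, so hits from all three reading frames are
    reported; position % 3 tells you which frame a hit belongs to.
    """
    seq = sequence.upper()
    marks = []
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon in START_CODONS:
            marks.append((i, codon, "start"))
        elif codon in STOP_CODONS:
            marks.append((i, codon, "stop"))
    return marks


if __name__ == "__main__":
    for position, codon, kind in annotate_codons("ATGAAATAG"):
        print(position, codon, kind, "frame", position % 3)
```

In `ATGAAATAG` the scan reports `TGA` at position 1, which is a stop triplet only in a different frame from the `ATG` at position 0. Overlapping tokens carry no reading-frame information by themselves, so that is worth keeping in mind when judging what the tokenizer can mark.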

Owner Author

Shuyib commented Jul 19, 2023

Looking into this now? 🏃🏿
