KeyError occurs when I run pretrainer.py #8

Open
Dharmogata opened this issue May 22, 2020 · 4 comments
Dharmogata commented May 22, 2020

Hello, Xuhan,
I get a KeyError when I run the pretrainer.py script.

Could you please let me know why?

Thank you in advance.

[Images: keyerror, keyerror2]

@martin-sicho

I think this is the same problem as discussed in PR #3.

Dharmogata commented May 26, 2020

@martin-sicho
Thank you so much for your help. I am now getting many KeyErrors, with '\' and 65, 66, 3 ... and '\' again.
Could you please give me any advice?
[Images: keyerror, keyerror2]

martin-sicho commented May 27, 2020

No problem @gwanseum. It has been some time since I worked with the internals of DrugEx, so @XuhanLiu should be able to assist you better, but maybe we can still figure this out.

The '/' and '\' tokens have to do with stereochemistry, the configuration around a double bond to be precise. As far as I know, DrugEx was not designed with stereochemistry in mind. You can try to include these tokens in your 'voc.txt' and see what happens, or you could strip stereochemistry from the input altogether. It is quite strange that there would be two backslashes next to each other, though, so the '\\' anomaly is probably the result of something trying to escape the backslash in your input.
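As a rough illustration of the "strip stereochemistry" option, here is a minimal string-level sketch that removes the '/' and '\' double-bond markers from a SMILES string. The function name is made up for this example and is not part of DrugEx. It does not touch '@' chirality marks, and a cheminformatics toolkit would likely be a more robust way to do this.

```python
def strip_double_bond_stereo(smiles: str) -> str:
    """Remove the '/' and '\\' directional bond markers from a SMILES string.

    Hypothetical helper, not part of DrugEx. Doubled backslashes left over
    from accidental escaping are removed as well, since every backslash goes.
    """
    return smiles.replace("/", "").replace("\\", "")


# The Python literal "C(\\F)=C/F" is the SMILES C(\F)=C/F.
print(strip_double_bond_stereo("F/C=C/F"))     # FC=CF
print(strip_double_bond_stereo("C(\\F)=C/F"))  # C(F)=CF
```

After stripping, the double bond no longer carries E/Z information, so both isomers collapse to the same string.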

The errors with numbers 65 and so on indicate that there is some kind of mismatch in the number of characters in the vocabulary. It is hard to interpret this since I don't know what your input looks like, though.

So I would suggest you check your inputs for strange characters and make sure that the vocabulary is created correctly.
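One way to do that check is to tokenize every input SMILES and list the tokens that the vocabulary does not contain. The sketch below assumes a simple regex tokenizer (bracket atoms, Cl, Br, %nn ring closures, otherwise single characters), which is not necessarily how DrugEx tokenizes, and uses a small in-memory vocabulary in place of the real 'voc.txt', whose file layout is not shown in this thread.

```python
import re

# Hypothetical tokenizer, NOT taken from DrugEx: bracket atoms, Cl/Br,
# two-digit ring closures (%nn), and otherwise one character per token.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens using the assumed pattern above."""
    return TOKEN_RE.findall(smiles)


def missing_tokens(smiles_list, vocabulary):
    """Map each token absent from `vocabulary` to the inputs containing it."""
    missing = {}
    for smi in smiles_list:
        for tok in tokenize(smi):
            if tok not in vocabulary:
                missing.setdefault(tok, []).append(smi)
    return missing


# Stand-in vocabulary and inputs, for illustration only.
vocab = {"C", "F", "=", "(", ")", "1", "O", "N"}
inputs = ["CC(=O)N", "F/C=C/F", "C[C@H](N)O"]

for tok, examples in missing_tokens(inputs, vocab).items():
    print(repr(tok), "is not in the vocabulary; first seen in", examples[0])
```

Any token reported here would raise a KeyError in a plain dictionary lookup, so it either needs to be added to the vocabulary or removed from the input.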

By the way, I also have a slightly more user-friendly version of DrugEx here: https://github.com/martin-sicho/DrugEx/tree/feature/api. It can generate the voc.txt file from the input automatically so that you have all the tokens. It is not really documented yet (except for this notebook), and I would still suggest checking your input before using it.

@Dharmogata

@martin-sicho I really appreciate your kind advice, and I finally found out what went wrong!

I will definitely take a look at the more user-friendly version of DrugEx you suggested!
