Symbol table #4

Open
hhaoyan opened this issue Jul 6, 2018 · 0 comments


hhaoyan commented Jul 6, 2018

Hi @OlgaGKononova, can you explain how you built the symbol table?
I assumed the entries are HTML special symbols (character entities), but some of them do not render in Chrome. Please see this fiddle: https://jsfiddle.net/vfj3hw0q/1/ So I'm not sure that replacing symbols like &Agr; -> Α really works for our project. The table may need updating.
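
One way to audit the table without a browser is to compare each entity name against the HTML5 named character references that ship with Python's standard library. This is only a sketch: the function name `split_by_html5_support` and the sample names are illustrative assumptions, not code from this repo, and the real table would have to be loaded in place of the sample list.

```python
from html.entities import html5  # stdlib map of HTML5 named character references


def split_by_html5_support(entity_names):
    """Split entity names (without '&' and ';') into HTML5-known and unknown.

    Hypothetical helper for illustration; not part of this repo.
    """
    supported = {}
    unsupported = []
    for name in entity_names:
        key = name + ';'  # keys in the html5 dict include the trailing ';'
        if key in html5:
            supported[name] = html5[key]
        else:
            unsupported.append(name)
    return supported, unsupported


# Sample names only; replace with the names from the actual symbol table.
supported, unsupported = split_by_html5_support(['Alpha', 'Agr', 'deg'])
print('known to HTML5:', supported)
print('not in HTML5:', unsupported)
```

Names that land in the second list are candidates for entries that a browser will not decode. Some of them may come from older ISO entity sets rather than from HTML, which would explain why they fail in Chrome.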

In addition, I found this package, https://github.com/chartbeat-labs/textacy, to be extremely useful. It does essentially the same thing as this repo, but with a nicer interface and a more standard approach. Here is an example:

import textacy

# The input contains non-breaking spaces (\xa0) before "°" and before "h".
text = 'ℏ. Then the mixtures were placed in alumina crucibles and sintered at 1200\xa0° C for 4\xa0h in air. '
print(repr(text))

# Repair the Unicode; as the output shows, whitespace is normalized as well.
text = textacy.preprocess_text(text, fix_unicode=True)
print(repr(text))

Output:

'ℏ. Then the mixtures were placed in alumina crucibles and sintered at 1200\xa0° C for 4\xa0h in air. '
'ℏ. Then the mixtures were placed in alumina crucibles and sintered at 1200 ° C for 4 h in air.'

So I suggest looking into this package.
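
For comparison, the whitespace part of the cleanup shown in the output above can be approximated with the standard library alone. This is a rough sketch, not textacy's implementation: `normalize_spaces` is a hypothetical helper, and it only handles non-breaking spaces and runs of spaces or tabs.

```python
import re


def normalize_spaces(text):
    """Approximate the whitespace cleanup seen above (hypothetical helper)."""
    text = text.replace('\xa0', ' ')     # non-breaking space -> regular space
    text = re.sub(r'[ \t]+', ' ', text)  # collapse runs of spaces and tabs
    return text.strip()                  # drop leading/trailing whitespace


raw = 'sintered at 1200\xa0° C for 4\xa0h in air. '
print(repr(normalize_spaces(raw)))
```

The package is still the better option if we want consistent handling of the other Unicode issues too, since this sketch does nothing beyond spacing.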
