In a phonetics use case I need to distinguish the sound of བ (ba or wa), but this currently seems impossible with botok:

རབ་གསལ་བས is tokenized as རབ་གསལ་ - བས (in that case བས is pronounced wé)
བྱང་ཆུབ་བར་དུ is tokenized as བྱང་ཆུབ་ - བར་ - དུ (in that case བར is pronounced bar)

Is there any way I can discriminate between the two with botok (or any other tool)?
བར་དུ་ should be added to the vocab; I would argue that it's a frozen expression by now. We'll add instructions on how to do this to the botok docs.
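
To illustrate why a vocab entry would change the result, here is a minimal sketch of greedy longest-match tokenization over syllables. It is not botok's implementation or API: `tokenize`, `BASE_VOCAB` and `EXTENDED_VOCAB` are hypothetical names, and the two tiny vocabularies are made up for this example only.

```python
TSHEG = "\u0f0b"  # Tibetan syllable delimiter (་)


def syllables(text):
    """Split a string on the tsheg and drop empty pieces."""
    return [s for s in text.split(TSHEG) if s]


def to_vocab(words):
    """Store each vocab entry as a tuple of syllables for fast lookup."""
    return {tuple(syllables(w)) for w in words}


def tokenize(text, vocab):
    """Greedy longest match: at each position take the longest vocab entry,
    falling back to a single syllable when nothing matches."""
    syls = syllables(text)
    max_len = max((len(entry) for entry in vocab), default=1)
    tokens, i = [], 0
    while i < len(syls):
        for n in range(min(max_len, len(syls) - i), 0, -1):
            candidate = tuple(syls[i:i + n])
            if n == 1 or candidate in vocab:
                tokens.append(TSHEG.join(candidate))
                i += n
                break
    return tokens


# Hypothetical vocabularies, for illustration only.
BASE_VOCAB = to_vocab(["བྱང་ཆུབ"])
EXTENDED_VOCAB = to_vocab(["བྱང་ཆུབ", "བར་དུ"])

print(tokenize("བྱང་ཆུབ་བར་དུ", BASE_VOCAB))      # བར and དུ come out as separate tokens
print(tokenize("བྱང་ཆུབ་བར་དུ", EXTENDED_VOCAB))  # བར་དུ stays together as one token
```

With བར་དུ kept as a single token, a pronunciation rule can treat it as a lexical item rather than as a bare བར particle.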
Well, what I'll do with another POS tagger is look at the n.rel tag described at https://web.archive.org/web/20170824153724/http://larkpie.net/tibetancorpus/tags
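
A rough sketch of that idea, assuming the tagger yields (token, tag) pairs: `ba_reading` and the `AMBIGUOUS` set are hypothetical, the "?" tags are placeholders rather than real tagger output, and the rule itself (n.rel reads as "b", anything else as "w") is an assumption drawn only from the two examples in this issue.

```python
TSHEG = "\u0f0b"  # Tibetan syllable delimiter (་)

# Forms whose initial བ is ambiguous between "b" and "w" (assumed, incomplete list).
AMBIGUOUS = {"བ", "བས", "བར"}


def ba_reading(token, tag):
    """Return 'b' or 'w' for an ambiguous བ-token, or None if the token is not ambiguous.

    Assumption: a token tagged n.rel (relational noun) keeps the 'b' sound,
    while any other tag means the token is a particle read with 'w'.
    """
    form = token.rstrip(TSHEG)
    if form not in AMBIGUOUS:
        return None
    return "b" if tag == "n.rel" else "w"


# Hand-written tagged input; "?" stands in for tags not discussed in this issue.
tagged = [("བྱང་ཆུབ་", "?"), ("བར་", "n.rel"), ("དུ", "?")]
for token, tag in tagged:
    print(token, tag, ba_reading(token, tag))
```

Real tagger output would need checking against this rule before relying on it, since only བར and བས are attested here.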