collect gold-standard corpora #26

Open
reynoldsnlp opened this issue Sep 23, 2019 · 2 comments
Labels: enhancement (New feature or request)

Comments

@reynoldsnlp
Owner

reynoldsnlp commented Sep 23, 2019

We need a large collection of gold-standard disambiguated Russian texts for FST/CG testing. One way or another, this will require converting tags and format to udar/CG3. Some possibilities include:

reynoldsnlp self-assigned this Sep 23, 2019
reynoldsnlp added the enhancement (New feature or request) and FST (Bugs and improvements in the FST in giellatekno's repository) labels Sep 23, 2019
@reynoldsnlp
Owner Author

reynoldsnlp commented Sep 23, 2019

It looks like SynTagRus has now been published in a Universal Dependencies format: https://github.com/UniversalDependencies/UD_Russian-SynTagRus/tree/master
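
As a rough starting point for the format conversion, here is a minimal sketch that reads CoNLL-U text (the format used by the UD treebanks) and emits a CG3-style cohort stream. It is only an illustration under stated assumptions: the function name `conllu_to_cg3` and the `SAMPLE` sentence are hypothetical, and the UD tags are passed through unchanged, since the actual mapping from UD tags to udar's tagset would still have to be worked out.

```python
def conllu_to_cg3(conllu_text):
    """Convert CoNLL-U text into a CG3-style cohort stream.

    Assumption: UD tags (UPOS and FEATS) are copied verbatim; no mapping
    to udar's own tagset is attempted here.
    """
    out = []
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:  # CoNLL-U token lines have exactly 10 columns
            continue
        tok_id, form, lemma, upos, _xpos, feats = cols[:6]
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in tok_id or "." in tok_id:
            continue
        tags = [upos]
        if feats != "_":
            tags.extend(feats.split("|"))
        out.append(f'"<{form}>"')                        # cohort: the wordform
        out.append(f'\t"{lemma}" ' + " ".join(tags))     # its single gold reading
    return "\n".join(out)


# Hypothetical two-token sample in CoNLL-U format, for illustration only.
SAMPLE = "\n".join([
    "# text = Мама.",
    "\t".join(["1", "Мама", "мама", "NOUN", "_",
               "Animacy=Anim|Case=Nom|Gender=Fem|Number=Sing",
               "0", "root", "_", "_"]),
    "\t".join(["2", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"]),
])
```

Calling `conllu_to_cg3(SAMPLE)` yields one cohort per token, each with a single reading. Because a gold-standard corpus is already disambiguated, one reading per cohort is what a comparison against FST/CG output would need.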

@reynoldsnlp
Owner Author

Other UD treebanks for Russian also exist: https://universaldependencies.org/#russian-treebanks

reynoldsnlp removed the FST (Bugs and improvements in the FST in giellatekno's repository) label Oct 19, 2020