Many of the links are dead (for News-Commentary9.1.tar.gz and the first matrix); I'll update this with results.
UPD: Better to use the root link, http://opus.nlpl.eu/. It's useful, but the data needs to be converted to a simpler format.
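A rough sketch of what such a conversion might look like, assuming the corpus has been downloaded as two line-aligned plain-text files (one sentence per line, line N of one file matching line N of the other) and that the "simpler format" is one tab-separated sentence pair per line. Both of those are assumptions, and the function names here are hypothetical, not part of any existing tool:

```python
from itertools import zip_longest


def merge_aligned(src_lines, tgt_lines):
    """Yield 'source<TAB>target' for each aligned sentence pair.

    Pairs where either side is empty are skipped. Assumes (not confirmed
    by the thread) that the two inputs are line-aligned.
    """
    for lineno, (src, tgt) in enumerate(zip_longest(src_lines, tgt_lines), start=1):
        if src is None or tgt is None:
            # One input ran out before the other, so the alignment is broken.
            raise ValueError(f"inputs differ in length at line {lineno}")
        # Collapse whitespace (including stray tabs) so the tab stays a clean separator.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        if src and tgt:
            yield f"{src}\t{tgt}"


def convert_files(src_path, tgt_path, out_path):
    """Write the merged pairs to out_path; all three paths are caller-supplied."""
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt, \
         open(out_path, "w", encoding="utf-8") as out:
        for pair in merge_aligned(src, tgt):
            out.write(pair + "\n")
```

`merge_aligned` works on any iterable of lines, so it can be tried on in-memory lists before pointing `convert_files` at real files.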
Aligned corpora for many European language pairs (cs-en, ru-fr, …): http://opus.nlpl.eu/News-Commentary.php
About 200M tokens in total. Typically used for translation systems, but maybe useful to include as well (for applications à la translation matrix embeddings)?