avinashvarna/sa_wiki_text


Sanskrit Wikipedia Text Dump

This repo uses WikiExtractor to dump the text of Sanskrit Wikipedia. The dumped text, zipped and in XML format, is available here. The XML document format is specified here.

Please see this notebook for an example of how to use the data.
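
As a rough illustration of reading the dump (not a copy of the notebook), the sketch below assumes the unzipped text follows WikiExtractor's usual layout, where each article is wrapped in a `<doc id="..." url="..." title="...">` element. The names `Article` and `iter_articles` are hypothetical helpers introduced only for this example. A regular expression is used instead of an XML parser because the article text inside each `<doc>` element may not be escaped as well-formed XML.

```python
import re
from typing import Iterator, NamedTuple


class Article(NamedTuple):
    """One article from the dump (hypothetical container for this sketch)."""
    doc_id: str
    url: str
    title: str
    text: str


# Assumption: each article looks like
#   <doc id="..." url="..." title="..."> article text </doc>
DOC_RE = re.compile(r"<doc\s+([^>]*)>(.*?)</doc>", re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def iter_articles(dump_text: str) -> Iterator[Article]:
    """Yield every <doc> element found in the dump text as an Article."""
    for match in DOC_RE.finditer(dump_text):
        # Collect the attributes of the opening tag in whatever order they appear.
        attrs = dict(ATTR_RE.findall(match.group(1)))
        yield Article(
            doc_id=attrs.get("id", ""),
            url=attrs.get("url", ""),
            title=attrs.get("title", ""),
            text=match.group(2).strip(),
        )
```

To use it, read the unzipped dump file as UTF-8 text and pass the resulting string to `iter_articles`. The file name depends on how the archive was extracted, so it is left out here.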

As a hack, this repo uses Travis-CI to run the dump script periodically and update the dump.
