1m Alexa Website's Titles Crawler

This package is compatible with Python 3.8.2. You can choose the number of threads from the console, and configure the user agent and crawl timeout in the config.py file.
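As a rough illustration of how a user agent and a timeout might be applied when fetching a page title, here is a minimal sketch using only the standard library. The names `USER_AGENT`, `CRAWL_TIMEOUT`, `extract_title` and `fetch_title` are hypothetical and chosen for this example; the actual settings in config.py and the package's own fetching code may look different.

```python
import urllib.request
from html.parser import HTMLParser

# Hypothetical settings; the real names and values in config.py may differ.
USER_AGENT = "Mozilla/5.0 (compatible; TitleCrawler/0.1)"
CRAWL_TIMEOUT = 10  # seconds


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the page title with whitespace collapsed, or None if absent."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.parts).split())
    return title or None


def fetch_title(domain):
    """Fetch http://<domain> and return its title (assumes UTF-8 pages)."""
    request = urllib.request.Request(
        "http://" + domain, headers={"User-Agent": USER_AGENT}
    )
    with urllib.request.urlopen(request, timeout=CRAWL_TIMEOUT) as response:
        # Only the start of the page is needed to find the title.
        html = response.read(65536).decode("utf-8", errors="replace")
    return extract_title(html)
```

Reading only the first 64 KiB keeps each request cheap, which matters when the list has a million entries.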

Download top-1m.csv.zip and unzip top-1m.csv to the root folder.
Install modules.
Run it!

Source of the 1m websites: http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
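For orientation, the sketch below shows one way the unzipped list could be read. It assumes each line of top-1m.csv has the form `rank,domain` with no header row; `read_domains` is a hypothetical helper for this example, not a function from this package.

```python
import csv


def read_domains(lines, limit=None):
    """Return the domain column from rows shaped like 'rank,domain'."""
    domains = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        domains.append(row[1].strip())
        if limit is not None and len(domains) >= limit:
            break
    return domains


# Usage, assuming top-1m.csv sits in the root folder:
#
#     with open("top-1m.csv", newline="") as handle:
#         first_hundred = read_domains(handle, limit=100)
```

The `limit` argument is handy for a quick trial run before crawling the full list.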

Install modules: pip install -r requirements.txt

Run: python run.py
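The thread count chosen on the console presumably controls how many sites are fetched in parallel. A minimal sketch of that idea with a thread pool follows; `crawl_titles` and its `fetch` parameter are hypothetical names for this example, and run.py may organize its workers differently.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_titles(domains, fetch, threads=8):
    """Call fetch(domain) for each domain on a pool of worker threads.

    Returns a dict mapping each domain to its title, or to None when
    the fetch raised an exception (timeout, DNS failure, and so on).
    """
    domains = list(domains)

    def safe_fetch(domain):
        try:
            return fetch(domain)
        except Exception:
            return None  # one failing site should not stop the crawl

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results line up with domains.
        titles = list(pool.map(safe_fetch, domains))
    return dict(zip(domains, titles))
```

Passing the fetch function in as an argument keeps the pool logic separate from the network code, so it can be tried out with a stand-in function before pointing it at real sites.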

Thank you for reading!

tieutantan/1m-Alexa-Website-Titles-Crawler

Folders and files

Latest commit

History

Repository files navigation

1m Alexa Website's Titles Crawler

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages