This package is compatible with Python 3.8.2. You can choose the number of threads to use on the console, and configure the user agent and crawl timeout in the config.py file.
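As a rough illustration of the settings mentioned above, a config.py could look something like the sketch below. The names `USER_AGENT`, `CRAWL_TIMEOUT` and `request_headers`, and the values shown, are assumptions made for this example and may not match the names the package actually uses.

```python
# config.py (hypothetical sketch; names and values are assumptions)

# User-Agent header sent with each request.
USER_AGENT = "Mozilla/5.0 (compatible; example-crawler/0.1)"

# Maximum time in seconds to wait for a single site before giving up.
CRAWL_TIMEOUT = 10


def request_headers():
    """Build the HTTP headers from the settings above."""
    return {"User-Agent": USER_AGENT}
```

Keeping both values in one module means they can be changed without touching the crawl code.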
- Download top-1m.csv.zip and unzip top-1m.csv to the root folder.
- Install modules.
- Run it!
Source 1m websites
http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
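If you prefer to script the download and unzip step, it could be sketched with the standard library as follows. The helper names `download_zip`, `unzip_csv` and `read_domains` are hypothetical, and the sketch assumes the URL is still reachable and that top-1m.csv holds one `rank,domain` pair per line.

```python
import csv
import urllib.request
import zipfile
from pathlib import Path

SOURCE_URL = "http://s3.amazonaws.com/alexa-static/top-1m.csv.zip"


def download_zip(url=SOURCE_URL, dest="top-1m.csv.zip"):
    """Download the archive to dest and return its path."""
    urllib.request.urlretrieve(url, dest)
    return Path(dest)


def unzip_csv(zip_path, root="."):
    """Extract top-1m.csv from the archive into the root folder."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extract("top-1m.csv", path=root)
    return Path(root) / "top-1m.csv"


def read_domains(csv_path, limit=None):
    """Return the domain column, optionally only the first `limit` rows."""
    domains = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if limit is not None and len(domains) >= limit:
                break
            # Assumed row layout: rank,domain
            if len(row) >= 2:
                domains.append(row[1])
    return domains
```

Calling `unzip_csv(download_zip())` from the root folder would leave top-1m.csv in place, and `read_domains("top-1m.csv", limit=5)` would return the first five domains as a quick check.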
Install Modules
pip install -r requirements.txt
Run
python run.py
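The thread count chosen on the console could be applied along the lines below. This is a sketch built on the standard library's `ThreadPoolExecutor`, not the package's actual code: `fetch_status` and `crawl_all` are hypothetical names, and the default user agent and timeout stand in for the values from config.py.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_status(domain, user_agent="example-crawler/0.1", timeout=10):
    """Fetch http://<domain>/ and return (domain, HTTP status or error text)."""
    request = urllib.request.Request(
        f"http://{domain}/", headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return domain, response.status
    except Exception as error:  # network errors, timeouts, HTTP errors
        return domain, str(error)


def crawl_all(domains, threads, fetch=fetch_status):
    """Run fetch over domains with `threads` workers, keeping input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch, domains))
```

The `fetch` parameter is there so the worker function can be swapped out, for example with a stub when testing without network access.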
Thank you for reading!