A tool for web crawling & content discovery.
```shell
git clone https://github.com/AhmedConstant/BlindCrawler.git
cd BlindCrawler
sudo pip3 install -r requirements.txt
```
```shell
python3 BlindCrawler.py -s https://domain.com
python3 BlindCrawler.py -s https://sub.domain.com/path
python3 BlindCrawler.py -s https://sub.domain.com/path --random-agents
python3 BlindCrawler.py -s https://sub.domain.com/path -c "key: value; key:value"
```
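The `-c` flag takes cookies as a `"key: value; key:value"` string. A minimal sketch of how such a string can be parsed into a cookie dict (an illustration only, not BlindCrawler's actual code):

```python
# Illustrative parser for the "key: value; key:value" cookie format
# accepted by the -c flag. Not the tool's internal implementation.

def parse_cookies(raw: str) -> dict:
    """Split a 'key: value; key:value' string into a cookie dict."""
    cookies = {}
    for pair in raw.split(";"):
        if ":" not in pair:
            continue  # skip malformed fragments
        key, _, value = pair.partition(":")
        cookies[key.strip()] = value.strip()
    return cookies

print(parse_cookies("sessionid: abc123; csrftoken:xyz"))
# → {'sessionid': 'abc123', 'csrftoken': 'xyz'}
```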
- Process
  - Crawl subdomains to expand the discovery surface.
  - Crawl /robots.txt for more URLs to crawl.
  - Crawl /sitemap.xml for more URLs to crawl.
  - Use the Wayback Machine CDX API to gather more URLs to crawl.
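The seeding steps above can be sketched roughly as follows. Function names are illustrative, not BlindCrawler's internals; the CDX query uses the public Wayback Machine endpoint:

```python
# Hedged sketch of URL seeding: harvest extra paths from robots.txt,
# URLs from sitemap.xml, and archived URLs via the Wayback CDX API.
import re

def robots_paths(robots_txt: str) -> list:
    """Pull Allow/Disallow paths out of a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        m = re.match(r"(?:Dis)?Allow:\s*(\S+)", line.strip(), re.I)
        if m:
            paths.append(m.group(1))
    return paths

def sitemap_locs(sitemap_xml: str) -> list:
    """Extract <loc> entries from a sitemap.xml body."""
    return re.findall(r"<loc>\s*(.*?)\s*</loc>", sitemap_xml)

def cdx_query(domain: str) -> str:
    """Build a Wayback Machine CDX API query listing archived URLs."""
    return ("https://web.archive.org/cdx/search/cdx"
            f"?url={domain}/*&output=text&fl=original&collapse=urlkey")
```

Each discovered path or URL would then be fed back into the crawl queue.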
- Output
- Performance
  - Ongoing work to make crawling as fast as possible.
- Design
  - OOP design
  - Good documentation
  - Easy-to-edit script code
- To-Do
  - Release beta version.
  - Output in JSON, XML and CSV formats.
  - Brute-force sensitive files and directories.
  - Extract high-entropy strings (UUIDs, keys, etc.) from crawled pages.
  - Recognize static/repetitive URLs to avoid re-crawling them, saving time and resources.
  - Let the user supply a custom pattern to extract from crawled pages.
  - Create a custom wordlist for directory brute-forcing.
  - Search for potentially DOM-XSS-vulnerable functions.
  - Fuzz GET parameters.
  - ...
Author: Ahmed Constant (Twitter)