AhmedConstant/BlindCrawler

BlindCrawler - Beta v1.0

A tool for web crawling & content discovery.

Installation

git clone https://github.com/AhmedConstant/BlindCrawler.git

cd BlindCrawler

sudo pip3 install -r requirements.txt

Usage


domain

python3 BlindCrawler.py -s https://domain.com

subdomain

python3 BlindCrawler.py -s https://sub.domain.com/path

random agents

python3 BlindCrawler.py -s https://sub.domain.com/path --random-agents

with cookies

python3 BlindCrawler.py -s https://sub.domain.com/path -c "key: value; key:value"
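As a rough illustration of what the cookie string above carries, the following sketch splits a `key: value; key:value` string into a dictionary. The helper name `parse_cookie_string` is hypothetical and is not taken from BlindCrawler's source; it only assumes that pairs are separated by `;` and that each key is separated from its value by the first `:`.

```python
def parse_cookie_string(raw):
    """Split 'key: value; key:value' into a dict (hypothetical helper)."""
    cookies = {}
    for pair in raw.split(";"):
        if ":" not in pair:
            continue  # skip empty or malformed fragments
        key, value = pair.split(":", 1)  # split on the first colon only
        cookies[key.strip()] = value.strip()
    return cookies
```

A dictionary of this shape is what an HTTP client such as `requests` accepts through its `cookies=` argument; whether BlindCrawler handles cookies this way is an assumption.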

Features


  • Process
    • Crawl the subdomains to expand the discovery surface.
    • Crawl /robots.txt for more URLs to crawl.
    • Crawl /sitemap.xml for more URLs to crawl.
    • Use the Web Archive CDX API to get more URLs to crawl.
  • Output
    • A file with all crawled URLs.
    • A file with all crawled paths.
    • A file with discovered subdomains.
    • A file with discovered schemes.
    • A file with discovered emails.
    • A file with discovered comments.
  • Performance
    • Performance is being improved continuously to make crawling as fast as possible.
  • Design
    • OOP design.
    • Good documentation.
    • Script code that is easy to edit.
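
The discovery sources listed under Process can be sketched roughly as follows. This is a minimal illustration and not BlindCrawler's actual code: the function names are hypothetical, the parsing is deliberately simple (wildcard patterns in robots rules are kept as-is), and the CDX query uses the publicly documented `url`, `matchType`, `output`, `fl` and `collapse` parameters of the Wayback Machine CDX server. Fetching the files over HTTP is left out so the sketch stays self-contained.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin


def urls_from_robots(base_url, robots_text):
    """Collect URLs from Allow/Disallow/Sitemap lines of a robots file."""
    found = set()
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"(?i)(allow|disallow|sitemap)\s*:\s*(\S+)", line)
        if match:
            # Relative paths are resolved against the site root;
            # absolute sitemap URLs pass through unchanged.
            found.add(urljoin(base_url, match.group(2)))
    return found


def urls_from_sitemap(sitemap_xml):
    """Return the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(sitemap_xml)
    return {
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    }


def cdx_query_url(domain):
    """Build a CDX query for archived URLs of a domain and its subdomains."""
    params = {
        "url": domain,
        "matchType": "domain",
        "output": "json",
        "fl": "original",       # return only the original URL column
        "collapse": "urlkey",   # one row per unique URL
    }
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)
```

Each function returns plain URLs, so the results could be merged into one crawl queue and deduplicated before any request is made.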

To-Do List

  • Release beta version.
  • Output in JSON, XML and CSV formats.
  • Brute-force sensitive files and directories.
  • Extract strings with high entropy from crawled pages (UUIDs, keys, etc.).
  • Recognize static/repetitive URLs to avoid crawling them and to save time and resources.
  • Let users provide their own patterns to extract from crawled pages.
  • Create a custom wordlist for directory brute-forcing.
  • Search for potentially DOM XSS-vulnerable functions.
  • Fuzz the GET parameters.
  • .....
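
For the planned high-entropy extraction, one possible approach is to score candidate tokens by Shannon entropy. The sketch below is only an assumption about how this might be done: the function names, the token pattern, the minimum length of 20 and the threshold of 3.5 bits per character are illustrative choices, not values from the project.

```python
import math
import re
from collections import Counter


def shannon_entropy(text):
    """Bits per character of the string's character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(page_text, min_length=20, threshold=3.5):
    """Return long tokens whose entropy meets the (assumed) threshold."""
    # Candidate tokens: runs of characters common in keys and identifiers.
    tokens = re.findall(r"[A-Za-z0-9+/_-]{%d,}" % min_length, page_text)
    return [t for t in tokens if shannon_entropy(t) >= threshold]
```

Repetitive strings score close to zero, while random-looking hex or base64 strings score higher, so the threshold would need tuning against real pages.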

The Author

Ahmed Constant Twitter
