The Web Crawler is a cybersecurity tool that crawls a website starting from a given URL, following links and sublinks and displaying every link it discovers. It is designed to help cybersecurity professionals and web developers gain insight into a site's structure and identify potential vulnerabilities.
- Web Crawling: The tool starts at a specified URL and recursively crawls through links and sublinks on web pages.
- Link Display: All discovered links are displayed in an organized, user-friendly format, making it easy to analyze and assess the website's structure.
- Customizable Crawling: You can customize the depth of the crawl and set other parameters to tailor the tool's behavior to your specific needs.
- Export Results: Save the crawled links to a file for further analysis or documentation.
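The core idea behind the features above — a recursive crawl with a configurable depth limit — can be sketched as a breadth-first traversal. This is a minimal illustration, not the project's actual implementation; the `crawl` function and the injected `fetch` callable (which returns the links found on a page) are hypothetical names used here for clarity.

```python
from collections import deque

def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl starting at start_url, up to max_depth levels deep.

    fetch(url) is expected to return the list of links discovered on that page.
    Returns every URL visited, in discovery order, without revisiting any.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    discovered = []
    while queue:
        url, depth = queue.popleft()
        discovered.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: record the page but don't expand it
        for link in fetch(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return discovered
```

A breadth-first queue (rather than plain recursion) makes the depth limit easy to enforce and avoids hitting Python's recursion limit on large sites.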
- Clone the repository to your local machine:
  git clone https://github.com/sbmmahato/webcrawler.git
- Change to the project directory:
  cd webcrawler
To start a basic web crawl, use the following command:
python3 webcrawler.py https://example.com
Replace https://example.com with the URL you want to start crawling from. The tool will begin the crawl and print the discovered links to the terminal.
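Discovering links on a page boils down to parsing its HTML for anchor tags and resolving relative hrefs against the page's URL. The sketch below shows one way to do that with the standard library; the `LinkExtractor` class and `extract_links` helper are illustrative names, not part of the actual tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # urljoin turns relative paths like "/about" into absolute URLs
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return all absolute link URLs found in the given HTML document."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

Using `urljoin` ensures that relative links stay anchored to the page being crawled, which matters when following sublinks across a site.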
The list of crawled links is then saved to web-crawler/recon/{folder named after the domain}/crawler_output.txt.
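Deriving that per-domain output path from the start URL might look like the sketch below. How the actual script builds its paths is defined by the project; `output_path` is a hypothetical helper shown only to illustrate the layout described above.

```python
import os
from urllib.parse import urlparse

def output_path(start_url, base_dir="recon"):
    """Build the per-domain path where crawl results would be written,
    e.g. recon/<domain>/crawler_output.txt."""
    domain = urlparse(start_url).netloc
    return os.path.join(base_dir, domain, "crawler_output.txt")
```

Grouping results by domain keeps output from separate targets cleanly separated during reconnaissance.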
This project is maintained by Subham Mahato.
Contributions to this project are welcome. Feel free to open issues or submit pull requests to help improve this tool.
This tool is intended for educational and cybersecurity research purposes only. Ensure you have proper authorization before using it on any website or network you do not own or have permission to scan.
For questions or inquiries, you can contact the project maintainer at sbmwork07@gmail.com.
Happy crawling! 🕷️🌐