This script is designed to download and verify HTTP/s and SOCKS5 proxies from public databases and files.

mogelpeter/proxy-scraper

Proxy Scraper and Checker

Script Description

This script is designed to download and verify HTTP/s and SOCKS5 proxies from public databases and files. It offers the following key features:

  • Configurable Threading: Adjust the number of threads based on your system's capability using a usage_level setting from 1 to 3.
  • Scraping Proxies: Automatically scrape HTTP/s and SOCKS5 proxies from various online sources.
  • Checking Proxies: Validate the functionality of the scraped proxies to ensure they are operational.
  • System Monitoring: Display the script's CPU and RAM usage in the console title for real-time performance monitoring.
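
As a rough illustration of the configurable threading feature, the sketch below maps a usage_level from 1 to 3 to a worker count. The thread_count helper and its per-core multipliers are assumptions made for this example; the script's actual numbers may differ.

```python
import os

# Hypothetical multipliers: how many worker threads to start per CPU core
# at each usage level. The real script may use different values.
THREADS_PER_CORE = {1: 1, 2: 4, 3: 8}


def thread_count(usage_level, cpu_count=None):
    """Return a worker count for the given usage level (1 to 3)."""
    if usage_level not in THREADS_PER_CORE:
        raise ValueError("usage_level must be 1, 2 or 3")
    # os.cpu_count() can return None, so fall back to a single core.
    cores = cpu_count or os.cpu_count() or 1
    return cores * THREADS_PER_CORE[usage_level]
```

Passing cpu_count explicitly makes the result predictable, for example thread_count(2, cpu_count=4) gives 16 with the multipliers above.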

Usage

  1. Installation:

    • Clone the repository or download the .zip file.
    • Navigate to the project directory.
  2. Running the Script:

    • Execute the script using:
      start.bat
      or
      python main.py
  3. Configuration:

    • The script uses a config.json file to manage settings.
    • Set usage_level (1 to 3) and list the source URLs for HTTP/s and SOCKS5 proxies under http_links and socks5_links.
  4. Educational & Research Purposes Only:

    • This script is intended for educational and research purposes only. Use it responsibly and in accordance with applicable laws.

Requirements

  • Python 3.8+
  • All necessary packages are automatically installed when the script is run.

Example config.json

{
    "usage_level": 2,
    "http_links": [
        "https://api.proxyscrape.com/?request=getproxies&proxytype=https&timeout=10000&country=all&ssl=all&anonymity=all",
        "https://api.proxyscrape.com/v2/?request=getproxies&protocol=http&timeout=10000&country=all&ssl=all&anonymity=all"
    ],
    "socks5_links": [
        "https://raw.githubusercontent.com/B4RC0DE-TM/proxy-list/main/SOCKS5.txt",
        "https://raw.githubusercontent.com/saschazesiger/Free-Proxies/master/proxies/socks5.txt"
    ]
}
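
A file like the one above could be read roughly as follows. This is a minimal sketch, not the script's own loader: the load_config name, the default values, and the clamping of usage_level to the 1 to 3 range are assumptions made for illustration.

```python
import copy
import json
from pathlib import Path

# Hypothetical defaults used when a key is missing from config.json.
DEFAULT_CONFIG = {"usage_level": 2, "http_links": [], "socks5_links": []}


def load_config(path="config.json"):
    """Read config.json, filling in defaults for anything that is missing."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    config_file = Path(path)
    if config_file.exists():
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    # Keep usage_level inside the documented 1..3 range.
    config["usage_level"] = max(1, min(3, int(config["usage_level"])))
    return config
```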

By following this documentation, you should be able to set up, run, and understand the Proxy Scraper and Checker script with ease.

Important Information!

For educational & research purposes only!

Detailed Documentation

Functions

generate_random_folder_name(length=32)

Generates a random folder name with the specified length.
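
One possible shape for this helper, assuming the name is built from ASCII letters and digits (the alphabet is not stated in this documentation):

```python
import random
import string


def generate_random_folder_name(length=32):
    """Return a random name made of ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits  # assumed alphabet
    return "".join(random.choices(alphabet, k=length))
```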

remove_old_folders(base_folder=".")

Removes old folders with 32-character names from the base folder.
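
A sketch of how such a cleanup might look. It assumes that any directory whose name is exactly 32 alphanumeric characters is a leftover from an earlier run; the real matching rule may be stricter. Returning the removed names is an addition for illustration.

```python
import shutil
from pathlib import Path


def remove_old_folders(base_folder="."):
    """Delete sub-folders of base_folder that have 32-character names."""
    removed = []
    for entry in Path(base_folder).iterdir():
        # Assumption: leftover folders are alphanumeric and exactly 32 chars.
        if entry.is_dir() and len(entry.name) == 32 and entry.name.isalnum():
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry.name)
    return removed
```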

get_time_rn()

Returns the current time formatted as HH:MM:SS.
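
For example, the HH:MM:SS format can be produced with the standard library (a sketch, assuming local time is intended):

```python
from datetime import datetime


def get_time_rn():
    """Return the current local time as HH:MM:SS."""
    return datetime.now().strftime("%H:%M:%S")
```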

get_usage_level_str(level)

Converts the usage level integer to a string representation.
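
A lookup table is one simple way to do this. The labels below ("Low", "Medium", "High", "Unknown") are hypothetical; the strings the script actually displays are not listed in this documentation.

```python
# Hypothetical labels for the three documented usage levels.
USAGE_LEVEL_NAMES = {1: "Low", 2: "Medium", 3: "High"}


def get_usage_level_str(level):
    """Return a readable label for a usage level, or "Unknown"."""
    return USAGE_LEVEL_NAMES.get(level, "Unknown")
```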

update_title(http_selected, socks5_selected, usage_level)

Updates the console title with the current CPU usage, RAM usage, and validation counts.

center_text(text, width)

Centers the text within the given width.
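
A sketch of the idea, assuming multi-line text (such as ASCII art) should be centered line by line:

```python
def center_text(text, width):
    """Center every line of text within width columns."""
    return "\n".join(line.center(width) for line in text.splitlines())
```

In the UI the width would typically come from the terminal, for instance shutil.get_terminal_size().columns.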

ui()

Clears the console and displays the main UI with ASCII art.

scrape_proxy_links(link, proxy_type)

Scrapes proxies from the given link, retrying up to 3 times if the request fails.
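
The retry behaviour can be sketched as below. This is an illustration, not the script's code: the fetch_text helper, the fetch, retries and delay parameters, and the ip:port regular expression are all assumptions, and urllib is used here only to keep the example free of third-party packages.

```python
import re
import time
import urllib.request

# Assumed format: one "ip:port" entry per proxy somewhere in the response.
PROXY_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")


def fetch_text(link, timeout=10):
    """Download a URL and return its body as text (hypothetical helper)."""
    with urllib.request.urlopen(link, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_proxy_links(link, proxy_type, fetch=fetch_text, retries=3, delay=1.0):
    """Return the proxies found at link, trying up to retries times."""
    for attempt in range(1, retries + 1):
        try:
            return PROXY_PATTERN.findall(fetch(link))
        except OSError as exc:  # URLError and socket timeouts are OSErrors
            print(f"[{proxy_type}] attempt {attempt}/{retries} failed: {exc}")
            if attempt < retries:
                time.sleep(delay)
    return []  # every attempt failed
```

Passing the download function in as fetch keeps the retry logic easy to exercise without network access.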

check_proxy_link(link)

Checks if a proxy link is accessible.

clean_proxy_links()

Cleans the list of proxy links by removing inaccessible ones.
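
The two link helpers could fit together roughly like this. In this sketch a link counts as accessible when it answers with HTTP 200, and clean_proxy_links receives the loaded config as an argument; both choices are assumptions, since the documented clean_proxy_links() takes no parameters.

```python
import http.client
import urllib.request


def check_proxy_link(link, timeout=5):
    """Return True if the source URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(link, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Network errors, malformed URLs and broken responses all count
        # as "not accessible".
        return False


def clean_proxy_links(config, is_accessible=check_proxy_link):
    """Drop source URLs that fail the accessibility check."""
    for key in ("http_links", "socks5_links"):
        config[key] = [link for link in config.get(key, []) if is_accessible(link)]
    return config
```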

scrape_proxies(proxy_list, proxy_type, file_name)

Scrapes proxies from the provided list of links and saves them to a file.

check_proxy_http(proxy)

Checks the validity of an HTTP/s proxy by making a request to httpbin.org.

check_proxy_socks5(proxy)

Checks the validity of a SOCKS5 proxy by connecting to google.com.

check_http_proxies(proxies)

Checks a list of HTTP/s proxies for validity.
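
Checking a whole list is a natural fit for a thread pool. The sketch below pairs a single-proxy check with a pooled list check. The httpbin.org/ip endpoint, the timeout, the check and max_workers parameters, and the use of urllib in place of whatever HTTP client the script installs are all assumptions.

```python
import http.client
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def check_proxy_http(proxy, timeout=5):
    """Return True if a request through proxy ("ip:port") succeeds."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    try:
        # Assumed test endpoint on httpbin.org.
        with opener.open("http://httpbin.org/ip", timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        return False


def check_http_proxies(proxies, check=check_proxy_http, max_workers=16):
    """Check proxies concurrently and return the ones that passed."""
    proxies = list(proxies)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, proxies))  # keeps input order
    return [proxy for proxy, ok in zip(proxies, results) if ok]
```

In practice max_workers would be derived from the configured usage_level.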

check_socks5_proxies(proxies)

Checks a list of SOCKS5 proxies for validity.

signal_handler(sig, frame)

Handles SIGINT signal (Ctrl+C) to exit gracefully.
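
A minimal sketch of such a handler, assuming a short message followed by exit code 0 is what "gracefully" means here:

```python
import signal
import sys


def signal_handler(sig, frame):
    """Exit cleanly when the user presses Ctrl+C."""
    print("\nInterrupted, exiting.")
    sys.exit(0)


if __name__ == "__main__":
    # Handlers can only be registered from the main thread.
    signal.signal(signal.SIGINT, signal_handler)
```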

set_process_priority()

Sets the process priority to high for better performance.

loading_animation()

Displays a loading animation while verifying proxy links.

clear_console()

Clears the console screen.

continuously_update_title()

Continuously updates the console title with current status.