
This repository provides tools and instructions for web scraping assignments from the Graphy platform. It guides you on how to scrape data from Graphy and clean it using Python, making the data ready for use.


VigyanShaala-Tech/Webscraping-Assignments-Graphy


Web Scraping Assignment

Description

This repository contains four main folders:

  1. scripts:

    • This folder contains all the Python scripts needed to clean the scraped data for the assignment.
  2. output:

    • The "output" folder holds the result of the web scraping process.
    • It contains the Excel sheet generated by running the code, which is the final scraped and cleaned data.
  3. data_files:

    • The "data_files" folder contains the source files the code needs as input: the data we extracted by web scraping with the Postman API.
  4. Standard Operating Procedure (SOP):

    • The "SOP" folder provides detailed instructions and guidelines on how to use the code in the "scripts" folder to clean the assignment data.
    • It also explains how to perform the web scraping that obtains the data through the Postman API.
    • The document lays out the step-by-step procedure for the web scraping process and the tools it uses.
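The cleaning step described for the "scripts" folder can be pictured with a minimal sketch. This is not the repository's actual code: the field names (`learner_email`, `assignment_title`) and the assumption that the Postman export is a JSON array of records are hypothetical, and the real procedure is the one in the SOP.

```python
import json

# Hypothetical field names; the real export from Postman may use different keys.
REQUIRED_FIELDS = ("learner_email", "assignment_title")


def clean_records(raw_json):
    """Parse a JSON array of records, trim strings, and drop incomplete or duplicate rows."""
    records = json.loads(raw_json)
    cleaned = []
    seen = set()
    for record in records:
        # Trim surrounding whitespace from every string value.
        row = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        # Skip rows where a required field is absent, None, or empty.
        if any(not row.get(field) for field in REQUIRED_FIELDS):
            continue
        # Rows sharing the same required-field values count as duplicates;
        # only the first one is kept.
        identity = tuple(row[field] for field in REQUIRED_FIELDS)
        if identity in seen:
            continue
        seen.add(identity)
        cleaned.append(row)
    return cleaned
```

The sketch uses only the standard library so it runs without extra installs. Writing the cleaned rows to an Excel sheet needs one of the libraries listed in the SOP.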

Usage

To begin using this project, follow these steps:

  1. Code Execution:

    • Refer to the SOP (Standard Operating Procedure) provided in the "SOP" folder for detailed instructions on how to execute the code located in the "scripts" folder.
  2. Review Output:

    • Once the code has been executed, the resulting data is stored as an Excel sheet in the "output" folder.
    • You can review this data to ensure the web scraping process was successful.
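One way to make the review step concrete is a quick summary of the rows in the output sheet. The sketch below is only an illustration: it assumes the sheet has already been loaded into a list of dictionaries (for example with `pandas.read_excel(path).to_dict("records")`, if pandas is among the libraries the SOP lists), and the function name `summarize_rows` is hypothetical.

```python
import math
from collections import Counter


def is_missing(value):
    """Treat None, blank strings, and NaN (how pandas marks empty cells) as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    return isinstance(value, float) and math.isnan(value)


def summarize_rows(rows):
    """Return the row count, the column names, and the missing-value count per column."""
    missing = Counter()
    columns = []
    for row in rows:
        for column, value in row.items():
            # Record columns in the order they are first seen.
            if column not in columns:
                columns.append(column)
            if is_missing(value):
                missing[column] += 1
    return {
        "row_count": len(rows),
        "columns": columns,
        "missing": {column: missing[column] for column in columns},
    }
```

A row count of zero, or a column that is missing in most rows, suggests the scraping or cleaning step did not run as expected.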

Getting Started

  1. Clone the repository to your local machine:
    git clone <repository-url>
    
  2. Navigate to the project directory:
    cd <repository-directory>
    
    
  3. Follow the instructions in the SOP document to execute the code and perform web scraping.

Prerequisites

  1. Python 3.x
  2. Required Python libraries (listed in the SOP document)
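The prerequisites can be checked before running anything. The following is a small sketch under stated assumptions: the function names are hypothetical, and the list passed to `missing_modules` should be replaced with the libraries named in the SOP document, which are not listed here.

```python
import sys
from importlib.util import find_spec


def python_is_3x():
    """Return True when the running interpreter is Python 3."""
    return sys.version_info.major == 3


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Placeholder list: substitute the libraries the SOP document names.
    required = ["json"]
    print("Python 3.x:", python_is_3x())
    print("Missing libraries:", missing_modules(required))
```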
