Series Scraper for s.to

Description

This script is a web scraper that extracts information about movies and TV shows from a website called "s.to". The extracted data includes the title, season number, episode number, and the date when the episode was watched.

The script was mostly generated with ChatGPT, a GPT-3.5-based chatbot that can generate code from natural language. It was then modified to work with the s.to and aniworld.to websites.

Requirements

Node.js and npm
An s.to / aniworld.to account

Usage

To use this script, you need to have Node.js and npm installed on your computer.

Clone the repository and navigate to the project directory in your terminal.
```
git clone https://github.com/Kamiikaze/sto-scraper
cd sto-scraper
```
Install the required dependencies using npm.
```
npm install
```
Create a `.env` file in the root directory of the project, and add the following variables with your own values:
```
PAGE_USERNAME=your_username
PAGE_PASSWORD=your_password

HEADLESS=true
DONT_LOAD_STYLES=true
DO_SCREENSHOTS=false
```
`PAGE_USERNAME:` your username on the s.to website.

`PAGE_PASSWORD:` your password on the s.to website.

`HEADLESS:` whether to run the browser in headless mode.

`DONT_LOAD_STYLES:` whether to block unnecessary resources such as stylesheets and fonts to speed up page loading.

`DO_SCREENSHOTS:` whether to take a screenshot of the logged in page.
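
As a rough illustration of how the variables above could be turned into typed settings, here is a minimal sketch. The `ScraperConfig` interface, `parseBool`, and `loadConfig` are hypothetical names that do not come from this repository, and the fallback values simply mirror the example `.env` above; the actual script may read its configuration differently.

```typescript
// Hypothetical shape for the settings described above (not taken from the repository).
interface ScraperConfig {
  username: string;
  password: string;
  headless: boolean;
  dontLoadStyles: boolean;
  doScreenshots: boolean;
}

// Treat "true" (case-insensitive) as true and any other value as false.
// An unset or empty variable falls back to the given default (an assumption).
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

// Build the config from an environment-like object such as process.env.
function loadConfig(env: Record<string, string | undefined>): ScraperConfig {
  const username = env["PAGE_USERNAME"];
  const password = env["PAGE_PASSWORD"];
  if (!username || !password) {
    throw new Error("PAGE_USERNAME and PAGE_PASSWORD must be set in .env");
  }
  return {
    username,
    password,
    headless: parseBool(env["HEADLESS"], true),
    dontLoadStyles: parseBool(env["DONT_LOAD_STYLES"], true),
    doScreenshots: parseBool(env["DO_SCREENSHOTS"], false),
  };
}
```

In a Node.js program this would typically be called as `loadConfig(process.env)` once the `.env` file has been loaded into the environment.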

Run the script using the following command:
```
npm run start
```
The script will start running and will output the data to a file in the `./public/data` directory. If the `DO_SCREENSHOTS` variable is set to `true`, screenshots of the login page and each scraped page will also be saved in the `dist` directory.

Note: The script will stop after scraping the first X pages, where X is the value of the `firstXPages` variable in the script. If you want to scrape all pages, set this variable to `0`.
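
The page limit described in the note can be pictured with the sketch below. Only the `firstXPages` name comes from the script; the `pagesToScrape` helper and the `totalPages` parameter are hypothetical and only illustrate the "0 means all pages" rule, not how the script actually iterates.

```typescript
// Hypothetical helper: returns the 1-based page numbers that would be scraped.
// firstXPages > 0 limits the run to that many pages; 0 means "scrape everything".
function pagesToScrape(totalPages: number, firstXPages: number): number[] {
  const limit = firstXPages > 0 ? Math.min(firstXPages, totalPages) : totalPages;
  return Array.from({ length: limit }, (_, i) => i + 1);
}
```

For example, `pagesToScrape(10, 3)` yields pages 1 to 3, while `pagesToScrape(10, 0)` yields all ten pages.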

Example Output

```
{
   "totalMovies": 1,
   "movieTitles": {
      "Peripherie": {
         "seasonCount": 1,
         "totalEpisodesCount": 6,
         "seasons": {
            "1": {
               "episodeCount": 6,
               "episodes": {
                  "1": {
                     "title": "Alternative Realität",
                     "seenAt": "21.02.2023 15:08:06 Uhr vor einem Tag"
                  },
                  "2": {
                     "title": "Empathiebonus",
                     "seenAt": "21.02.2023 16:15:57 Uhr vor einem Tag"
                  },
                  "3": {
                     "title": "Haptischer Nebel",
                     "seenAt": "21.02.2023 17:16:05 Uhr vor einem Tag"
                  },
                  "4": {
                     "title": "Jackpot",
                     "seenAt": "22.02.2023 15:36:28 Uhr vor 5 Stunden"
                  },
                  "5": {
                     "title": "Was ist mit Bob?",
                     "seenAt": "22.02.2023 16:39:28 Uhr vor 4 Stunden"
                  },
                  "6": {
                     "title": "Fick dich und friss Scheiße!",
                     "seenAt": "22.02.2023 17:38:39 Uhr vor 3 Stunden"
                  }
               }
            }
         }
      }
   }
}
```
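
If you want to consume this output from TypeScript, the structure above can be described with a few interfaces. The sketch below is inferred only from the example; the interface names and the `parseSeenAt` helper are hypothetical and not part of the scraper. The helper reads the leading `DD.MM.YYYY HH:MM:SS` timestamp from a `seenAt` value and ignores the relative suffix (such as "vor einem Tag").

```typescript
// Hypothetical types inferred from the example output above.
interface Episode {
  title: string;
  seenAt: string;
}

interface Season {
  episodeCount: number;
  episodes: Record<string, Episode>;
}

interface Series {
  seasonCount: number;
  totalEpisodesCount: number;
  seasons: Record<string, Season>;
}

interface ScrapeResult {
  totalMovies: number;
  movieTitles: Record<string, Series>;
}

// Parse the absolute part of a seenAt string, e.g.
// "21.02.2023 15:08:06 Uhr vor einem Tag". Returns null if the format differs.
function parseSeenAt(seenAt: string): Date | null {
  const match = /^(\d{2})\.(\d{2})\.(\d{4}) (\d{2}):(\d{2}):(\d{2})/.exec(seenAt);
  if (!match) return null;
  const [day, month, year, hour, minute, second] = match.slice(1).map(Number);
  // Interpreted as local time, since the output does not state a time zone.
  return new Date(year, month - 1, day, hour, minute, second);
}
```

A file from `./public/data` could then be loaded with `JSON.parse` and treated as a `ScrapeResult`, assuming it follows the same shape as the example.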

Web View

I also created a web view for the scraped data. Just run the following command to start the web server:

```
npm run web-view
```

License

This project is licensed under the MIT License. See the LICENSE file for details.
