A simple data science project that involves web scraping, data cleaning and visualization, model building, and model explanation using character data from Marvel Fandom.

ArnelMalubay/Marvel-Predicting-Appearances

Predicting Marvel Appearances using CatBoost

This project consists of four main parts.

  1. Scraping the Marvel Wiki for data about Marvel characters using Scrapy. The spider is defined in marvelscraping/marvelscraping/spiders/marvel_spider.py, and the scraped data is stored in marvelscraping/characters.csv. To run the scraping job yourself, clone this repository and run

pip install -r requirements.txt

preferably in a virtual environment. Then navigate to the marvelscraping folder and run

scrapy crawl characters -o characters.csv

in the terminal.

  2. Visualizing the results and identifying patterns and trends
  3. Predicting the number of appearances using CatBoost
  4. Explaining the trained model using SHAP
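
After the crawl in part 1 finishes, a quick sanity check of the output file can catch an empty or partial scrape before moving on to visualization and modeling. The snippet below is a minimal sketch and is not part of this repository: `summarize_csv` is a hypothetical helper name. It uses only the Python standard library and assumes nothing about the column names in characters.csv beyond the file having a header row.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_csv(path):
    """Return (row_count, column_names, missing_per_column) for a CSV file.

    Hypothetical helper for illustration; a cell counts as missing when it
    is absent or contains only whitespace.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        missing = Counter()
        row_count = 0
        for row in reader:
            row_count += 1
            for column in columns:
                if not (row.get(column) or "").strip():
                    missing[column] += 1
    return row_count, columns, dict(missing)


if __name__ == "__main__":
    # Path taken from the README; run from the repository root after crawling.
    csv_path = Path("marvelscraping/characters.csv")
    if csv_path.exists():
        rows, columns, missing = summarize_csv(csv_path)
        print(f"{rows} rows, {len(columns)} columns")
        for column in columns:
            print(f"  {column}: {missing.get(column, 0)} empty values")
    else:
        print(f"{csv_path} not found; run the crawl first")
```

Columns with many empty values are worth a closer look during data cleaning, since they may point to fields the spider could not find on some character pages.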
