
Predicting YouTube video genres using Linear Regression and a trending YouTube videos dataset.


KyleNThao/Youtube_BiClassification


About

Using a dataset obtained from Kaggle, this project attempts to identify the genres of trending U.S. YouTube videos by analyzing their tags, descriptions, and titles. The text is trimmed, and common or uninformative words are removed before vectorization. The removed words include common English words such as "to" and "a", as well as terms like youtube.com, instagram, and patreon, which carry no meaningful information about a video's genre.
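The filtering and vectorization step described above can be sketched in plain Python. This is only an illustration, not the notebook's actual code: the stop list, the tokenizer, and the helper names `clean_tokens`, `build_vocabulary`, and `vectorize` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop list: a few common English words plus the platform
# terms this README names as uninformative. The real list is in the notebook.
STOP_WORDS = {"to", "a", "the", "and", "of", "youtube.com", "instagram", "patreon"}


def tokenize(text):
    """Lowercase the text and split it into tokens, keeping 'youtube.com' whole."""
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", text.lower())


def clean_tokens(text):
    """Tokenize the text and drop every token that is in the stop list."""
    return [token for token in tokenize(text) if token not in STOP_WORDS]


def build_vocabulary(documents):
    """Map each remaining word to a column index, in alphabetical order."""
    words = sorted({token for doc in documents for token in clean_tokens(doc)})
    return {word: index for index, word in enumerate(words)}


def vectorize(text, vocabulary):
    """Return a bag-of-words count vector aligned with the vocabulary."""
    counts = Counter(clean_tokens(text))
    return [counts.get(word, 0) for word in vocabulary]


documents = [
    "How to bake a cake",
    "Follow me on Instagram and youtube.com for cake",
]
vocabulary = build_vocabulary(documents)
vector = vectorize("cake cake bake", vocabulary)
```

Each video's title, tags, and description could be concatenated into one string before being passed to `vectorize`, so that every video becomes a fixed-length numeric vector suitable as model input.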

This project uses binary classification and compares two models: a Fully Connected Neural Network and a Convolutional Neural Network.
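This README does not say how the binary targets were defined. One common way to frame genre prediction as binary classification is "one genre versus everything else", and the sketch below assumes that framing. The function name `binary_labels` and the genre names are hypothetical.

```python
def binary_labels(genres, positive_genre):
    """Return 1 for videos in the chosen genre and 0 for all others.

    Assumption: a one-vs-rest framing; the notebook may define its
    binary targets differently.
    """
    return [1 if genre == positive_genre else 0 for genre in genres]


# Hypothetical genre names, used only to show the shape of the output.
genres = ["Music", "Gaming", "Music", "News"]
labels = binary_labels(genres, "Music")
```

Either network would then be trained on the vectorized text with these 0/1 labels as targets, which allows the two models to be compared on the same task.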

Viewing the project

This project was written as a Python notebook. For the best results, view it in Jupyter's nbviewer.

Link to the project's nbviewer

Dataset

This project uses only the U.S. subset of the trending videos data. The full dataset, gathered by the user Mitchell J., can be downloaded here:

Trending Youtube Video Statistics

Final report

The documentation and presentations folder contains the final report, which includes the project's hypothesis and conclusion, along with the PowerPoint presentation that was used.
