This project involves analyzing the Netflix dataset to uncover insights about the content available on the platform. The analysis focuses on understanding the distribution of movies and TV shows, release trends, and key statistics related to duration and seasons.

VishalSinhaRoy/Netflix_dataanalysis---SQL

Netflix_data_analysis_SQL

This project explores the Netflix dataset using SQL to uncover key insights about the platform's content. The analysis answers several business-relevant questions, helping to identify trends in content distribution, release patterns, and characteristics of movies and TV shows.

📁 Dataset Overview

The dataset includes information about Netflix's movies and TV shows, such as:

type: Movie or TV Show

title: Title of the content

director: Director's name

cast: Main actors and actresses

country: Country of production

date_added: When the content was added to Netflix

release_year: The year the content was released

rating: Content rating (e.g., PG-13, TV-MA)

duration: Duration of movies or number of seasons for TV shows

listed_in: Genres or categories

description: A short description of the content
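
As a rough illustration of the columns listed above, the following sketch builds an in-memory SQLite table from Python. The table name `netflix`, the column types, the ISO date format, and the sample row are all assumptions made for illustration; they are not taken from the repository or the real dataset.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: column names follow the overview above,
# column types are assumed. "cast" is quoted because CAST is an SQL keyword.
conn.execute("""
    CREATE TABLE netflix (
        type         TEXT,
        title        TEXT,
        director     TEXT,
        "cast"       TEXT,
        country      TEXT,
        date_added   TEXT,
        release_year INTEGER,
        rating       TEXT,
        duration     TEXT,
        listed_in    TEXT,
        description  TEXT
    )
""")

# One made-up row, only to show the expected shape of the data.
conn.execute(
    "INSERT INTO netflix VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (
        "Movie", "Sample Title", "Sample Director", "Actor One, Actor Two",
        "India", "2021-09-25", 2020, "PG-13", "90 min",
        "Dramas, International Movies", "A placeholder description.",
    ),
)

# Read the column names back and list the distinct content types.
columns = [row[1] for row in conn.execute("PRAGMA table_info(netflix)")]
kinds = conn.execute("SELECT DISTINCT type FROM netflix").fetchall()
print(columns)
print(kinds)
```

Note that `duration` is stored as text because it mixes minutes for movies and season counts for TV shows, so numeric comparisons need the leading number extracted first.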

🛠️ Analysis Questions

The SQL queries in this project explore the dataset and address the following business questions:

  1. The different types of content Netflix carries
  2. The percentage of each content type on Netflix
  3. Find the most common rating for movies and TV shows
  4. List all the movies released during Covid
  5. Which year has the highest number of releases
  6. Find the top 10 countries with the most content on Netflix
  7. Identify the longest movie
  8. Identify the longest TV show
  9. Average duration of movies and TV shows
  10. Find the content that was added in the last 5 years
  11. Find all the movies/TV shows by director Rajiv Chilaka
  12. List all the TV shows with more than 5 seasons
  13. Most common genres
  14. Count the number of content items in each genre
  15. Find the movies, by title, that are listed under multiple genres
  16. Find the TV shows, by title, that are listed under multiple genres
  17. For each year, find the average number of content items released by India on Netflix
  18. List all the movies that are documentaries
  19. Find all the content without a director
  20. Find how many movies actor Amitabh Bachchan appeared in over the last 10 years
  21. Find the top actors who have appeared in the highest number of movies produced in India
  22. Find the top directors and their most frequent actors/actresses
  23. Which directors have the most content on Netflix
  24. Categorize the content based on keywords such as 'kill', 'violence', and 'sex': label matching content as '18+' or 'bad', and the rest as 'good'
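
To make a few of these questions concrete, here is a minimal sketch of how questions 2, 12, and 24 might be written, run against SQLite from Python. It is not the repository's own SQL. The table name `netflix`, the reduced column set, the sample rows, and the keyword-to-label mapping ('sex' as '18+', 'kill' or 'violence' as 'bad') are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows (type, title, duration, description);
# the real dataset is not reproduced here.
ROWS = [
    ("Movie", "Sample Movie A", "90 min", "A quiet family story."),
    ("Movie", "Sample Movie B", "120 min", "A hired gun must kill his target."),
    ("TV Show", "Sample Show C", "6 Seasons", "Friends navigate city life."),
    ("TV Show", "Sample Show D", "2 Seasons", "Violence erupts in a small town."),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE netflix (type TEXT, title TEXT, duration TEXT, description TEXT)"
)
conn.executemany("INSERT INTO netflix VALUES (?, ?, ?, ?)", ROWS)

# Question 2: percentage share of each content type.
share = conn.execute("""
    SELECT type,
           ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM netflix), 2) AS pct
    FROM netflix
    GROUP BY type
    ORDER BY type
""").fetchall()

# Question 12: TV shows with more than 5 seasons.
# The leading number is cut out of text such as '6 Seasons' before comparing.
long_shows = conn.execute("""
    SELECT title
    FROM netflix
    WHERE type = 'TV Show'
      AND CAST(substr(duration, 1, instr(duration, ' ') - 1) AS INTEGER) > 5
    ORDER BY title
""").fetchall()

# Question 24: keyword-based labels (the mapping below is an assumption).
labels = conn.execute("""
    SELECT title,
           CASE
               WHEN lower(description) LIKE '%sex%' THEN '18+'
               WHEN lower(description) LIKE '%kill%'
                 OR lower(description) LIKE '%violence%' THEN 'bad'
               ELSE 'good'
           END AS label
    FROM netflix
    ORDER BY title
""").fetchall()

print(share)
print(long_shows)
print(labels)
```

Plain substring matching with `LIKE` is the simplest option, but it also matches words that merely contain a keyword (for example, 'skill' contains 'kill'), so the labels should be treated as a coarse first pass.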

🚀 Conclusion

This project demonstrates how SQL can be used to analyze a real-world dataset and derive business insights. From content distribution to release trends and average durations, the analysis can help inform strategic decisions for platforms like Netflix.

Feel free to explore the SQL queries and adapt them for your own analysis!

🛠️ Tools Used

SQL: Structured Query Language for querying the dataset, data exploration and analysis.

📈 Future Work

Deeper analysis of content ratings and viewer preferences.

Time-series analysis to explore the growth of content over time.

Sentiment analysis of descriptions to identify content themes.
