sshehryar/ds1004-big-data-final-project

DS 1004 - Big Data - Project

Team Members

Name NetID
Burak Dincer bd1308
Harshit Prabhat Singh hps257
Syed Ali Shehryar sas786

Analyzing NYPD Complaint Data (Historic)

Project Report Link

The project report is available on Google Drive.


We analyzed the number of reported criminal offenses in each of the five boroughs of New York City. To do so, we extracted three columns of interest from the main data set:

  • CMPLNT_FR_DT (exact date of occurrence of the reported event)

  • KY_CD (three-digit offense classification code)

  • BORO_NM (name of the borough)
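
The column extraction described above can be sketched in plain Python as follows. This is only an illustration and not the project's own code: the function name `extract_columns`, the sample rows, and the choice to drop rows with a missing value are all assumptions, and the sketch uses the standard library rather than PySpark so that it runs without a cluster.

```python
import csv
import io

# The three columns named in this README.
COLUMNS = ["CMPLNT_FR_DT", "KY_CD", "BORO_NM"]


def extract_columns(source, columns=COLUMNS):
    """Yield one dict per CSV row, keeping only `columns`.

    Rows with an empty value in any kept column are skipped
    (an assumed cleaning rule, not one stated in the project).
    """
    for row in csv.DictReader(source):
        kept = {name: (row.get(name) or "").strip() for name in columns}
        if all(kept.values()):
            yield kept


# Made-up sample rows; the extra columns and all values are illustrative only.
sample = io.StringIO(
    "CMPLNT_NUM,CMPLNT_FR_DT,KY_CD,OFNS_DESC,BORO_NM\n"
    "1,01/15/2015,341,PETIT LARCENY,BROOKLYN\n"
    "2,,578,HARASSMENT,QUEENS\n"
    "3,02/03/2015,341,PETIT LARCENY,\n"
    "4,02/20/2015,109,GRAND LARCENY,MANHATTAN\n"
)

cleaned = list(extract_columns(sample))
for row in cleaned:
    print(row)
```

Passing an open file handle instead of the `io.StringIO` sample would apply the same projection to a real CSV export.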


Using these three columns, we analyzed monthly and yearly variations in the most frequently reported crimes for each of the five boroughs of New York City.
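
As a rough sketch of the kind of aggregation this involves, the snippet below counts offenses per borough by month and by year and then ranks the offense codes for one borough. It is not the project's PySpark code: the helper names `count_offenses` and `top_offenses`, the sample rows, and the `MM/DD/YYYY` date format are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def count_offenses(rows, date_format="%m/%d/%Y"):
    """Count complaints per (borough, year, month, code) and per (borough, year, code).

    `date_format` is an assumption about how CMPLNT_FR_DT is written;
    rows whose date does not parse are skipped.
    """
    monthly = Counter()
    yearly = Counter()
    for row in rows:
        try:
            date = datetime.strptime(row["CMPLNT_FR_DT"], date_format)
        except ValueError:
            continue
        borough, code = row["BORO_NM"], row["KY_CD"]
        monthly[(borough, date.year, date.month, code)] += 1
        yearly[(borough, date.year, code)] += 1
    return monthly, yearly


def top_offenses(yearly, borough, year, n=3):
    """Return the `n` most frequent offense codes for one borough and year."""
    counts = Counter(
        {code: c for (b, y, code), c in yearly.items() if b == borough and y == year}
    )
    return counts.most_common(n)


# Made-up rows for illustration only.
rows = [
    {"CMPLNT_FR_DT": "01/15/2015", "KY_CD": "341", "BORO_NM": "BROOKLYN"},
    {"CMPLNT_FR_DT": "01/20/2015", "KY_CD": "341", "BORO_NM": "BROOKLYN"},
    {"CMPLNT_FR_DT": "02/03/2015", "KY_CD": "341", "BORO_NM": "BROOKLYN"},
    {"CMPLNT_FR_DT": "02/10/2015", "KY_CD": "109", "BORO_NM": "BROOKLYN"},
    {"CMPLNT_FR_DT": "01/05/2015", "KY_CD": "578", "BORO_NM": "QUEENS"},
    {"CMPLNT_FR_DT": "not-a-date", "KY_CD": "341", "BORO_NM": "BROOKLYN"},
]

monthly, yearly = count_offenses(rows)
print(top_offenses(yearly, "BROOKLYN", 2015))
```

Keeping the monthly and yearly counters separate means seasonal patterns and year-over-year trends can each be read off without re-scanning the rows.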


IPYTHON NOTEBOOKS

We used Jupyter Notebooks to organize our plots and to run a preliminary analysis on the dataset of interest (date-borough-code-cleaned.csv), which we extracted from the main NYPD Complaint Dataset. For the full exploration, we used PySpark.

  • Complaint-Analysis-Borough-Wise.ipynb lists the Python scripts we used for a practice exploratory run, along with their outputs.
  • Graphs-for-Analysis.ipynb contains the code and plots we used to visualize the trends of interest.
