NYC---bikes-analysis

Project for the Structured Data Processing course at the Warsaw University of Technology.

Authors: Łukasz Lepianka, Marta Szuwarska.

Data source: Citi Bike.

Łukasz:

I conducted two analyses in R:

  1. First, I looked at how the COVID-19 pandemic influenced travelling by bike. I extracted ride counts and ride times for each March from 2019 to 2022 and presented them in interactive plots.

  2. The second analysis presented the number of bike uses on an interactive map of NYC divided into neighbourhoods. I created several maps, two of which showed movement during the morning rush hours. From these I could tell which neighbourhoods are more residential and which are more focused on offices and entertainment facilities.

Both analyses were transferred to a Shiny app.

While working on this project I mainly learned how to create a simple app in Shiny, how to create interactive maps in leaflet, and how to work with spatial objects.

Marta:

  1. To start, I cleaned the data, which included deleting empty and invalid rows (e.g. with age below 5 or average speed over 40 km/h). All the analyses we did were conducted on the cleaned data.

[Figure: CleaningData2019]
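A minimal sketch of this kind of cleaning step in pandas, not the project's actual code. The column names `age`, `distance_km` and `duration_s`, the helper `clean_trips` and the sample rows are hypothetical; only the two thresholds (age below 5, average speed over 40 km/h) come from the description above. Dropping rows with a non-positive duration is an extra assumption, added so the speed can be computed.

```python
import pandas as pd


def clean_trips(trips: pd.DataFrame) -> pd.DataFrame:
    """Drop empty rows and rows with an implausible age or average speed."""
    # Remove rows missing any value needed for the checks below.
    cleaned = trips.dropna(subset=["age", "distance_km", "duration_s"])
    # Assumption: a trip must have a positive duration to be valid.
    cleaned = cleaned[cleaned["duration_s"] > 0].copy()
    # Average speed in km/h from distance (km) and duration (s).
    cleaned["avg_speed_kmh"] = cleaned["distance_km"] / (cleaned["duration_s"] / 3600)
    # Thresholds from the README: age of at least 5, speed of at most 40 km/h.
    valid = (cleaned["age"] >= 5) & (cleaned["avg_speed_kmh"] <= 40)
    return cleaned[valid].reset_index(drop=True)


# Made-up rows: one valid, one too young, one with a missing age, one too fast.
trips = pd.DataFrame({
    "age": [34, 3, None, 28],
    "distance_km": [2.0, 1.0, 1.5, 30.0],
    "duration_s": [600, 300, 400, 1800],
})
cleaned = clean_trips(trips)
print(cleaned)
```

Only the first sample row survives: the others fail the age check, the missing-value check, or the speed check.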

  2. Next, I carried out an age analysis, in which I compared cyclists' age with the distance they covered and their average speed.

[Figures: AgeAnalysisDistance, AgeAnalysisSpeed]
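One way such a comparison could be set up is to bin riders by age and average distance and speed per bin. The sketch below is an illustration under assumptions, not the analysis from this project: the column names, the age bins and every value in the sample frame are made up.

```python
import pandas as pd

# Made-up sample of cleaned trips (hypothetical column names and values).
trips = pd.DataFrame({
    "age": [19, 24, 37, 41, 58, 63],
    "distance_km": [3.0, 2.0, 2.5, 1.5, 1.0, 2.0],
    "avg_speed_kmh": [14.0, 16.0, 13.0, 11.0, 9.0, 10.0],
})

# Assumed age bins: [0, 30), [30, 50), [50, 120).
bins = [0, 30, 50, 120]
labels = ["under 30", "30-49", "50+"]
trips["age_group"] = pd.cut(trips["age"], bins=bins, labels=labels, right=False)

# Mean distance and mean speed per age group.
summary = (
    trips.groupby("age_group", observed=True)[["distance_km", "avg_speed_kmh"]]
    .mean()
)
print(summary)
```

The resulting table has one row per age group and can be passed straight to a plotting library.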

  3. Finally, I built a simple predictive model using logistic regression to predict user type (users with and without an annual subscription).

[Figure: usertypepredictionaccuracy2019]
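A rough sketch of a logistic regression classifier for user type with scikit-learn. It does not reproduce the model from this project: the two features (trip duration and age), the synthetic data generator and the train/test split are all assumptions chosen only to make the example self-contained, and the pattern built into the synthetic data is not a finding about Citi Bike riders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400

# Synthetic labels: 1 = annual subscriber, 0 = no annual subscription.
is_subscriber = rng.integers(0, 2, size=n)
# Synthetic features; the gap in trip duration is invented for illustration.
duration_min = np.where(
    is_subscriber == 1, rng.normal(12, 4, n), rng.normal(30, 8, n)
)
age = rng.normal(38, 10, n)

X = np.column_stack([duration_min, age])
y = is_subscriber

# Hold out a quarter of the rows to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"accuracy: {accuracy:.2f}")
```

Scaling before the regression keeps features with different units (minutes and years) on a comparable footing for the solver.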

All my work was done in Python. This project allowed me to get the hang of predictive modelling with scikit-learn and improve my data analysis skills.