This project explored how latitude affects four weather measures: maximum temperature (F), humidity (%), cloudiness (%), and wind speed (mph). Using the Python library CitiPy, a randomized data set of over 500 cities worldwide, at varying distances from the equator, was compiled. The weather for each city was retrieved through successive calls to the OpenWeatherMap API in order to build a representative sample of weather across the world.
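The collection step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names, the `nearest_city` callable, and the output keys are hypothetical, and the endpoint is assumed to be OpenWeatherMap's current-weather endpoint with `units=imperial` (which yields Fahrenheit and mph). No network request is made here.

```python
import random
from urllib.parse import urlencode

# Assumed endpoint: OpenWeatherMap's current-weather API.
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def sample_unique_cities(n_points, nearest_city, seed=None):
    """Draw random coordinates and keep each nearest city only once.

    `nearest_city` is any callable mapping (lat, lng) to a city name.
    """
    rng = random.Random(seed)
    cities, seen = [], set()
    for _ in range(n_points):
        lat = rng.uniform(-90.0, 90.0)
        lng = rng.uniform(-180.0, 180.0)
        name = nearest_city(lat, lng)
        if name not in seen:  # skip cities that were already collected
            seen.add(name)
            cities.append(name)
    return cities


def build_weather_url(city, api_key, units="imperial"):
    """Build the request URL for one city (the city name is URL-encoded)."""
    query = urlencode({"q": city, "units": units, "appid": api_key})
    return BASE_URL + "?" + query


def parse_weather(payload):
    """Extract latitude and the four studied measures from one JSON response."""
    return {
        "City": payload["name"],
        "Lat": payload["coord"]["lat"],
        "Max Temp": payload["main"]["temp_max"],
        "Humidity": payload["main"]["humidity"],
        "Cloudiness": payload["clouds"]["all"],
        "Wind Speed": payload["wind"]["speed"],
    }
```

With CitiPy installed, the lookup could be passed as `lambda lat, lng: citipy.nearest_city(lat, lng).city_name`. Because many random points fall over oceans and map to the same coastal city, more points than the target of 500 cities would need to be drawn.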
From this data set, a series of scatter plots was created using Matplotlib in order to visualize the data. Cities were then separated into Northern and Southern Hemisphere groups, and the Python libraries NumPy and SciPy were used to run linear regressions to determine whether a correlation existed between city latitude and each of the four weather characteristics.
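To make the regression step concrete, here is a small dependency-free sketch that computes the same three quantities (slope, intercept, and r-value) that `scipy.stats.linregress` reports. It is a swapped-in plain-Python version for illustration only, not the project's code. The function names and the `"Lat"` key are hypothetical, and placing latitude 0 in the Northern Hemisphere is an assumption.

```python
import math


def linear_regression(x, y):
    """Return (slope, intercept, r) of the least-squares line through the points."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


def split_by_hemisphere(rows):
    """Split rows on latitude; latitude 0 is assigned to the north (assumption)."""
    north = [row for row in rows if row["Lat"] >= 0]
    south = [row for row in rows if row["Lat"] < 0]
    return north, south


# Made-up illustrative values, not project data: temperature falls
# as latitude rises, so both the slope and r come out negative.
slope, intercept, r = linear_regression([0, 10, 20, 30], [90, 85, 80, 75])
```

The sign of r depends on the hemisphere: if temperature rises toward the equator, r is negative for northern latitudes (positive values) and positive for southern latitudes (negative values).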
A list of ideal vacation spots around the world was then generated by analyzing the data from Part 1. Initially, a global heatmap of the cities' humidity percentages was created using Jupyter-Gmaps. Then ideal weather parameters for temperature, humidity, cloudiness, and wind speed were defined and used to narrow down the data set. The parameters chosen were temperature between 65 and 78 degrees F, humidity below 50%, cloudiness below 30%, and wind speed below 15 mph. The Google Places API was used to find hotel locations within 5000 meters of each ideal vacation spot. A second heatmap with a hotel layer was created to assist in finding the perfect vacation destination.
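The narrowing step can be expressed as a single predicate over one city record. The sketch below uses plain dictionaries so it runs without dependencies. The key names are hypothetical, the rows are made-up illustrative values, and treating the temperature bounds as inclusive is an assumption, since the text does not say.

```python
def is_ideal(row):
    """True when a city record meets all four ideal-weather criteria."""
    return (
        65 <= row["Max Temp"] <= 78      # degrees F, bounds assumed inclusive
        and row["Humidity"] < 50         # percent
        and row["Cloudiness"] < 30       # percent
        and row["Wind Speed"] < 15       # mph
    )


# Made-up illustrative rows, not project data.
rows = [
    {"City": "A", "Max Temp": 72.0, "Humidity": 40, "Cloudiness": 10, "Wind Speed": 5.0},
    {"City": "B", "Max Temp": 85.0, "Humidity": 40, "Cloudiness": 10, "Wind Speed": 5.0},
    {"City": "C", "Max Temp": 70.0, "Humidity": 65, "Cloudiness": 10, "Wind Speed": 5.0},
]
ideal_cities = [row["City"] for row in rows if is_ideal(row)]
```

On a Pandas DataFrame, the same conditions would typically be combined as boolean masks joined with `&`.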
- Created a list for holding lat_lngs and cities.
- Created a set of random lat and lng combinations.
- Identified nearest city for each lat, lng combination; if the city was unique, added it to a cities list.
- Printed the city count to confirm sufficient count for study.
- Created starting URL for Weather Map API call.
- Created an empty list for city data.
- Created counters for record and set counts.
- Ran a loop through all the cities in the list.
- Grouped cities in sets of 50 for logging purposes.
- Created endpoint URL with each city.
- Logged the URL, record, and set numbers.
- Ran an API request for each of the cities.
- Parsed the JSON and retrieved data.
- Parsed out the max temp, humidity, cloudiness, and wind speed.
- Appended the city information into city_data list.
- Converted array of JSONs into Pandas DataFrame.
- Viewed record count to ensure more than 500 cities.
- Displayed dataframe and checked statistics.
- Created a dataframe with the indices of cities that have humidity over 100%.
- Made a new dataframe equal to the city data to drop all humidity outliers by index.
- Extracted relevant fields from the dataframe.
- Exported the City_Data into a CSV file.
- Built scatter plot for latitude vs. temperature.
- Built scatter plot for latitude vs. humidity.
- Built scatter plot for latitude vs. cloudiness.
- Built scatter plot for latitude vs. wind speed.
- Created a function to create Linear Regression plots.
- Created Northern and Southern Hemisphere dataframes.
- Plotted linear regression for max temp versus latitude for northern hemisphere.
- Plotted linear regression for max temp versus latitude for southern hemisphere.
- Plotted linear regression for humidity versus latitude for northern hemisphere.
- Plotted linear regression for humidity versus latitude for southern hemisphere.
- Plotted linear regression for cloudiness versus latitude for northern hemisphere.
- Plotted linear regression for cloudiness versus latitude for southern hemisphere.
- Plotted linear regression for wind speed versus latitude for northern hemisphere.
- Plotted linear regression for wind speed versus latitude for southern hemisphere.
- Stored CSV created in part one into a dataframe.
- Configured gmaps.
- Created heatmap of humidity.
- Narrowed down dataframe to find ideal weather conditions (max temp between 65 and 78 degrees F, humidity below 50%, cloudiness below 30%, and wind speed below 15 mph).
- Created dataframe to store hotel names along with city, country and coordinates.
- Set parameters to search for a hotel.
- Created list of lat and lng from cities data.
- Used the search term: "Hotel" and the lat/lng on Google Maps.
- Made requests and printed URLs.
- Converted to JSON.
- Got first hotel from the results and stored the names.
- Used a template to add the hotel marks to the heatmap.
- Added marker layer on top of heat map.
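
The hotel-search steps at the end of the list above can be sketched as follows. This is an illustration under assumptions, not the notebook's code: the function names are hypothetical, the endpoint is assumed to be the Google Places Nearby Search JSON endpoint, and passing the search term through the `keyword` parameter is a guess (the source only says the term "Hotel" was used). No request is sent here.

```python
# Assumed endpoint: Google Places Nearby Search (JSON output).
PLACES_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"


def hotel_search_params(lat, lng, api_key, radius=5000):
    """Build query parameters for a hotel search around one city."""
    return {
        "location": f"{lat},{lng}",  # "lat,lng" string
        "radius": radius,            # search radius in meters
        "keyword": "Hotel",          # search term from the steps above
        "key": api_key,
    }


def first_hotel_name(payload):
    """Return the name of the first result, or None when nothing was found."""
    results = payload.get("results", [])
    if not results:
        return None
    return results[0].get("name")
```

With Requests, the call would look like `requests.get(PLACES_URL, params=hotel_search_params(lat, lng, key)).json()`. Returning `None` for an empty result list lets cities without a nearby hotel be skipped or left blank rather than raising an error.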
As observed in the scatter plots below, maximum temperature increases as cities are located closer to the equator. Interestingly, the cities with the highest maximum temperatures lie between 20 and 40 degrees latitude, which could be attributed to the Earth's axial tilt relative to the sun during the month of July. This strong correlation was also reflected in relatively high r-values in both hemispheres.
Humidity values cluster in the 60% and higher range for cities located closer to the equator. However, the low r-value suggests only a weak correlation between humidity and latitude.
Neither cloudiness nor wind speed displayed a relationship to latitude. This was reflected in the extremely low r-values for both variables in each hemisphere.
Max temperature increases as cities are located closer to the equator, and nearly all cities close to the equator have a high percentage of humidity (> 60%). However, there appears to be no relationship between latitude and cloudiness or wind speed.
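Both the Northern and Southern Hemispheres have r-values of relatively high magnitude, which indicates a strong correlation between max temperature and latitude: negative in the Northern Hemisphere and positive in the Southern Hemisphere, where latitudes are negative. In both cases, max temperature increases as latitude approaches zero.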
Both hemispheres have low r-values, which indicate no apparent relationship between humidity and latitude.
Again, r-values for both hemispheres are extremely low, which indicates no relationship between cloudiness and latitude.
The extremely low r-values for both hemispheres indicate no relationship between wind speed and latitude.
Data
References
- Jupyter Notebook
- Python - Pandas, Matplotlib, NumPy, SciPy, CitiPy, Gmaps, OS, Requests
Kiran Rangaraj - LinkedIn: @Kiran Rangaraj