An application that predicts the weather
-
Download the dataset we are using for the project: https://www.ncei.noaa.gov/data/global-summary-of-the-month/archive/
-
Unzip the data into a folder (gsom-latest).
-
Copy the path of the directory in which the dataset is stored.
-
In your terminal, enter "spark-submit preprocess.py " followed by the dataset directory path to run the preprocessing script, which converts the raw data into usable data.
-
Once preprocess.py has finished running, a folder called "updatedWeather2.csv" will have been created; it contains the data our model will use.
-
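Because "updatedWeather2.csv" is a folder rather than a single file, it can help to confirm that it actually contains data before moving on. The sketch below is only an illustration, not part of the project: it assumes the folder follows the usual Spark output layout (files named part-* ending in .csv), and the helper name list_part_files is hypothetical.

```python
from pathlib import Path


def list_part_files(output_dir):
    """Return the sorted names of CSV part files in a Spark-style output folder.

    Assumption: the folder holds files named part-* that end in .csv,
    which is the usual layout when Spark writes a dataframe as CSV.
    """
    folder = Path(output_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"{output_dir} is not a directory")
    return sorted(p.name for p in folder.glob("part-*") if p.suffix == ".csv")


if __name__ == "__main__":
    out = "updatedWeather2.csv"
    # Only inspect the folder if preprocessing has already produced it.
    if Path(out).is_dir():
        print(list_part_files(out))
```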
In your terminal, enter "spark-submit weather_predict.py updatedWeather2.csv" to run the model.
-
Two dataframes will be displayed: one contains data from a weather station in Wheatland, the other from a station in Corvallis. The last two columns hold the predictions made by the model: snow precipitation as P_SNOW and monthly average temperature as P_TAVG. Comparing them with the SNOW and TAVG columns (the actual values) shows how far the predictions deviate.
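
The gap between actual and predicted values can also be summarized as a single number. The sketch below is a hypothetical illustration, not something weather_predict.py does: it computes the mean absolute error between two columns, assuming the displayed rows have been copied into a list of dictionaries. The sample values are made up and are not real model output.

```python
def mean_absolute_error(rows, actual_col, predicted_col):
    """Average absolute difference between an actual and a predicted column.

    Rows where either value is missing (None) are skipped.
    """
    diffs = [
        abs(float(r[actual_col]) - float(r[predicted_col]))
        for r in rows
        if r.get(actual_col) is not None and r.get(predicted_col) is not None
    ]
    if not diffs:
        raise ValueError("no rows with both actual and predicted values")
    return sum(diffs) / len(diffs)


# Made-up sample rows for illustration only.
sample_rows = [
    {"TAVG": 10.0, "P_TAVG": 9.0},
    {"TAVG": 12.0, "P_TAVG": 14.0},
    {"TAVG": 8.0, "P_TAVG": 8.0},
]

print(mean_absolute_error(sample_rows, "TAVG", "P_TAVG"))  # prints 1.0
```

The same function works for the snow columns by passing "SNOW" and "P_SNOW" instead.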