This project examines housing data for King County, Washington. The student notebook follows the OSEMN framework (Obtaining, Scrubbing, Exploring, Modeling, and Interpreting the data) to find which features of a house help predict its selling price.
I explored three questions in the data:
- What are the unique features of the most expensive houses?
- How do houses built after the year 2000 compare to houses built before 2000?
- Which season of the year is best for buying? Which season of the year is best for selling?
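The seasonal question can be approached by tagging each sale with a season and comparing typical prices. The following is only a minimal sketch, not the notebook's actual code: the function names, the meteorological season boundaries, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def season_of(d: date) -> str:
    """Map a date to a meteorological season (Northern Hemisphere)."""
    if d.month in (12, 1, 2):
        return "winter"
    if d.month in (3, 4, 5):
        return "spring"
    if d.month in (6, 7, 8):
        return "summer"
    return "fall"


def median_price_by_season(sales):
    """Group (sale_date, price) pairs by season and take the median price."""
    prices = defaultdict(list)
    for sale_date, price in sales:
        prices[season_of(sale_date)].append(price)
    return {season: median(values) for season, values in prices.items()}


# Made-up placeholder rows, NOT values from the King County data.
toy_sales = [
    (date(2014, 1, 15), 300_000),
    (date(2014, 2, 3), 320_000),
    (date(2014, 7, 9), 410_000),
    (date(2014, 10, 21), 350_000),
]
result = median_price_by_season(toy_sales)
print(result)
```

Under this approach, a season with a lower median price would favor buyers and a season with a higher median would favor sellers, although the median alone ignores differences in which houses are sold in each season.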
I iterated through the different house features and settled on a model whose features predict the price of the house.
After limiting the data to houses in the affordable range, the model reached an r^2 value of 0.922.
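To illustrate what filtering by price and scoring with r^2 involve, here is a small sketch using a one-feature least-squares line. It is an assumption-laden example rather than the model from the notebook: the price ceiling, the choice of square footage as the single feature, the helper names, and the sample rows are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up placeholder rows of (square feet, price), NOT the project's data.
toy_houses = [
    (1000, 250_000),
    (1500, 350_000),
    (2000, 450_000),
    (6000, 3_000_000),
]

# Hypothetical cutoff; the notebook's actual "affordable" threshold is not stated here.
PRICE_CEILING = 1_000_000
affordable = [(sqft, price) for sqft, price in toy_houses if price <= PRICE_CEILING]

xs = [sqft for sqft, _ in affordable]
ys = [price for _, price in affordable]
slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
score = r_squared(ys, preds)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, r^2={score:.3f}")
```

Dropping very expensive houses removes the points that a straight-line fit handles worst, which is one plausible reason a restricted price range can raise r^2. The trade-off is that the resulting model only applies to houses within that range.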
Files in this project:
- Column names describes the data listed in each column.
- Student is the Jupyter notebook used for this project.
- Presentation gives a brief overview of the project's findings.
- KC_housing data is the data file.
I wrote a blog post while completing this project: https://roweyerboat.github.io/cleaning_geographical_data