EDA Project: King County House Sales 2014-2015

This repo contains the results of the EDA project from the neuefische Data Science, Machine Learning & AI Bootcamp. It consists of two notebooks:

The EDA notebook itself, containing a classical EDA and a client-focused EDA:
- https://github.com/jottemka/eda_neuefische/blob/82a769aa8193682e45b5cc63c326e136f3ea55d0/eda_king_county.ipynb
A presentation notebook that was used to generate the corresponding Jupyter slides for the stakeholder meeting:
- https://github.com/jottemka/eda_neuefische/blob/ef8f00f8dba6c665a72b53648f7c0915470c23d3/presentation.ipynb

Data Insights

Three data insights might run contrary to common views:

More rooms do mean a higher price, but the relationship is not as strong as one might expect.
Older houses are not generally cheaper: the correlation between age and price is almost zero.
Surprisingly, just like agricultural products, house prices exhibit seasonal effects.

Client Recommendations

When to buy?

We recommend buying in February and avoiding April.

We also recommend buying in the middle of the month rather than at the beginning.
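
Timing recommendations of this kind can be derived by grouping sale prices by calendar month or by day of the month. The following is a minimal sketch under stated assumptions: the helper `median_price_by` is hypothetical, the median is just one possible summary statistic, and the entries in `toy_sales` are invented for illustration rather than taken from the dataset.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_price_by(sales, key):
    """Group (sale_date, price) pairs by key(sale_date) and return group medians."""
    groups = defaultdict(list)
    for sale_date, price in sales:
        groups[key(sale_date)].append(price)
    return {group: median(prices) for group, prices in sorted(groups.items())}


# Invented toy sales for illustration only; NOT values from the dataset.
toy_sales = [
    (date(2014, 2, 3), 400_000),
    (date(2014, 2, 17), 420_000),
    (date(2014, 4, 2), 470_000),
    (date(2014, 4, 15), 450_000),
    (date(2015, 2, 16), 410_000),
]

by_month = median_price_by(toy_sales, key=lambda d: d.month)
by_day = median_price_by(toy_sales, key=lambda d: d.day)
print(by_month)
print(by_day)
```

Passing a different `key` function allows the same helper to compare months, days of the month, or weekdays.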

Where to buy?

Based on our client's needs, we recommend low-fluctuation neighborhoods. The plot below shows all zipcode areas, ranked according to their fluctuation. Our client should pick from the neighborhoods on the left-hand side.
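
This README does not define how fluctuation is measured. As an illustration only, the sketch below assumes the coefficient of variation of sale prices (standard deviation divided by mean) per zipcode and ranks areas from lowest to highest. The function `rank_by_fluctuation`, the labels `zip_A` and `zip_B`, and all prices are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def rank_by_fluctuation(sales):
    """Rank zipcodes by coefficient of variation of price, lowest first.

    ASSUMPTION: "fluctuation" is modeled as stdev / mean of sale prices;
    the notebook may use a different measure.
    """
    by_zip = defaultdict(list)
    for zipcode, price in sales:
        by_zip[zipcode].append(price)
    # stdev needs at least two observations, so smaller groups are skipped.
    scores = {
        zipcode: stdev(prices) / mean(prices)
        for zipcode, prices in by_zip.items()
        if len(prices) >= 2
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical zipcode labels and prices, for illustration only.
toy_sales = [
    ("zip_A", 300_000), ("zip_A", 310_000), ("zip_A", 305_000),
    ("zip_B", 200_000), ("zip_B", 600_000), ("zip_B", 400_000),
]

ranking = rank_by_fluctuation(toy_sales)
for zipcode, score in ranking:
    print(f"{zipcode}: {score:.3f}")
```

Dividing by the mean keeps expensive and inexpensive areas comparable, which a raw standard deviation would not.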

What to buy?

Instead of specific buying recommendations, we decided to propose the following methodology to our (fictional) client:

Start with the most affordable house with at least 3 bedrooms and 2 bathrooms.
Ask yourself: would you be willing to pay more for a neighborhood with lower fluctuation?

The first five results of this procedure are shown in the table below. The least expensive option is a house with ID 15796 in Rainier Beach with 5 bedrooms for 133,000 USD. Note that improving on the neighborhood can mean compromising on other aspects.

house_id	price	bedrooms	bathrooms	sqft_living
7129304540	133000	5	2	1430
1823049182	147400	3	2	1080
2976800749	150000	4	2	1460
3356403304	154000	3	3	1530
7129300595	158000	3	2	1090
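
The first step of the procedure (filter by minimum rooms, then sort by price) can be sketched as follows. The five table rows are reused as input, plus one clearly hypothetical row that fails the filter. The function `candidates` is an illustrative helper, not the notebook's code, and the second step (weighing neighborhood fluctuation) is left to the buyer and not modeled here.

```python
# The five rows from the table, plus one hypothetical row to show the filter.
houses = [
    {"house_id": 7129304540, "price": 133_000, "bedrooms": 5, "bathrooms": 2, "sqft_living": 1430},
    {"house_id": 1823049182, "price": 147_400, "bedrooms": 3, "bathrooms": 2, "sqft_living": 1080},
    {"house_id": 2976800749, "price": 150_000, "bedrooms": 4, "bathrooms": 2, "sqft_living": 1460},
    {"house_id": 3356403304, "price": 154_000, "bedrooms": 3, "bathrooms": 3, "sqft_living": 1530},
    {"house_id": 7129300595, "price": 158_000, "bedrooms": 3, "bathrooms": 2, "sqft_living": 1090},
    # HYPOTHETICAL row (not in the dataset): cheaper, but too few rooms.
    {"house_id": 0, "price": 120_000, "bedrooms": 2, "bathrooms": 1, "sqft_living": 900},
]


def candidates(rows, min_bedrooms=3, min_bathrooms=2):
    """Return rows meeting the room minimums, cheapest first."""
    eligible = [
        row for row in rows
        if row["bedrooms"] >= min_bedrooms and row["bathrooms"] >= min_bathrooms
    ]
    return sorted(eligible, key=lambda row: row["price"])


shortlist = candidates(houses)
cheapest = shortlist[0]
print(cheapest["house_id"], cheapest["price"])
```

From the cheapest candidate, the buyer can walk down the shortlist and decide at each step whether a calmer neighborhood justifies the higher price.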

Environment Setup

This repo contains a requirements.txt file with a list of all the packages and dependencies you will need.

Before you can use plotly in Jupyter Lab, you have to install Node.js (if you haven't done so already). Check your Node version by running the following command:

node -v

If you haven't installed it yet, begin at Step_1. Otherwise, proceed to Step_2.

`macOS`: type the following commands:

Step_1: Update Homebrew and install Node with the following commands:

brew update
brew install node

Step_2: Create the virtual environment and install the required packages with the following commands:

pyenv local 3.11.3
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

`Windows`: type the following commands:

Step_1: Update Chocolatey and install Node with the following commands:

choco upgrade chocolatey
choco install nodejs

Step_2: Create the virtual environment and install the required packages with the following commands.

For the PowerShell CLI:

pyenv local 3.11.3
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements.txt

For the Git Bash CLI:

pyenv local 3.11.3
python -m venv .venv
source .venv/Scripts/activate
pip install --upgrade pip
pip install -r requirements.txt

Note: If you encounter an error when running pip install --upgrade pip, try the following command instead:

python.exe -m pip install --upgrade pip
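
After setup, a quick sanity check can confirm that the virtual environment is active and that Node is on the PATH. The script below is an optional sketch using only the standard library. The function names are illustrative and are not part of this repo.

```python
import shutil
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def environment_report():
    """Collect the facts the setup steps are meant to establish."""
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "in_virtualenv": in_virtualenv(),
        "node_path": shutil.which("node"),  # None if node is not on PATH
    }


report = environment_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

If `in_virtualenv` is False, the environment has probably not been activated in the current shell. If `node_path` is None, Node is not installed or not on the PATH.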
