Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unit test values for COD, PRD, and PRB based on known good IAAO values #26

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Empty file added .Rhistory
Empty file.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@

AssessPy is a software package for Python developed by the Cook County Assessor’s (CCAO)
Data Department. It contains many of the functions necessary to perform a standard
[sales ratio study](https://www.iaao.org/media/standards/Standard_on_Ratio_Studies.pdf).
[sales ratio study](https://www.iaao.org/wp-content/uploads/Standard_on_Ratio_Studies.pdf).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per @jeancochrane, this link is dead, so I updated to the new one.


For assessors, we believe that this package will reduce the complexity of calculating
ratio statistics and detecting sales chasing. We also hope that reporters, taxpayers,
Expand Down
5 changes: 4 additions & 1 deletion assesspy/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,10 @@
prb_ci,
prd_ci,
)
from .load_data import ccao_sample, quintos_sample
from .load_data import (
ccao_sample,
quintos_sample,
)
from .metrics import (
cod,
cod_met,
Expand Down
15 changes: 15 additions & 0 deletions assesspy/tests/conftest.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,13 @@
import pathlib

import numpy as np
import pandas as pd
import pytest as pt

import assesspy as ap

FIXTURE_DIR = pathlib.Path(__file__).parent / "fixtures"


@pt.fixture(autouse=True, scope="function")
def set_seed() -> None:
Expand All @@ -23,6 +27,17 @@ def quintos_data() -> tuple:
return sample.estimate, sample.sale_price


@pt.fixture(scope="session", params=["1_1", "1_4", "d_1", "d_2"])
def iaao_data_name(request):
return request.param


@pt.fixture(scope="session")
def iaao_data(iaao_data_name) -> tuple:
sample = pd.read_csv(FIXTURE_DIR / f"iaao_table_{iaao_data_name}.csv")
return sample.estimate, sample.sale_price


@pt.fixture(
scope="session",
params=[
Expand Down
37 changes: 37 additions & 0 deletions assesspy/tests/fixtures/iaao_table_1_1.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
estimate,sale_price
48000,138000
28800,59250
78400,157500
39840,74400
68160,114900
94400,159000
67200,111900
56960,93000
87200,138720
38240,59700
96320,146400
67680,99000
32960,47400
50560,70500
61360,78000
47360,60000
58080,69000
47040,55500
136000,154500
103200,109500
59040,60000
168000,168000
128000,124500
132000,127500
160000,150000
160000,141000
200000,171900
184000,157500
160000,129600
157200,126000
99200,77700
200000,153000
64000,48750
192000,144000
190400,141000
65440,48000
18 changes: 18 additions & 0 deletions assesspy/tests/fixtures/iaao_table_1_4.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
estimate,sale_price
87200,138720
38240,59700
96320,146400
68610,99000
32960,47400
50560,70500
61360,78000
47360,60000
56580,69000
47040,55500
136000,154500
98000,109500
56000,60000
159100,168000
128000,124500
132000,127500
160000,150000
26 changes: 26 additions & 0 deletions assesspy/tests/fixtures/iaao_table_d_1.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
estimate,sale_price
116700,114500
130300,121000
130200,133900
145500,139000
134100,145000
153900,156500
143400,161100
156900,169500
169000,175000
149200,181000
160100,188900
191400,205000
177200,216150
205500,219000
206500,235000
243800,249000
211600,258900
242500,263000
258400,305900
265900,312500
305700,336000
291600,360000
312800,399900
352200,418500
354900,459000
17 changes: 17 additions & 0 deletions assesspy/tests/fixtures/iaao_table_d_2.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
estimate,sale_price
45000,50000
100000,100000
165000,150000
180000,200000
250000,250000
330000,300000
315000,350000
400000,400000
495000,450000
450000,500000
550000,550000
660000,600000
585000,650000
700000,700000
825000,750000
1875000,2500000
39 changes: 36 additions & 3 deletions assesspy/tests/test_metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,15 +14,48 @@ def metric_val(self, metric, ccao_data, quintos_data):
return getattr(ap, metric)(*quintos_data)
return getattr(ap, metric)(*ccao_data)

def test_metric_value_is_correct(self, metric, metric_val):
def test_metric_value_is_correct_ccao(self, metric, metric_val):
expected = {
"cod": 17.81456901196891,
"prd": 1.0484192615223522,
"prb": 0.0024757,
"mki": 0.794,
"ki": -0.06,
}
assert pt.approx(metric_val, rel=0.02) == expected[metric]
assert pt.approx(metric_val, rel=0.01) == expected[metric]

def test_metric_value_is_correct_iaao(
self, metric, iaao_data_name, iaao_data
):
if metric in ["mki", "ki"]:
return None
else:
result = getattr(ap, metric)(*iaao_data)
expected = {
"1_1": {
"cod": 29.8,
"prd": 0.98,
"prb": 0.232,
},
"1_4": {
"cod": 14.5,
"prd": 0.98,
"prb": 0.135,
},
"d_1": {
"cod": 7.5,
"prd": 1.027,
"prb": -0.120,
},
"d_2": {
"cod": 7.8,
"prd": 1.056,
"prb": -0.011,
},
}
Comment on lines +34 to +55

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Thought, non-blocking] If we did decide to factor out an iaao_table_name fixture, it might make sense to move these definitions into that fixture so that the table names and their expected values are defined in one place. But again, I think this is probably fine, especially if we don't anticipate needing to reference these values in other tests.

assert (
pt.approx(result, rel=0.02) == expected[iaao_data_name][metric]
)

def test_metric_has_numeric_output(self, metric_val):
assert type(metric_val) is float
Expand All @@ -38,7 +71,7 @@ def test_metric_succeeds_on_good_input(self, metric, good_input):

def test_metric_met_function_thresholds(self, metric, metric_val):
if metric == "ki":
pt.skip("Skipping test for 'ki' metric (ki_met does not exist)")
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not actually skipping a test, we just don't even need to run it. Switching to this has the advantage of not showing a "N skipped tests" warning in the pytest results.

expected = {
"cod": False,
"prd": False,
Expand Down
28 changes: 15 additions & 13 deletions assesspy/tests/test_sales_chasing.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,33 +13,37 @@ def sample_dist(self, ccao_data):
return ratio

@pt.fixture(params=["normal", "chased", "sample"])
def distribution(self, request, ccao_data, sample_dist):
return request.param, {
def distribution_name(self, request):
return request.param

@pt.fixture
def distribution(self, distribution_name, ccao_data, sample_dist):
return {
"normal": np.random.normal(1, size=1000).tolist(),
"chased": np.append(
np.random.normal(1, 0.15, 900), [1] * 100
).tolist(),
"sample": sample_dist,
}[request.param]
}[distribution_name]

@pt.fixture(params=["cdf", "dist", "both"])
def method(self, request):
return request.param

def test_is_sales_chased_output_is_boolean(self, distribution, method):
dist_name, dist_data = distribution
assert isinstance(ap.is_sales_chased(dist_data, method), bool)
assert isinstance(ap.is_sales_chased(distribution, method), bool)

def test_is_sales_chased_has_expected_output(self, distribution, method):
dist_name, dist_data = distribution
def test_is_sales_chased_has_expected_output(
self, distribution_name, distribution, method
):
expected = {
"normal": {"cdf": False, "dist": False, "both": False},
"chased": {"cdf": True, "dist": True, "both": True},
"sample": {"cdf": False, "dist": True, "both": False},
}
assert (
ap.is_sales_chased(dist_data, method)
== expected[dist_name][method]
ap.is_sales_chased(distribution, method)
== expected[distribution_name][method]
)

@pt.mark.parametrize(
Expand All @@ -61,13 +65,11 @@ def test_is_sales_chased_raises_on_invalid_values(
self, input_data, distribution, method
):
with pt.raises(Exception):
dist_name, dist_data = distribution
ap.is_outlier(input_data(dist_data), method)
ap.is_outlier(input_data(distribution), method)

def test_is_sales_chased_raises_on_invalid_method(self, distribution):
with pt.raises(Exception):
dist_name, dist_data = distribution
ap.is_sales_chased(dist_data, method="hug")
ap.is_sales_chased(distribution, method="hug")

def test_is_sales_chased_warns_on_small_sample(self):
with pt.warns(UserWarning):
Expand Down
18 changes: 8 additions & 10 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -22,16 +22,14 @@ AssessPy package
Source Code <https://github.com/ccao-data/assesspy>

AssessPy is a software package for Python developed by the Cook County
Assessor's (CCAO) Data Department. The codebase for the CCAO's CAMA system
uses a wide range of functions regularly, and packaging these functions
streamlines and standardizes their use. The CCAO is publishing this package
to make it available to assessors, reporters, and citizens everywhere.

For assessors, we believe that this package will reduce the complexity
of calculating ratio statistics and detecting sales chasing. We also
believe that reporters, taxpayers, and members of academia will find
this package helpful in monitoring the performance of local assessors
and conducting research.
Assessor’s (CCAO) Data Department. It contains many of the functions necessary
to perform a standard
`sales ratio study <https://www.iaao.org/wp-content/uploads/Standard_on_Ratio_Studies.pdf>`_

For assessors, we believe that this package will reduce the complexity of
calculating ratio statistics and detecting sales chasing. We also hope that
reporters, taxpayers, and members of academia will find this package helpful
in monitoring the performance of local assessors and conducting research.
Comment on lines +25 to +32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This text was out-of-date compared to the README, so just quickly copied it over.


For detailed documentation on included functions and data, :doc:`visit the
full reference list <reference>`.
Expand Down