Add climatology features #38
base: main
Before merging we will need to update the workflow to ensure this works.
@dnerini could you have a look at the code in its current state? I moved the calculation of the rolling mean to mlpp-features, as discussed.
…or that day window
@pytest.fixture
def clim_dataset():
    """Create climatology dataset as if loaded from zarr files, still unprocessed."""

    def _data():

        variables = [
            "cloud_area_fraction",
        ]

        stations = _stations_dataframe()
        times = pd.date_range("2000-01-01T00", "2000-01-02T00", freq="1h")

        n_times = len(times)
        n_stations = len(stations)

        var_shape = (n_times, n_stations)
        ds = xr.Dataset(
            None,
            coords={
                "time": times,
                "station": stations.index,
                "longitude": ("station", stations.longitude),
                "latitude": ("station", stations.latitude),
                "height_masl": ("station", stations.height_masl),
                "owner_id": ("station", np.random.randint(1, 5, stations.shape[0])),
                "pole_height": ("station", np.random.randint(5, 15, stations.shape[0])),
                "roof_height": ("station", np.zeros(stations.shape[0])),
            },
        )
        for var in variables:
            measurements = np.random.randn(*var_shape)
            nan_idx = [np.random.randint(0, d, 60) for d in var_shape]
            measurements[nan_idx[0], nan_idx[1]] = np.nan
            ds[var] = (("time", "station"), measurements)
        return ds

    return _data
If we move to using the "obs" part of the variable dictionary, we likely don't need this anymore.
try:
    rolling_mean_day = (
        rolling_mean_hour.where(
            rolling_mean_hour["dayofyear"].isin(days_range), drop=True
        )
        .groupby("time.hour")
        .mean()
    )
except ValueError as e:
    if "hour must not be empty" in str(e):
        days_list.remove(day)
        continue
    else:
        raise e
This is a workaround to make the pytest pass.
Without it, the test fails if one of the day windows is empty.
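A possible alternative to matching on the error message is to check whether a day window holds any data before averaging. The sketch below illustrates the idea in plain Python rather than xarray, so it is not the code in this PR: `day_window`, `hourly_climatology`, the window half-width of 5 days, and the 366-day wraparound are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from math import isnan
from statistics import fmean


def day_window(day, half_width, n_days=366):
    """Return the days of year within +/- half_width of `day`, wrapping at year end."""
    return {(day - 1 + k) % n_days + 1 for k in range(-half_width, half_width + 1)}


def hourly_climatology(records, days, half_width=5):
    """Average (datetime, value) records by hour of day for each day window.

    Days whose window contains no valid data are skipped instead of raising.
    """
    result = {}
    for day in days:
        window = day_window(day, half_width)
        by_hour = defaultdict(list)
        for time, value in records:
            # Keep only non-NaN values whose day of year falls in the window.
            if time.timetuple().tm_yday in window and not isnan(value):
                by_hour[time.hour].append(value)
        if not by_hour:
            continue  # empty window: nothing to average for this day
        result[day] = {hour: fmean(vals) for hour, vals in sorted(by_hour.items())}
    return result


# Hourly records from 2000-01-01T00 to 2000-01-02T00, the same time range as the fixture.
start = datetime(2000, 1, 1)
records = [(start + timedelta(hours=i), float(i)) for i in range(25)]
climatology = hourly_climatology(records, days=[1, 100, 366])
print(sorted(climatology))  # day 100 is absent because its window has no data
```

With data covering only 1 and 2 January, the windows around days 1 and 366 both contain data thanks to the wraparound, while the window around day 100 is empty and is skipped. An explicit emptiness check of this kind avoids depending on the exact wording of a library error message.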