Connect scoringutils To forecasttools #9

Open
AFg6K7h4fhy2 opened this issue Oct 7, 2024 · 5 comments · May be fixed by #21
Assignees
Labels
feature A new tool or utility being added the package or code-base. first-pass A first-pass at a specific task; typically evolves into something more substantial later. Medium Priority

Comments

@AFg6K7h4fhy2
Collaborator

AFg6K7h4fhy2 commented Oct 7, 2024

This depends on #30 and #28.

The scope of this PR includes converting a forecast idata with a time representation to a ScoringUtils-digestible parquet file.

@AFg6K7h4fhy2 AFg6K7h4fhy2 self-assigned this Oct 7, 2024
@AFg6K7h4fhy2 AFg6K7h4fhy2 added this to the [October 14, October 25] milestone Oct 11, 2024
@AFg6K7h4fhy2 AFg6K7h4fhy2 linked a pull request Oct 15, 2024 that will close this issue
@AFg6K7h4fhy2 AFg6K7h4fhy2 added feature A new tool or utility being added the package or code-base. first-pass A first-pass at a specific task; typically evolves into something more substantial later. labels Oct 21, 2024
@AFg6K7h4fhy2
Collaborator Author

Is the (Hubverse submission dataframe → ScoringUtils-ready dataframe) conversion not deemed a priority?
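
For reference, a minimal sketch of what that conversion could look like. It uses plain Python rows to stay dependency-free; the same renames would carry over to a polars dataframe. The Hubverse columns `output_type`, `output_type_id`, and `value` are standard, but the task-ID column names (`reference_date`, `target`, `horizon`, `target_end_date`, `location`), the helper name `hubverse_to_scoring_rows`, and the `observed_lookup` structure are assumptions for illustration, and the `mean` row is made up to show the filter.

```python
def hubverse_to_scoring_rows(submission_rows, observed_lookup, model_name):
    """Map Hubverse quantile rows onto ScoringUtils-style column names.

    Hypothetical helper: not part of forecasttools or any Hubverse package.
    """
    scoring_rows = []
    for row in submission_rows:
        # Only quantile forecasts fit a quantile-based forecast table.
        if row["output_type"] != "quantile":
            continue
        key = (row["location"], row["target_end_date"], row["target"])
        scoring_rows.append(
            {
                "model": model_name,
                "location": row["location"],
                "forecast_date": row["reference_date"],
                "target_end_date": row["target_end_date"],
                "target_type": row["target"],
                "horizon": row["horizon"],
                # Hubverse stores the quantile level in output_type_id.
                "quantile_level": float(row["output_type_id"]),
                "predicted": row["value"],
                # None when no observation is available for this target.
                "observed": observed_lookup.get(key),
            }
        )
    return scoring_rows


# Shared task-ID values (assumed column names).
task = {
    "reference_date": "2021-07-12",
    "target": "Deaths",
    "horizon": 2,
    "target_end_date": "2021-07-24",
    "location": "AL",
}
submission = [
    {**task, "output_type": "quantile", "output_type_id": "0.975", "value": 611},
    {**task, "output_type": "quantile", "output_type_id": "0.99", "value": 719},
    # Made-up non-quantile row, dropped by the filter above.
    {**task, "output_type": "mean", "output_type_id": None, "value": 500},
]
observed = {("AL", "2021-07-24", "Deaths"): 78}

rows = hubverse_to_scoring_rows(submission, observed, "epiforecasts-EpiNow2")
```

Keeping the observations in a separate lookup reflects that a Hubverse submission carries predictions only; the observed values have to be joined in from another source.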

@seabbs

seabbs commented Oct 23, 2024

Noting you can do this via HubEval if you want

@AFg6K7h4fhy2
Collaborator Author

AFg6K7h4fhy2 commented Oct 23, 2024

> Noting you can do this via HubEval if you want

Hadn't seen; thank you, Sam.

Had been thinking mostly in terms of something like the following:

import polars as pl

# Example rows in a ScoringUtils-style quantile layout; None marks missing values.
data = {
    "location": ["DE", "DE", "AL", "AL"],
    "forecast_date": ["2021-01-01", "2021-01-01", "2021-07-12", "2021-07-12"],
    "target_end_date": ["2021-01-02", "2021-01-02", "2021-07-24", "2021-07-24"],
    "target_type": ["Cases", "Deaths", "Deaths", "Deaths"],
    "model": [None, None, "epiforecasts-EpiNow2", "epiforecasts-EpiNow2"],
    "horizon": [None, None, 2, 2],
    "quantile_level": [None, None, 0.975, 0.990],
    "predicted": [None, None, 611, 719],
    "observed": [127300, 4534, 78, 78],
}

# Convert data to a pl.DataFrame, then write it to forecasts_to_score.parquet.
df = pl.DataFrame(data)
df.write_parquet("forecasts_to_score.parquet")

Then in R, something akin to

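library(arrow)         # provides read_parquet(); assumed reader
library(scoringutils)  # provides as_forecast_quantile()

df <- read_parquet("forecasts_to_score.parquet")

forecast_quantile <- df |>
  as_forecast_quantile(
    forecast_unit = c(
      <insert col names>
    )
  )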

Would appreciate an examination of this workflow by @SamuelBrand1 and @dylanmorris.
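
As a possible guard before writing the parquet file, a small check on the Python side could flag tables that are unlikely to pass on the R side. This is only a sketch: `check_quantile_table` is a hypothetical helper, and it assumes the quantile forecast format needs `observed`, `predicted`, and `quantile_level` columns with levels in [0, 1]; missing values (`None`) are tolerated rather than flagged.

```python
REQUIRED_COLUMNS = ("observed", "predicted", "quantile_level")


def check_quantile_table(data):
    """Return a list of problems found in a column-oriented dict of lists."""
    problems = []
    # Each required column must be present.
    for name in REQUIRED_COLUMNS:
        if name not in data:
            problems.append(f"missing column: {name}")
    # All columns must have the same number of rows.
    lengths = {len(values) for values in data.values()}
    if len(lengths) > 1:
        problems.append("columns have unequal lengths")
    # Quantile levels, where given, must be probabilities.
    for i, level in enumerate(data.get("quantile_level", [])):
        if level is not None and not 0 <= level <= 1:
            problems.append(f"row {i}: quantile_level {level} outside [0, 1]")
    return problems


example = {
    "quantile_level": [None, None, 0.975, 0.990],
    "predicted": [None, None, 611, 719],
    "observed": [127300, 4534, 78, 78],
}
ok = check_quantile_table(example)

# Deliberately broken table: no observed column and an invalid level.
bad = check_quantile_table({"quantile_level": [1.5], "predicted": [1]})
```

Here `ok` is an empty list, while `bad` reports the missing `observed` column and the out-of-range level.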

@AFg6K7h4fhy2
Collaborator Author

There are still likely considerations for ScoringUtils 2.0 that need to be accounted for in this PR.

@AFg6K7h4fhy2
Collaborator Author

Also, this PR partially depends on the utilities featured in #34.
