Added group_by parameter to util_corr_fit() #69
base: main
…r_triangle function
…nces where the group_by variable(s) are different for the synthetic and actual data
- What happens if a group exists in one data set but not the other data set?
- I like how you restructured the output of util_corr(), but it broke ALL of the tests. We need to rewrite the tests to reference the new output structure.
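One way to explore the first question is to compare the distinct group keys of the two data sets before any metrics are computed. This is a minimal sketch, assuming both inputs are data frames that share the grouping columns. `find_unmatched_groups()`, its arguments, and the example data are hypothetical and not part of this PR.

```r
# Hypothetical helper: report groups that appear in only one data set.
# `group_vars` is a character vector of grouping column names.
find_unmatched_groups <- function(data, synthetic_data, group_vars) {
  data_groups <- dplyr::distinct(
    dplyr::select(data, dplyr::all_of(group_vars))
  )
  synth_groups <- dplyr::distinct(
    dplyr::select(synthetic_data, dplyr::all_of(group_vars))
  )

  list(
    # groups in the original data with no match in the synthetic data
    only_in_data = dplyr::anti_join(data_groups, synth_groups, by = group_vars),
    # groups in the synthetic data with no match in the original data
    only_in_synthetic = dplyr::anti_join(synth_groups, data_groups, by = group_vars)
  )
}

# Made-up example data: group "b" is missing from the synthetic data,
# and group "c" is missing from the original data.
original <- data.frame(state = c("a", "a", "b"), x = c(1, 2, 3))
synthetic <- data.frame(state = c("a", "c"), x = c(1.5, 2.5))

unmatched <- find_unmatched_groups(original, synthetic, "state")
```

If either element of the returned list has rows, the function under review could stop with an informative error or drop the unmatched groups. Which behavior is right is a design decision for this PR.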
R/util_corr_fit.R
Outdated
# reorder data names
# reorder data names (this appears to check if the variables are the same)
# issue when the groups in the synthetic data do not match the groups in the og data, and vice versa
"og data" may be a little casual for our roxygen headers...
R/util_corr_fit.R
Outdated
dplyr::group_split(dplyr::across({{ group_by }}))

groups <- lapply(data, function(x) dplyr::select(x, {{ group_by }}) |>
slice(1))
dplyr::slice() instead of just slice().
Can you replace this with count(data, groups)?
This code adds the group_by variables to the final datasets. I can add additional code to add the Ns to the metric data. I need to think more about how to add them to the corr_data dataset.
count(data, {{ group_by }}) will return a data frame with the groups and the frequency of the groups that you can plug into bind_cols() below.
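A small sketch of that suggestion, assuming a single bare grouping column is passed through `{{ }}`. The wrapper `count_groups()`, the example data, and the placeholder metric values are hypothetical and only illustrate the shape of the result.

```r
# Hypothetical wrapper showing how `{{ group_by }}` is forwarded to count().
count_groups <- function(data, group_by) {
  dplyr::count(data, {{ group_by }})
}

# Made-up example data with two groups.
example_data <- data.frame(
  state = c("a", "a", "b"),
  x = c(1, 2, 3),
  y = c(2, 4, 7)
)

# One row per group; the group size is stored in column `n`.
group_counts <- count_groups(example_data, state)

# Placeholder per-group metrics (illustrative values, not real results).
# This assumes the metrics are in the same group order as `group_counts`.
metrics <- data.frame(correlation_fit = c(0.1, 0.2))

combined <- dplyr::bind_cols(group_counts, metrics)
```

Because `bind_cols()` matches rows by position, this only works if the per-group results are ordered the same way as the output of `count()`. A join on the grouping columns would avoid that assumption.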
R/util_corr_fit.R
Outdated
return(list(
corr_data,
metrics
))
return(
  list(
    corr_data,
    metrics
  )
)
R/util_corr_fit.R
Outdated
data <- dplyr::select(data, names(synthetic_data))

synthetic_data <- dplyr::select(synthetic_data, dplyr::where(is.numeric), {{ group_by }}) |>
We're still using %>% instead of |> for now to make sure the code is backwards compatible with R < 4.1.0 (the native pipe was introduced in R 4.1.0).
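For illustration, the flagged `select()` step could be written with the magrittr pipe roughly as follows. This is a sketch on made-up data, with a fixed `state` column standing in for `{{ group_by }}`. Inside the package, `%>%` would normally be imported rather than attached with `library()`.

```r
library(magrittr) # provides %>%, which also works on older R versions

# Made-up stand-in for the synthetic data.
synthetic_example <- data.frame(
  state = c("a", "b"),
  x = c(1, 2),
  label = c("u", "v")
)

# Keep the numeric columns plus the grouping column, as in the diff above.
numeric_and_group <- synthetic_example %>%
  dplyr::select(dplyr::where(is.numeric), state)
```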
difference = .data$original - .data$synthetic,
proportion_difference = .data$difference / .data$original)

correlation_data <- bind_cols(correlation_data, groups)
dplyr::bind_cols()
R/util_corr_fit.R
Outdated
correlation_fit = map_dbl(results, "correlation_fit"),
correlation_difference_mae = map_dbl(results, "correlation_difference_mae"),
correlation_difference_rmse = map_dbl(results, "correlation_difference_rmse"),
purrr::map_dbl()
R/util_corr_fit.R
Outdated
correlation_fit = map_dbl(results, "correlation_fit"),
correlation_difference_mae = map_dbl(results, "correlation_difference_mae"),
correlation_difference_rmse = map_dbl(results, "correlation_difference_rmse"),
bind_rows(groups)
dplyr::bind_rows()
R/util_corr_fit.R
Outdated
bind_rows(groups)
)

corr_data <- dplyr::bind_rows(map_dfr(results, "correlation_data"))
purrr::map_dfr()
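Taken together, the namespacing comments on this part of the diff might look like the sketch below. The `results` list is a hypothetical stand-in shaped like the per-group output in the diff, and its values are placeholders rather than real metrics.

```r
# Hypothetical per-group results: one list element per group.
results <- list(
  list(
    correlation_fit = 0.10,
    correlation_data = data.frame(var1 = "x", var2 = "y", difference = 0.01)
  ),
  list(
    correlation_fit = 0.25,
    correlation_data = data.frame(var1 = "x", var2 = "y", difference = -0.03)
  )
)

# Extract one scalar per group with an explicitly namespaced call.
metrics <- data.frame(
  correlation_fit = purrr::map_dbl(results, "correlation_fit")
)

# map_dfr() extracts each element and row-binds the pieces itself.
corr_data <- purrr::map_dfr(results, "correlation_data")
```

Since `purrr::map_dfr()` already returns a single row-bound data frame, the outer `dplyr::bind_rows()` in the diff appears to be redundant, though it is harmless.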
No description provided.