Resilient seeds for batched replication #113

Merged
merged 14 commits into from
Oct 18, 2022

Conversation

wlandau
Member

@wlandau wlandau commented Oct 18, 2022

Prework

Related GitHub issues and pull requests

Summary

In targets, each target's default pseudo-random number generator seed is deterministic and depends on the target's name. So far, tarchetypes target factories like tar_rep() and tar_map_rep() have applied these default seeds to entire batches of replications. As an undesirable consequence, seed assignment changed whenever the batching structure changed, i.e. when batches and reps changed while the total number of replications, batches * reps, stayed constant.
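The problem can be illustrated in plain R, without targets. This is only a sketch: `name_seed()` and `run_batch_seeded()` are hypothetical helpers, and the hash is a stand-in, not the scheme targets actually uses to derive seeds from names.

```r
# Hypothetical stand-in for a name-derived seed; not the hashing targets uses.
name_seed <- function(name) {
  hash <- 0
  for (code in utf8ToInt(name)) {
    # Simple polynomial rolling hash, kept within the integer range.
    hash <- (hash * 31 + code) %% 2147483647
  }
  as.integer(hash)
}

# Batch-level seeding: one seed per batch, set once before its replicates run.
run_batch_seeded <- function(name, batches, reps) {
  unlist(lapply(seq_len(batches), function(batch) {
    set.seed(name_seed(paste0(name, "_", batch)))
    rnorm(reps)
  }))
}

# Same total number of replications (6), different batching structure.
few_large <- run_batch_seeded("x", batches = 2, reps = 3)
many_small <- run_batch_seeded("x", batches = 3, reps = 2)
identical(few_large, many_small)
```

The two vectors agree only for the first replicates of the first batch. After that, the draws come from different positions in different seeded streams, so the results diverge even though batches * reps is the same.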

This PR assigns a dedicated seed to each replicate. These new seeds depend on the parent target name and the replicate's overall index across all batches (the total rep index). As long as batches * reps remains constant, the seeds do not change when the batching structure changes. For example, tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...) now has the same output as tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...). Seeds are available in the output of most target factories through the "tar_seed" column.
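A minimal base-R sketch of the idea, assuming a made-up hash: `rep_seed()` and `run_batched()` are hypothetical helpers for illustration and do not reproduce how tarchetypes computes its seeds internally.

```r
# Hypothetical helper: derive an integer seed from the parent target name
# and the overall replicate index. The hashing scheme is an illustration only.
rep_seed <- function(name, index) {
  hash <- 0
  for (code in utf8ToInt(paste0(name, "_", index))) {
    # Simple polynomial rolling hash, kept within the integer range.
    hash <- (hash * 31 + code) %% 2147483647
  }
  as.integer(hash)
}

# Run `batches` batches of `reps` replicates each, seeding every replicate
# by its overall index rather than by its batch.
run_batched <- function(name, batches, reps) {
  out <- numeric(0)
  for (batch in seq_len(batches)) {
    for (rep in seq_len(reps)) {
      index <- (batch - 1L) * reps + rep # overall replicate index
      set.seed(rep_seed(name, index))
      out <- c(out, rnorm(1))
    }
  }
  out
}

a <- run_batched("x", batches = 100, reps = 1)
b <- run_batched("x", batches = 10, reps = 10)
identical(a, b) # TRUE
```

Because the seed depends only on the name and the overall index, regrouping the same 100 replicates into different batches leaves every draw unchanged.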

The affected functions are:

  • tar_rep()
  • tar_rep2()
  • tar_map_rep()
  • tar_map2_count()
  • tar_map2_size()
  • tar_render_rep()
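As a rough sketch of what this looks like for one of the functions above, here is a small _targets.R file using tar_rep(). The target name `x` and the column name `value` are made up for illustration; the command returns a data frame so that the per-replicate columns can be attached to each row.

```r
# _targets.R (illustrative; target name `x` and column `value` are made up)
library(targets)
library(tarchetypes)

list(
  tar_rep(
    name = x,
    command = data.frame(value = rnorm(1)),
    batches = 10, # compare with batches = 100, reps = 1
    reps = 10
  )
)
```

After running the pipeline with tar_make(), tar_read(x) should return one row per replicate. Per this PR, the output should include a "tar_seed" column, and the values should stay the same if the 100 replicates are regrouped into a different batching structure.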

For tar_map2_count() and tar_map2_size(), it is also possible to generate your own seeds in command1 and use them in command2. Similarly, you can supply seeds to tar_render_rep() via the params argument and then use them in the R Markdown report, and to tar_quarto_rep() via execute_params. tar_quarto_rep() is ready for seed resilience on the tarchetypes side, but Quarto itself does not currently have a flag to set seeds.

@wlandau wlandau merged commit 74969b2 into main Oct 18, 2022
@wlandau wlandau deleted the 111 branch October 18, 2022 15:33