Full Parallel Environment Runner does not handle return_after_episode_num as expected/optimally #8

Open
SamNPowers opened this issue May 12, 2023 · 0 comments

Comments

@SamNPowers
Collaborator

Each Batch runner spun up by FullParallel attempts to run the full set of return_after_episode_num episodes, and task_base then just takes the first n that come back. This:
a. means more episodes are run than necessary
b. may prioritize faster episodes: e.g. if one parallel process gets 6 short episodes (such as failures), it returns first and its episodes are taken ahead of those from longer runs, biasing the results
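
To make (a) and (b) concrete, here is a toy model of the behavior, not the actual runner code. `Episode`, `first_n_to_finish`, and the durations and returns are all made up for illustration; the only assumption carried over from the description is that every worker runs the full count and the first n episodes to arrive are kept.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    duration: float  # hypothetical wall-clock cost of the episode
    ret: float       # hypothetical episode return


def first_n_to_finish(worker_batches, n):
    """Toy model: each worker runs its whole batch, batches arrive in
    order of total run time, and only the first n episodes are kept."""
    by_finish_time = sorted(
        worker_batches, key=lambda batch: sum(e.duration for e in batch)
    )
    arrived = [ep for batch in by_finish_time for ep in batch]
    return arrived[:n]


def mean_return(episodes):
    return sum(e.ret for e in episodes) / len(episodes)


n = 6
# One worker happens to get short failing episodes, the other long successes.
fast_failures = [Episode(duration=1.0, ret=0.0) for _ in range(n)]
slow_successes = [Episode(duration=10.0, ret=1.0) for _ in range(n)]

kept = first_n_to_finish([slow_successes, fast_failures], n)
all_episodes = fast_failures + slow_successes

episodes_run = len(all_episodes)      # 12 episodes run to report 6
biased = mean_return(kept)            # 0.0, only the fast failures were kept
unbiased = mean_return(all_episodes)  # 0.5 over everything that was run
```

In this extreme case twice as many episodes are run as are used, and the reported mean comes entirely from the worker that finished first.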

(This is why current eval is done sequentially, but it is not optimal.)
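
One possible direction, sketched under assumptions rather than taken from the codebase: split return_after_episode_num into fixed per-worker quotas up front and keep everything each worker returns, so nothing is selected by arrival order. `split_episode_quota`, `run_worker`, and `collect` are hypothetical names, and `run_worker` is a placeholder for whatever a Batch runner actually does.

```python
from concurrent.futures import ThreadPoolExecutor


def split_episode_quota(total_episodes, num_workers):
    """Divide the episode budget as evenly as possible across workers."""
    base, extra = divmod(total_episodes, num_workers)
    return [base + (1 if i < extra else 0) for i in range(num_workers)]


def run_worker(args):
    """Placeholder worker: returns one (worker_id, episode_index) per episode."""
    worker_id, quota = args
    return [(worker_id, i) for i in range(quota)]


def collect(total_episodes, num_workers):
    """Run each worker for exactly its quota and keep every result."""
    quotas = split_episode_quota(total_episodes, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_worker = list(pool.map(run_worker, enumerate(quotas)))
    return [ep for batch in per_worker for ep in batch]


quotas = split_episode_quota(10, 4)  # [3, 3, 2, 2]
episodes = collect(10, 4)            # exactly 10 episodes, none discarded
```

The trade-off is that total wall-clock time is then bounded by the slowest worker, but no extra episodes are run and fast episodes get no priority.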
