Full Parallel Environment Runner does not handle return_after_episode_num as expected/optimally #8

SamNPowers · 2023-05-12T15:50:05Z

Each Batch runner spun up by FullParallel attempts to run the full return_after_episode_num episodes on its own; task_base then keeps only the first n results. This:
a. means more episodes are run than necessary
b. may prioritize faster episodes: if one parallel process gets 6 short episodes (e.g. failures), it returns first and is prioritized over longer runs, biasing the resulting data

(This is why eval is currently done sequentially, but running it sequentially is not optimal either.)
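
A minimal sketch of the bias and one possible way around it, assuming my reading of the behavior above is right. None of these names come from the repo: `split_episode_quota`, `first_batch_to_finish`, and `quota_per_worker` are hypothetical helpers, and episodes are modeled as plain `(duration, reward)` tuples rather than real environment rollouts. The idea is to give each worker a fixed share of return_after_episode_num up front, so no worker's results are dropped for being slow.

```python
from typing import List, Tuple

Episode = Tuple[float, float]  # (duration_seconds, reward); hypothetical stand-in


def split_episode_quota(total_episodes: int, num_workers: int) -> List[int]:
    """Divide total_episodes across num_workers as evenly as possible."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if total_episodes < 0:
        raise ValueError("total_episodes must be non-negative")
    base, remainder = divmod(total_episodes, num_workers)
    # The first `remainder` workers each take one extra episode.
    return [base + (1 if i < remainder else 0) for i in range(num_workers)]


def first_batch_to_finish(workers: List[List[Episode]], n: int) -> List[Episode]:
    """Model of the behavior described in this issue (an assumption):
    every worker runs n episodes and the batch with the shortest total
    duration is the one whose results get used."""
    batches = [episodes[:n] for episodes in workers]
    return min(batches, key=lambda batch: sum(d for d, _ in batch))


def quota_per_worker(workers: List[List[Episode]], n: int) -> List[Episode]:
    """Alternative: each worker runs only its share of the n episodes,
    and every worker's results are kept regardless of how long they took."""
    quotas = split_episode_quota(n, len(workers))
    collected: List[Episode] = []
    for episodes, quota in zip(workers, quotas):
        collected.extend(episodes[:quota])
    return collected


def mean_reward(episodes: List[Episode]) -> float:
    return sum(r for _, r in episodes) / len(episodes)


# Worker 0 only sees fast failures; worker 1 only sees slow successes.
workers = [
    [(1.0, 0.0)] * 4,
    [(10.0, 1.0)] * 4,
]
biased = first_batch_to_finish(workers, n=4)
balanced = quota_per_worker(workers, n=4)
```

In this toy setup, `first_batch_to_finish` runs 8 episodes in total and keeps only the 4 fast failures, while `quota_per_worker` runs 4 episodes and keeps 2 from each worker. A fixed split does assume episodes are interchangeable across workers; if workers can fail outright, the quota would need to be reassigned.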
