Questions regarding Simulation and Network Parallelism #21
Comments
I think it's better to train and simulate on all 5 GPUs; it doesn't cost much extra GPU memory. The bottleneck is mainly on the CPU.
For 5-GPU training, I also recommend using a larger batch size (#GPUs × the single-GPU batch size) and a larger n_steps (#GPUs × the single-GPU n_steps); otherwise multi-GPU training isn't really advantageous over a single-GPU setup. An example config for 5-GPU training is sketched below; when initiating experiments, also pass the corresponding GPU-id argument.
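A minimal sketch of this kind of scaling, taking the single-GPU values from the paper (batch_size 300, n_steps 2e4) as the baseline; the exact key names and their placement should be checked against the repo's PPO configs:

```python
# Sketch only: scale the per-iteration sample counts by the number of GPUs so
# that each worker keeps a single-GPU-sized workload. Key names and placement
# are assumptions, not copied from an actual config file.
num_gpus = 5

agent_cfg = dict(
    batch_size=300 * num_gpus,    # single-GPU minibatch size x number of GPUs
)
train_cfg = dict(
    n_steps=int(2e4) * num_gpus,  # samples per PPO iteration x number of GPUs
)
```

When launching, the ids of all five devices would then be passed, e.g. something like `--gpu-ids 0 1 2 3 4` (argument name inferred from `args.gpu_ids` in the code; double-check against the launcher's options).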
Hello,
I want to train a DAPG+PPO agent using 5 GPUs, with the same hyperparameters as specified in the ManiSkill2 paper (Appendix F.8, Table 9: Hyperparameters for DAPG+PPO).
My question is how I need to modify these hyperparameters given that I want to train on multiple GPUs.
Additionally, I would like advice on whether it is more efficient to train and simulate on all 5 GPUs, or whether it is more efficient to separate training and simulation so that I train on 2 GPUs and simulate on 2 different GPUs. In the code it is required that
len(args.sim_gpu_ids) == len(args.gpu_ids)
meaning I could not make use of the fifth GPU that I have if I were to separate training and simulation.

I would like to train with the following settings:
- 25e6 total environment steps (total_steps)
- 2e4 samples per PPO step (n_steps)
- 300 samples per minibatch (batch_size)
- 4 critic warm-up epochs (critic_warmup_epoch)
- 2 PPO update epochs (num_epoch)
- a replay buffer capacity of 2e4 (I believe it must be the same size as n_steps; correct me if I am wrong)
- a model checkpoint every 1e6 steps
- a final evaluation after the entire training is over
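To be concrete, my current (unverified) understanding is that these settings would map onto the config roughly as follows; the key names and their placement in agent_cfg / train_cfg / replay_cfg are my guesses, so please correct anything that is off:

```python
# My guess at how the settings above map onto a config; key names and their
# placement are assumptions and have not been verified against the repository's
# actual DAPG+PPO configs.
agent_cfg = dict(
    batch_size=300,            # samples per minibatch
    num_epoch=2,               # PPO update epochs per iteration
    critic_warmup_epoch=4,     # critic warm-up epochs
)
train_cfg = dict(
    total_steps=int(25e6),     # total environment steps
    n_steps=int(2e4),          # samples collected per PPO iteration
    n_checkpoint=int(1e6),     # save a model checkpoint every 1e6 steps
    n_eval=int(25e6),          # evaluate only once, at the very end of training
)
replay_cfg = dict(
    type="ReplayMemory",       # assumed buffer type
    capacity=int(2e4),         # assumed to match n_steps
)
```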
Also, I want to use a demonstration buffer with dynamic loading; the configuration I got from the Readme is sketched below. Do I need to make changes to any of its hyperparameters, like capacity and cache_size, because I am training on 5 GPUs? Does the capacity of the demonstration replay buffer need to match the capacity of the experience replay buffer?
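Roughly, it has the following shape (reproduced only approximately here; the path in buffer_filenames is a placeholder, and the exact keys and values should be taken from the Readme itself):

```python
# Approximate shape of the dynamically-loaded demonstration buffer config from
# the Readme; keys and values here are illustrative rather than copied verbatim.
demo_replay_cfg = dict(
    type="ReplayMemory",
    capacity=int(2e4),       # demonstration buffer capacity
    cache_size=int(2e4),     # number of transitions kept in memory at a time
    dynamic_loading=True,    # stream demonstrations from disk instead of preloading
    num_samples=-1,          # use all demonstrations in the file
    keys=["obs", "actions", "dones", "episode_dones"],
    buffer_filenames=["PATH_TO_DEMO_TRAJECTORIES.h5"],  # placeholder path
)
```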
The Readme also provides some information on this. I am wondering whether it already names all of the hyperparameters that are affected by simulation and training parallelism.