Truncated signal not treated as end of episode in PPO agent #224

Open
asdfGuest opened this issue Nov 13, 2024 · 0 comments
Labels
bug Something isn't working

Description

I have discovered that the PPO agent does not treat a truncation signal as the end of an episode.
In the current code, the truncation signal only triggers value bootstrapping; it does not mark an episode boundary.
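To make the expected behavior concrete, here is a minimal pure-Python sketch of a GAE computation in which truncation both bootstraps from the value of the next state and stops the advantage recursion. It is not skrl's code: the function name `gae_with_truncation`, its parameters, and the convention that `next_values[t]` holds the value of the state reached at step `t` before any reset are all assumptions made for illustration.

```python
def gae_with_truncation(rewards, values, next_values, terminated, truncated,
                        discount=0.99, lam=0.95):
    """Illustrative GAE over a flat rollout (hypothetical helper, not skrl's API).

    next_values[t] is assumed to be V(s_{t+1}) for the state actually reached
    at step t, i.e. before the environment is reset.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # Bootstrap from V(s_{t+1}) unless the episode truly terminated.
        bootstrap = 0.0 if terminated[t] else next_values[t]
        delta = rewards[t] + discount * bootstrap - values[t]
        # Stop the recursion at any episode boundary, truncation included.
        episode_over = terminated[t] or truncated[t]
        running = delta + (0.0 if episode_over else discount * lam * running)
        advantages[t] = running
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


# Four steps with reward 1 and zero value estimates; truncation after step 1.
adv_cut, _ = gae_with_truncation(
    [1.0] * 4, [0.0] * 4, [0.0] * 4,
    terminated=[False] * 4,
    truncated=[False, True, False, False],
    discount=1.0, lam=1.0,
)
# Same rollout, but the truncation is not treated as an episode boundary.
adv_no_cut, _ = gae_with_truncation(
    [1.0] * 4, [0.0] * 4, [0.0] * 4,
    terminated=[False] * 4,
    truncated=[False] * 4,
    discount=1.0, lam=1.0,
)
```

With the boundary respected, the advantages restart after the truncated step; without it, rewards from the following episode leak into the earlier one.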

To confirm this, I tested the agent with two custom cartpole environments:

  1. Environment A: sends a termination signal only when the state of the cartpole is unstable.
  2. Environment B: sends a termination signal both when the state is unstable and when a truncation occurs.

Both environments have a very short maximum episode length (10 time steps) so that the effect of truncation is clearly visible.
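The difference between the two setups can be sketched as follows. The helper names, the `unstable` flag, and the assumption that environment B still reports `truncated` alongside the extra termination are hypothetical; the actual environments are not shown in this report.

```python
MAX_STEPS = 10  # maximum episode length used by both environments


def env_a_signals(unstable, step, max_steps=MAX_STEPS):
    """Environment A: termination only for an unstable cartpole state."""
    terminated = unstable
    truncated = step >= max_steps
    return terminated, truncated


def env_b_signals(unstable, step, max_steps=MAX_STEPS):
    """Environment B: truncation is additionally reported as termination."""
    truncated = step >= max_steps
    terminated = unstable or truncated
    return terminated, truncated
```

At step 10 with a stable pole, `env_a_signals(False, 10)` reports only a truncation, while `env_b_signals(False, 10)` reports a termination as well.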

I could not find a meaningful performance difference, but the value loss was very different.
[Figure: value loss. A: purple, B: green]

As you can see, A has a much larger value loss than B.
I reason that this is because A correctly performs 10-step bootstrapping, so propagation through the value network is slow and the errors stay smaller (also considering that the initial output of the network is almost zero).
B, however, performs a longer n-step bootstrapping that spans multiple episodes, leading to relatively larger errors.
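The size of this effect can be illustrated with a small, self-contained calculation. It assumes a reward of 1 per step, a value estimate of exactly zero everywhere (standing in for a freshly initialized network), and no discounting; the helper `return_targets` is hypothetical and only mimics the shape of a GAE-style recursion.

```python
def return_targets(rewards, values, next_values, boundaries,
                   discount=1.0, lam=1.0):
    """Value targets from a GAE-style recursion (illustrative helper).

    boundaries[t] is True when the recursion must be cut after step t.
    """
    targets = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + discount * next_values[t] - values[t]
        running = delta + (0.0 if boundaries[t] else discount * lam * running)
        targets[t] = running + values[t]
    return targets


steps, horizon = 30, 10          # three back-to-back episodes of 10 steps
rewards = [1.0] * steps
zeros = [0.0] * steps            # assumed: critic outputs are ~0 at the start

cut = [(t + 1) % horizon == 0 for t in range(steps)]  # respect episode ends
no_cut = [False] * steps                              # ignore episode ends

max_cut = max(return_targets(rewards, zeros, zeros, cut))
max_no_cut = max(return_targets(rewards, zeros, zeros, no_cut))
```

When the recursion is cut at every episode end, no target exceeds the episode length of 10; when it is not, the targets keep growing with the rollout length (30 here), so the squared error against a near-zero value estimate is much larger.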

What skrl version are you using?

1.3.0

What ML framework/library version are you using?

PyTorch 2.4.0+cu118

Additional system information

Python 3.10.15, Windows 11

@asdfGuest asdfGuest added the bug Something isn't working label Nov 13, 2024