Hey Eric,
thanks for making this public! I haven't found a good environment so far that implements multiplayer No-Limit. Am I understanding the code correctly that the observation space isn't actually perfect information, i.e. that it only contains the last few actions? Do you have any research on how this affects convergence?
I had a bit of trouble understanding the code, so I apologize if I simply misread it.
You're welcome! I've written a wrapper that tracks the action history for limit games. For no-limit games this is less straightforward, so I would default to the recurrent option, which tracks the history of public observations and is also sound. Perfect recall is therefore still supported for NL through recurrent NNs, but you correctly spotted that the action history is only tracked implicitly that way, not explicitly.
If you want to track it explicitly, you could either do so manually in your training code or write an appropriate "Wrapper" for the NL environment. However, that requires adjusting the NN's observation space accordingly. TL;DR: recurrent is cleaner and more scalable.
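To make the explicit-tracking option concrete, here is a minimal sketch of such a wrapper. All names and the `(obs, reward, done, info)` step interface are illustrative assumptions, not PokerRL's actual API: it simply appends a one-hot encoding of the last few discrete actions to each observation, which is why the NN's observation space has to grow accordingly.

```python
import collections
import numpy as np


class ActionHistoryWrapper:
    """Hypothetical sketch (not the env's real API): wraps an env whose
    step(action) returns (obs, reward, done, info) and appends a one-hot
    encoding of the last `history_len` discrete actions to the observation."""

    def __init__(self, env, history_len=16, n_actions=3):
        self.env = env
        self.history_len = history_len
        self.n_actions = n_actions  # e.g. fold / call / raise buckets
        self._history = collections.deque(maxlen=history_len)

    def _augment(self, obs):
        # Flattened one-hot block: history_len * n_actions extra features,
        # zero-padded until the history buffer fills up.
        hist = np.zeros(self.history_len * self.n_actions, dtype=np.float32)
        for i, a in enumerate(self._history):
            hist[i * self.n_actions + a] = 1.0
        return np.concatenate([np.asarray(obs, dtype=np.float32), hist])

    def reset(self):
        self._history.clear()
        return self._augment(self.env.reset())

    def step(self, action):
        self._history.append(int(action))
        obs, reward, done, info = self.env.step(action)
        return self._augment(obs), reward, done, info
```

Note the trade-off this illustrates: the observation grows by `history_len * n_actions` features and the history window is fixed, whereas a recurrent network can summarize an unbounded public-observation history, which is why the recurrent route scales better for no-limit games.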