Hey Eric,
thanks for making this public! I haven't found a good environment so far that implements multiplayer No-Limit. Am I understanding the code correctly that the observation space isn't actually perfect information, i.e. that it only contains the last few actions? Do you have any research on how this affects convergence?
I had a bit of trouble understanding the code, so I apologize if I simply misread it.
You're welcome! I've written a wrapper that tracks the action history for limit games. For no-limit games this is less straightforward, so I would default to the recurrent option, which tracks the history of public observations and is also sound. Perfect recall is therefore still supported for NL through recurrent NNs, but you correctly spotted that the action history is only tracked implicitly that way, not explicitly.
If you want to track it explicitly, you could either do so manually in your training code or write an appropriate "Wrapper" for the NL environment. However, that requires adjusting the NN's observation space accordingly. TL;DR: recurrent is cleaner and more scalable.
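To make the explicit-tracking option concrete, here is a minimal sketch of such a wrapper. All names and the `(obs, reward, done, info)` step interface are illustrative assumptions, not PokerRL's actual API: it simply appends a one-hot encoding of the last few discrete actions to each observation, which is why the NN's observation space has to grow accordingly.

```python
import collections
import numpy as np


class ActionHistoryWrapper:
    """Hypothetical sketch (not the env's real API): wraps an env whose
    step(action) returns (obs, reward, done, info) and appends a one-hot
    encoding of the last `history_len` discrete actions to the observation."""

    def __init__(self, env, history_len=16, n_actions=3):
        self.env = env
        self.history_len = history_len
        self.n_actions = n_actions  # e.g. fold / call / raise buckets
        self._history = collections.deque(maxlen=history_len)

    def _augment(self, obs):
        # Flattened one-hot block: history_len * n_actions extra features,
        # zero-padded until the history buffer fills up.
        hist = np.zeros(self.history_len * self.n_actions, dtype=np.float32)
        for i, a in enumerate(self._history):
            hist[i * self.n_actions + a] = 1.0
        return np.concatenate([np.asarray(obs, dtype=np.float32), hist])

    def reset(self):
        self._history.clear()
        return self._augment(self.env.reset())

    def step(self, action):
        self._history.append(int(action))
        obs, reward, done, info = self.env.step(action)
        return self._augment(obs), reward, done, info
```

Note the trade-off this illustrates: the observation grows by `history_len * n_actions` features and the history window is fixed, whereas a recurrent network can summarize an unbounded public-observation history, which is why the recurrent route scales better for no-limit games.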