
Soft Actor Critic (SAC)

Seungjae Ryan Lee edited this page Mar 30, 2019 · 2 revisions

SAC (Haarnoja et al., 2018a) incorporates maximum entropy reinforcement learning, where the agent's goal is to maximize expected reward and policy entropy concurrently. Combined with TD3, SAC achieves state-of-the-art performance in various continuous control tasks. SAC has been extended to allow automatic tuning of the temperature parameter (Haarnoja et al., 2018b), which determines the importance of entropy relative to the expected reward.
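
To make the role of the temperature concrete, here is a minimal NumPy sketch of two quantities that appear in SAC. It is an illustration under stated assumptions, not this repository's implementation: the function names are hypothetical, "combined with TD3" is assumed to mean taking the minimum of two critic estimates (clipped double-Q), and the temperature is assumed to be parametrized as alpha = exp(log_alpha), a common but not mandatory choice.

```python
import numpy as np


def soft_state_value(q1, q2, log_prob, alpha):
    """Entropy-regularized value estimate for sampled actions.

    Takes the element-wise minimum of two critic estimates (the clipped
    double-Q idea from TD3, assumed here) and adds an entropy bonus
    of -alpha * log_prob.
    """
    q1, q2, log_prob = map(np.asarray, (q1, q2, log_prob))
    return np.minimum(q1, q2) - alpha * log_prob


def temperature_loss(log_alpha, log_prob, target_entropy):
    """Loss minimized when tuning the temperature automatically.

    Implements the batch mean of -alpha * (log_prob + target_entropy),
    with alpha = exp(log_alpha) (an assumed parametrization).
    """
    alpha = np.exp(log_alpha)
    log_prob = np.asarray(log_prob)
    return float(np.mean(-alpha * (log_prob + target_entropy)))


if __name__ == "__main__":
    values = soft_state_value([1.0, 2.0], [0.5, 3.0], [-1.0, -2.0], alpha=0.2)
    print(values)
    print(temperature_loss(0.0, [-1.0, -2.0], target_entropy=-1.0))
```

A larger alpha gives the entropy term more weight relative to the critic estimate. In the temperature loss, alpha is pushed up when the policy's entropy (the mean of -log_prob) falls below the target and pushed down when it exceeds the target.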
