liziniu

Highlights

  • Pro

Pinned

  1. ReMax (Public)

    Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)

    Python · 154 stars · 13 forks

  2. policy_optimization (Public)

    Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)

    Python · 23 stars · 5 forks

  3. HyperDQN (Public)

    Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)

    Python · 11 stars · 1 fork

  4. ISWBC (Public)

    Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)

    Python · 7 stars