Skip to content

A collection of deep reinforcement learning algorithm implementations, including DQN, A3C, PPO, and more, for research and experimentation on classic control tasks.

License

Notifications You must be signed in to change notification settings

ml-dev-world/deep-reinforcement-learning-algorithms

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Deep Reinforcement Learning Algorithms

A comprehensive collection of deep reinforcement learning (DRL) algorithm implementations, including DQN, A3C, PPO, and more, designed for research, experimentation, and educational purposes. This repository aims to provide accessible, well-structured code and resources for DRL enthusiasts, researchers, and developers working on classic control tasks.

🚀 Features

  • Multiple DRL Algorithms: Implementations of popular algorithms such as DQN, A3C, PPO, SAC, and others.
  • Environment Compatibility: Supports a range of classic control environments like CartPole, LunarLander, MountainCar, etc.
  • Modular Codebase: Each algorithm is implemented as a standalone module, making it easy to understand, modify, and extend.
  • Logging & Visualization: Track training progress and visualize results with integrated logging.
  • Flexible Training Parameters: Easily adjust hyperparameters for each algorithm to optimize performance or experiment with custom settings.

📄 Table of Contents

🎓 Getting Started

Deep reinforcement learning combines deep learning with reinforcement learning, allowing agents to learn how to make sequences of decisions from raw sensory input. This repository offers code for several well-known DRL algorithms. The goal is to provide accessible implementations that are easy to integrate, experiment with, and use as a reference for research and educational projects.

🧠 Algorithms Implemented

The following algorithms are currently implemented:

  • DQN (Deep Q-Network)
  • DDPG (Deep Deterministic Policy Gradient)
  • A3C (Asynchronous Advantage Actor-Critic)
  • PPO (Proximal Policy Optimization)
  • SAC (Soft Actor-Critic)
  • TD3 (Twin Delayed DDPG)

Each algorithm includes a separate module that can be run and modified independently.

📦 Installation

Clone this repository:

git clone https://github.com/yourusername/deep-reinforcement-learning-algorithms.git
cd deep-reinforcement-learning-algorithms

Install the required dependencies:

pip install -r requirements.txt

Note: This codebase requires Python 3.8+ and has been tested with PyTorch 1.8+ and Gym 0.26+.

⚙️ Usage

To start training an agent, select the desired algorithm and environment.

Running a DQN Agent on LunarLander

python train.py --algorithm DQN --env LunarLander-v2

Training Options

You can configure hyperparameters and other settings via command-line arguments. For example:

python train.py --algorithm PPO --env CartPole-v1 --learning_rate 0.0003 --batch_size 64

For more details on command-line arguments, refer to the CLI Options section.

📊 Results & Visualization

Each training run logs data for rewards, loss, and other metrics. These can be visualized using tools like TensorBoard.

  1. TensorBoard:

    tensorboard --logdir runs
  2. Video Rendering: Episodes are recorded (if enabled) and saved in the videos/ folder, so you can watch the agent’s performance.

  3. Sample Output: Below is a sample performance graph of PPO on CartPole-v1: Sample Performance Graph

🌌 Examples

Here are some example commands to get started with different algorithms:

  • Train DQN on LunarLander-v2:
    python train.py --algorithm DQN --env LunarLander-v2
  • Train PPO on MountainCar-v0:
    python train.py --algorithm PPO --env MountainCar-v0
  • Train A3C on CartPole-v1:
    python train.py --algorithm A3C --env CartPole-v1

🛤️ Roadmap

We plan to add more algorithms, features, and environments over time. Upcoming features include:

  • Implement Rainbow DQN
  • Add support for Atari environments
  • Integrate advanced logging options
  • Provide pre-trained model checkpoints

🤝 Contributing

Contributions are welcome! If you'd like to contribute, please follow these steps:

  1. Fork the project.
  2. Create a new branch (git checkout -b feature/YourFeature).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add YourFeature').
  5. Push to the branch (git push origin feature/YourFeature).
  6. Open a pull request.

For major changes, please open an issue first to discuss what you’d like to change.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

📬 Contact

If you have any questions, feel free to reach out:

About

A collection of deep reinforcement learning algorithm implementations, including DQN, A3C, PPO, and more, for research and experimentation on classic control tasks.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages