
Commit 0a8a21d: Add references
cr-xu committed Feb 4, 2024
1 parent 7e1cbc5
Showing 1 changed file with 37 additions and 12 deletions: tutorial.ipynb
@@ -13,7 +13,7 @@
"\n",
"<h3 style=\"text-align: center; vertical-align: middle;\">Implementation example for the RL4AA'24 workshop</h3>\n",
"\n",
"<p style=\"text-align: center\">Simon Hirländer, Jan Kaiser, Chenran Xu, Andrea Santamaria Garcia</p>"
"<p style=\"text-align: center\">Simon Hirlaender, Jan Kaiser, Chenran Xu, Andrea Santamaria Garcia</p>"
]
},
{
@@ -604,17 +604,17 @@
" sample 8 tasks\n",
" for t in tasks:\n",
" for i in num_steps:\n",
" for fb in fast_batch_size:\n",
" perform 1 episode:\n",
" for fast_batch in fast_batch_size:\n",
" rollout 1 episode:\n",
" reset corrector_strength\n",
" while not stopped:\n",
" env.step()\n",
"```\n",
"We have gathered experience and trained 8 task policies:\n",
"$$\n",
"\\varphi_{0}^1 \\rightarrow \\varphi_{k}^1 \\\\\n",
"\\vdots \\\\\n",
"\\varphi_{0}^8 \\rightarrow \\varphi_{k}^8 \\\\\n",
"\\varphi_{0}^1 \\rightarrow \\varphi_{k}^1$$\n",
"$$\\vdots$$\n",
"$$\\varphi_{0}^8 \\rightarrow \\varphi_{k}^8\n",
"$$\n",
"\n",
"The losses from the task policies are summed, and gradient descent is applied to update the meta policy $\\phi_0 \\rightarrow \\phi_1$"
@@ -819,7 +819,7 @@
"source": [
"<h2 style=\"color: #b51f2a\">Further Resources</h2>\n",
"\n",
"### Papers\n",
"### Papers about RL in Particle Accelerators and Large-Scale Facilities\n",
"\n",
" - [Learning-based optimisation of particle accelerators under partial observability without real-world training](https://proceedings.mlr.press/v162/kaiser22a.html) - Tuning of electron beam properties on a diagnostic screen using RL.\n",
" - [Sample-efficient reinforcement learning for CERN accelerator control](https://journals.aps.org/prab/abstract/10.1103/PhysRevAccelBeams.23.124801) - Beam trajectory steering using RL with a focus on sample-efficient training.\n",
@@ -839,15 +839,40 @@
"source": [
"<h2 style=\"color: #b51f2a\">Further Resources</h2>\n",
"\n",
"### Literature\n",
"### RL Books\n",
" \n",
" - [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/the-book.html) - Standard text book on RL.\n",
" - R. S. Sutton, Reinforcement learning, Second edition. in Adaptive computation and machine learning. Cambridge, Massachusetts: The MIT Press, 2020 [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/the-book.html)\n",
" - A. Agarwal, N. Jiang, S. M. Kakade, W. Sun: Reinforcement Learning: Theory and Algorithms, 2022 [https://rltheorybook.github.io/](https://rltheorybook.github.io/)\n",
" - K. P. Murphy, Probabilistic Machine Learning: An introduction. MIT Press, 2022. [https://probml.github.io/pml-book/book1.html](https://probml.github.io/pml-book/book1.html)\n",
" - K. P. Murphy, Probabilistic Machine Learning: Advanced Topics. MIT Press, 2023. [http://probml.github.io/book2](http://probml.github.io/book2)\n",
"\n",
"### Packages\n",
" - [Gym](https://www.gymlibrary.ml) - Defacto standard for implementing custom environments. Also provides a library of RL tasks widely used for benchmarking.\n",
" - [Stable Baslines3](https://github.com/DLR-RM/stable-baselines3) - Provides reliable, benchmarked and easy-to-use implementations of the most important RL algorithms.\n",
" - [Gymnasium](https://gymnasium.farama.org/index.html) - Defacto standard for implementing custom environments. Also provides a library of RL tasks widely used for benchmarking.\n",
" - [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3) - Provides reliable, benchmarked and easy-to-use implementations of the most important RL algorithms.\n",
" - [Ray RLlib](https://docs.ray.io/en/latest/rllib/index.html) - Part of the *Ray* Python package providing implementations of various RL algorithms with a focus on distributed training."
]
},
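Since both environment-related entries above assume Gymnasium's `Env` interface, here is a minimal custom-environment sketch: a hypothetical toy task (a 1-D "beam position" nudged toward zero by corrector kicks, invented here for illustration and not the tutorial's actual environment) that shows the `reset`/`step` contract Stable Baselines3 consumes.

```python
# Hypothetical toy environment illustrating the Gymnasium Env interface.
import gymnasium as gym
import numpy as np


class ToyCorrectorEnv(gym.Env):
    """Steer a 1-D 'beam position' toward zero with small corrector kicks."""

    def __init__(self):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-0.1, 0.1, shape=(1,), dtype=np.float32)
        self.position = np.zeros(1, dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        self.position = self.np_random.uniform(-1.0, 1.0, size=1).astype(np.float32)
        return self.position.copy(), {}

    def step(self, action):
        self.position = np.clip(self.position + action, -1.0, 1.0).astype(np.float32)
        reward = -float(abs(self.position[0]))           # closer to zero is better
        terminated = bool(abs(self.position[0]) < 0.01)  # goal reached
        return self.position.copy(), reward, terminated, False, {}


# Hypothetical usage with Stable Baselines3:
# from stable_baselines3 import PPO
# model = PPO("MlpPolicy", ToyCorrectorEnv()).learn(total_timesteps=10_000)
```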
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
"### Courses Online\n",
"\n",
"- Chelsea Finn (Berkley): [Deep Multi-Task and Meta Learning](https://cs330.stanford.edu/)\n",
"- Sergey Levine (Berkley): [Deep Reinforcement Learning](http://rail.eecs.berkeley.edu/deeprlcourse/)\n",
"- Emma Brunskill (Stanford): [Reinforcement Learning](https://web.stanford.edu/class/cs234/index.html)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
],
"metadata": {
Expand All @@ -867,7 +892,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
"version": "3.10.13"
},
"vscode": {
"interpreter": {
