
Commit 0a8a21d: Add references
cr-xu committed Feb 4, 2024
1 parent 7e1cbc5
Showing 1 changed file with 37 additions and 12 deletions: tutorial.ipynb
@@ -13,7 +13,7 @@
"\n",
"<h3 style=\"text-align: center; vertical-align: middle;\">Implementation example for the RL4AA'24 workshop</h3>\n",
"\n",
"<p style=\"text-align: center\">Simon Hirländer, Jan Kaiser, Chenran Xu, Andrea Santamaria Garcia</p>"
"<p style=\"text-align: center\">Simon Hirlaender, Jan Kaiser, Chenran Xu, Andrea Santamaria Garcia</p>"
]
},
{
@@ -604,17 +604,17 @@
" sample 8 tasks\n",
" for t in tasks:\n",
" for i in num_steps:\n",
" for fb in fast_batch_size:\n",
" perform 1 episode:\n",
" for fast_batch in fast_batch_size:\n",
" rollout 1 episode:\n",
" reset corrector_strength\n",
" while not stopped:\n",
" env.step()\n",
"```\n",
"We have gathered experience and trained 8 task policies:\n",
"$$\n",
"\\varphi_{0}^1 \\rightarrow \\varphi_{k}^1 \\\\\n",
"\\vdots \\\\\n",
"\\varphi_{0}^8 \\rightarrow \\varphi_{k}^8 \\\\\n",
"\\varphi_{0}^1 \\rightarrow \\varphi_{k}^1$$\n",
"$$\\vdots$$\n",
"$$\\varphi_{0}^8 \\rightarrow \\varphi_{k}^8\n",
"$$\n",
"\n",
"The losses from the task policies are summed, and gradient descent is applied to update the meta policy $\\phi_0 \\rightarrow \\phi_1$"
@@ -819,7 +819,7 @@
"source": [
"<h2 style=\"color: #b51f2a\">Further Resources</h2>\n",
"\n",
"### Papers\n",
"### Papers about RL in Particle Accelerators and Large-Scale Facilities\n",
"\n",
" - [Learning-based optimisation of particle accelerators under partial observability without real-world training](https://proceedings.mlr.press/v162/kaiser22a.html) - Tuning of electron beam properties on a diagnostic screen using RL.\n",
" - [Sample-efficient reinforcement learning for CERN accelerator control](https://journals.aps.org/prab/abstract/10.1103/PhysRevAccelBeams.23.124801) - Beam trajectory steering using RL with a focus on sample-efficient training.\n",
@@ -839,15 +839,40 @@
"source": [
"<h2 style=\"color: #b51f2a\">Further Resources</h2>\n",
"\n",
"### Literature\n",
"### RL Books\n",
" \n",
" - [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/the-book.html) - Standard text book on RL.\n",
" - R. S. Sutton, Reinforcement learning, Second edition. in Adaptive computation and machine learning. Cambridge, Massachusetts: The MIT Press, 2020 [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/the-book.html)\n",
" - A. Agarwal, N. Jiang, S. M. Kakade, W. Sun: Reinforcement Learning: Theory and Algorithms, 2022 [https://rltheorybook.github.io/](https://rltheorybook.github.io/)\n",
" - K. P. Murphy, Probabilistic Machine Learning: An introduction. MIT Press, 2022. [https://probml.github.io/pml-book/book1.html](https://probml.github.io/pml-book/book1.html)\n",
" - K. P. Murphy, Probabilistic Machine Learning: Advanced Topics. MIT Press, 2023. [http://probml.github.io/book2](http://probml.github.io/book2)\n",
"\n",
"### Packages\n",
" - [Gym](https://www.gymlibrary.ml) - Defacto standard for implementing custom environments. Also provides a library of RL tasks widely used for benchmarking.\n",
" - [Stable Baslines3](https://github.com/DLR-RM/stable-baselines3) - Provides reliable, benchmarked and easy-to-use implementations of the most important RL algorithms.\n",
" - [Gymnasium](https://gymnasium.farama.org/index.html) - Defacto standard for implementing custom environments. Also provides a library of RL tasks widely used for benchmarking.\n",
" - [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3) - Provides reliable, benchmarked and easy-to-use implementations of the most important RL algorithms.\n",
" - [Ray RLlib](https://docs.ray.io/en/latest/rllib/index.html) - Part of the *Ray* Python package providing implementations of various RL algorithms with a focus on distributed training."
]
},
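Since both environment-related entries above assume Gymnasium's `Env` interface, here is a minimal custom-environment sketch: a hypothetical toy task (a 1-D "beam position" nudged toward zero by corrector kicks, invented here for illustration and not the tutorial's actual environment) that shows the `reset`/`step` contract Stable Baselines3 consumes.

```python
# Hypothetical toy environment illustrating the Gymnasium Env interface.
import gymnasium as gym
import numpy as np


class ToyCorrectorEnv(gym.Env):
    """Steer a 1-D 'beam position' toward zero with small corrector kicks."""

    def __init__(self):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-0.1, 0.1, shape=(1,), dtype=np.float32)
        self.position = np.zeros(1, dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        self.position = self.np_random.uniform(-1.0, 1.0, size=1).astype(np.float32)
        return self.position.copy(), {}

    def step(self, action):
        self.position = np.clip(self.position + action, -1.0, 1.0).astype(np.float32)
        reward = -float(abs(self.position[0]))           # closer to zero is better
        terminated = bool(abs(self.position[0]) < 0.01)  # goal reached
        return self.position.copy(), reward, terminated, False, {}


# Hypothetical usage with Stable Baselines3:
# from stable_baselines3 import PPO
# model = PPO("MlpPolicy", ToyCorrectorEnv()).learn(total_timesteps=10_000)
```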
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
"### Courses Online\n",
"\n",
"- Chelsea Finn (Berkley): [Deep Multi-Task and Meta Learning](https://cs330.stanford.edu/)\n",
"- Sergey Levine (Berkley): [Deep Reinforcement Learning](http://rail.eecs.berkeley.edu/deeprlcourse/)\n",
"- Emma Brunskill (Stanford): [Reinforcement Learning](https://web.stanford.edu/class/cs234/index.html)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
],
"metadata": {
Expand All @@ -867,7 +892,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
"version": "3.10.13"
},
"vscode": {
"interpreter": {
