From 175e6dc224e330fb18bfc6444e8fba53792ab500 Mon Sep 17 00:00:00 2001
From: ifilot <ivo@ivofilot.nl>
Date: Thu, 24 Aug 2023 12:11:46 +0200
Subject: [PATCH 1/9] Improving installation instructions

---
 docs/installation.rst | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/docs/installation.rst b/docs/installation.rst
index 27ab8e6..59278ea 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -11,6 +11,7 @@ available to you:
 * `Eigen3 <https://eigen.tuxfamily.org>`_ (matrix algebra)
 * `Boost <https://www.boost.org/>`_ (common routines)
 * `TCLAP <https://tclap.sourceforge.net/>`_ (command line instruction library)
+* `CMake <https://cmake.org/>`_ (build tool)
 
 On Debian-based operating systems, one can run the following::
 
@@ -21,6 +22,9 @@ On Debian-based operating systems, one can run the following::
    is to use `Debian for Windows Subsystem for Linux (WSL) <https://apps.microsoft.com/store/detail/debian/9MSVKQC78PK6>`_.
    The compilation instructions below can be readily used.
 
+.. warning::
+   In order to compile for GPU using CUDA, one needs Eigen3 version **3.4.0** or higher.
+
 Compilation
 -----------
 
@@ -47,8 +51,8 @@ CUDA support
    version 2, you can use CUDA from the Linux environment under Windows.
    Detailed instructions are given `here <https://docs.nvidia.com/cuda/wsl-user-guide/index.html>`_.
 
-The similarity analysis functionality of :program:`Bramble` significantly
-benefits from the availability of a graphical card. To compile :program:`Bramble`
+The similarity analysis functionality of :program:`Bramble` can
+benefit from the availability of a graphical card. To compile :program:`Bramble`
 with CUDA support, run CMake with::
 
     cmake ../src -DMOD_CUDA=1 -DCUDA_ARCH=<ARCH>
@@ -56,7 +60,7 @@ with CUDA support, run CMake with::
 wherein `<ARCH>` is replaced with the architecture of your graphical card. For
 example, if you use an RTX 4090, you would use ``-DCUDA_ARCH=sm_89``. To
 test that :program:`Bramble` can use your GPU, you can run the ``bramblecuda``
-tool::
+tool whose sole function is to test for the availability of a GPU on the system::
 
     ./bramblecuda
 
@@ -79,9 +83,12 @@ Typical output should look as follows::
       Peak Memory Bandwidth (GB/s): 1008.1
 
 .. note::
-   There is currently no support for using multiple GPUs. :program:`Bramble`
-   automatically selects the first GPU available and executes the code on this
-   GPU. Multi-GPU support is however in development.
+   * There is currently no support for using multiple GPUs. :program:`Bramble`
+     automatically selects the first GPU available and executes the code on this
+     GPU. Multi-GPU support is however in development.
+   * The ``bramblecuda`` tool only shows information on the GPUs available on
+     your system. The actual GPU-accelerated calculation is still handled by
+     the ``bramble`` executable.
 
 Testing
 -------

From ca2039597c8e985b7f2b6d480afdb2c97f6cf17a Mon Sep 17 00:00:00 2001
From: Ivo Filot <ivo@ivofilot.nl>
Date: Sat, 26 Aug 2023 21:32:43 +0200
Subject: [PATCH 2/9] Adding execution times

---
 docs/examples.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/docs/examples.rst b/docs/examples.rst
index bbfcef2..c86ac8f 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -163,3 +163,20 @@ to bulk atoms, :math:`\mu_{ij} \approx 36` is found.
 
 .. figure:: _static/img/similarity_analysis_co1121.png
     :align: center
+
+Execution times
+***************
+
+To get an impression of typical execution times and the benefit of GPU
+acceleration, we refer to the Table as seen below.
+
+.. list-table:: Execution times for the Co HCP 11-21
+   :widths: 50 50
+   :header-rows: 1
+
+   * - System
+     - Execution time (averaged)
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
+     - 172.58s
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz + RTX 4090
+     - 90.18s

From 22e7d17fc52fa96bf68b4700abd41b5b9ce9ca9a Mon Sep 17 00:00:00 2001
From: Ivo Filot <ivo@ivofilot.nl>
Date: Sat, 26 Aug 2023 21:57:21 +0200
Subject: [PATCH 3/9] Migrating execution times

---
 .gitignore        |  2 ++
 docs/examples.rst | 35 ++++++++++++++++++-----------------
 2 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/.gitignore b/.gitignore
index 1219a46..4765fcd 100644
--- a/.gitignore
+++ b/.gitignore
@@ -39,3 +39,5 @@ examples/data
 # Sphinx docs
 docs/_build/
 docs/userguide/build*
+pa_*.txt
+sa_*.txt
diff --git a/docs/examples.rst b/docs/examples.rst
index c86ac8f..b611b1c 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -61,10 +61,26 @@ between the surface and bulk atoms amounts to :math:`\mu_{ij} = 30.8`.
 .. figure:: _static/img/similarity_analysis_rh111.png
     :align: center
 
+Execution times
+***************
+
+To get an impression of typical execution times and the benefit of GPU
+acceleration, see the table below.
+
+.. list-table:: Execution times for the Rh FCC111 example
+   :widths: 50 50
+   :header-rows: 1
+
+   * - System
+     - Execution time (averaged)
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
+     - 172.58s
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz + RTX 4090
+     - 90.18s
+
 Co HCP 11-21
 ------------
 
-
 The following code is used to run this example::
 
      ./build/bramble -p patterns/patterns.json -i src/test/data/POSCAR_Co1121 -o pa_co1121.txt
@@ -146,7 +162,7 @@ atoms are automatically recognized.
 
 Continuing the study by  performing a similarity analysis by running::
 
-    ./build/bramble -s -i src/test/data/POSCAR_Rh111 -o sa_fcc111.txt
+    ./build/bramble -s -i src/test/data/POSCAR_Co1121 -o sa_fcc111.txt
 
 yields the result as shown in the image below. Comparing the image with the
 CNA pattern per atom above, we can readily interpret this result. The light
@@ -164,19 +180,4 @@ to bulk atoms, :math:`\mu_{ij} \approx 36` is found.
 .. figure:: _static/img/similarity_analysis_co1121.png
     :align: center
 
-Execution times
-***************
-
-To get an impression of typical execution times and the benefit of GPU
-acceleration, we refer to the Table as seen below.
-
-.. list-table:: Execution times for the Co HCP 11-21
-   :widths: 50 50
-   :header-rows: 1
 
-   * - System
-     - Execution time (averaged)
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
-     - 172.58s
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz + RTX 4090
-     - 90.18s

From 0a9315e8466d6795521cfabccacb87f40fd143c0 Mon Sep 17 00:00:00 2001
From: Ivo Filot <ivo@ivofilot.nl>
Date: Sun, 27 Aug 2023 10:19:57 +0200
Subject: [PATCH 4/9] Adding calculation times for Co1121 example

---
 docs/examples.rst | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/docs/examples.rst b/docs/examples.rst
index b611b1c..5d3705c 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -73,9 +73,9 @@ acceleration, we refer to the Table as seen below.
 
    * - System
      - Execution time (averaged)
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads)
      - 172.58s
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz + RTX 4090
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
      - 90.18s
 
 Co HCP 11-21
@@ -180,4 +180,19 @@ to bulk atoms, :math:`\mu_{ij} \approx 36` is found.
 .. figure:: _static/img/similarity_analysis_co1121.png
     :align: center
 
+Execution times
+***************
+
+To get an impression of typical execution times and the benefit of GPU
+acceleration, see the table below.
 
+.. list-table:: Execution times for the Co HCP 11-21 example
+   :widths: 50 50
+   :header-rows: 1
+
+   * - System
+     - Execution time (averaged)
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads)
+     - 2368.63s
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
+     - 5207.93s

From a00b16ffab4e3102ca5400376209b725db7562e4 Mon Sep 17 00:00:00 2001
From: Ivo Filot <ivo@ivofilot.nl>
Date: Sun, 27 Aug 2023 10:22:07 +0200
Subject: [PATCH 5/9] Add notation execution times

---
 docs/examples.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/examples.rst b/docs/examples.rst
index 5d3705c..cbb7856 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -193,6 +193,6 @@ acceleration, we refer to the Table as seen below.
    * - System
      - Execution time (averaged)
    * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads)
-     - 2368.63s
+     - 2368.63s (39m28s)
    * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
-     - 5207.93s
+     - 5207.93s (1h26m47s)

From 0af2fc5b6862c3d54a7b5ae9bbc62558567cd45f Mon Sep 17 00:00:00 2001
From: ifilot <ivo@ivofilot.nl>
Date: Mon, 4 Sep 2023 19:42:46 +0200
Subject: [PATCH 6/9] Expanding documentation

---
 docs/examples.rst        |  8 +++++--
 docs/execution_model.rst | 51 ++++++++++++++++++++++++++++++++++++++++
 docs/index.rst           |  1 +
 docs/publications.rst    |  4 ++++
 docs/user_interface.rst  | 15 ++++++++++++
 5 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 docs/execution_model.rst

diff --git a/docs/examples.rst b/docs/examples.rst
index cbb7856..99fbf2d 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -77,6 +77,10 @@ acceleration, we refer to the Table as seen below.
      - 172.58s
    * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
      - 90.18s
+   * - Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz (6 threads)
+     - 311.84s
+   * - Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz (6 threads) + RTX 2070
+     - 125.08s
 
 Co HCP 11-21
 ------------
@@ -193,6 +197,6 @@ acceleration, we refer to the Table as seen below.
    * - System
      - Execution time (averaged)
    * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads)
-     - 2368.63s (39m28s)
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
      - 5207.93s (1h26m47s)
+   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
+     - 2368.63s (39m28s)
diff --git a/docs/execution_model.rst b/docs/execution_model.rst
new file mode 100644
index 0000000..f065dc2
--- /dev/null
+++ b/docs/execution_model.rst
@@ -0,0 +1,51 @@
+.. _execution_model:
+.. index:: Execution model
+
+Execution model
+===============
+
+When :program:`Bramble` is compiled with the CUDA module, one can use GPU
+acceleration to speed up the execution. This is especially beneficial when
+performing a similarity analysis. :program:`Bramble` supports multi-GPU
+setups, so one can use multiple GPUs if more than one GPU is available.
+
+When performing the similarity analysis, an inventory of all the jobs is made.
+``N+1`` OpenMP threads are being spawned where ``N`` equals the number of GPUs.
+Each GPU gets assigned a CPU thread and jobs are relayed to the GPU via the CPU
+thread. The remaining OpenMP thread employs so-called nested parallelism and
+executes another OpenMP parallel environment which uses all CPUs.
+
+Consequently, the ``N`` CPU threads that manage the GPUs are also used for
+other parts of the calculation. Because the computational load of managing the
+GPUs is relatively small, this sharing does not come at a large cost in
+performance. In fact, leaving these CPUs idle is worse than partially also
+using them to manage the GPUs.
+
+When no GPUs are available, :program:`Bramble` uses no nested parallelism and
+simply executes all jobs concurrently wherein each job uses OpenMP parallelism
+on a per-job basis.
+
+.. _memory_load:
+
+Memory load
+-----------
+
+Performing calculations is quite memory-intensive. As a rule of thumb, one
+needs roughly 8GB of memory per execution thread. For example, with two GPUs
+there are three execution threads, so one needs roughly 24GB of memory. If
+memory is limited, one option is to use swap space. This comes at a great cost
+in performance, but it might still be beneficial.
+
+Assuming the user has root privileges, one can use the following instructions
+to create and enable an 8GB swap file (adjust the size as needed)::
+
+    sudo fallocate -l 8G /swapfile
+    sudo chmod 600 /swapfile
+    sudo mkswap /swapfile
+    sudo swapon /swapfile
+
+Running ``sudo swapon --show`` afterwards typically yields::
+
+    NAME      TYPE      SIZE   USED PRIO
+    /dev/sdb3 partition 976M   976M   -2
+    /swapfile file        8G 196.3M   -3
diff --git a/docs/index.rst b/docs/index.rst
index 04a4846..c0ef303 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -50,6 +50,7 @@ requests are ideally submitted via the `github issue tracker
 
    installation
    background
+   execution_model
    gallery
    user_interface
    examples
diff --git a/docs/publications.rst b/docs/publications.rst
index e155b65..834bbe5 100644
--- a/docs/publications.rst
+++ b/docs/publications.rst
@@ -6,6 +6,10 @@ Publications
 
 The following publications make use of :program:`Bramble`
 
+* *Unraveling the Role of Metal–Support Interactions on the Structure Sensitivity
+  of Fischer–Tropsch Synthesis*, van Etten, M.P.C., de Laat, M.E., Hensen, E.J.M.,
+  Filot, I.A.W., J. Phys. Chem. C, **2023**, 127, 31, 15148-15156,
+  DOI: `10.1021/acs.jpcc.3c02240 <https://doi.org/10.1021/acs.jpcc.3c02240>`_
 * *Enumerating Active Sites on Metal Nanoparticles: Understanding the Size
   Dependence of Cobalt Particles for CO Dissociation*, van Etten M.P.C.,
   Zijlstra B., Hensen E.J.M., Filot, I.A.W., ACS Catal., **2021**, 11, 14,
diff --git a/docs/user_interface.rst b/docs/user_interface.rst
index 54570d4..fbaa066 100644
--- a/docs/user_interface.rst
+++ b/docs/user_interface.rst
@@ -13,6 +13,14 @@ validate the pattern library.
     line arguments as long as any instructions belonging to a specific
     argument are directly after that argument.
 
+.. warning::
+    * :program:`Bramble` uses roughly 8GB of memory per execution thread. The
+      number of execution threads is ``N+1``, where ``N`` is the number of GPUs.
+      See also :ref:`this page <memory_load>`.
+    * For systems having **multiple** GPUs, one needs to explicitly set
+      ``--ngpu <number of gpus>`` to make use of all GPUs. Otherwise, only one
+      GPU is used.
+
 Bramble
 -------
 
@@ -57,6 +65,13 @@ mandatory command line arguments::
 * ``-o``, ``--output`` ``<output-file>``
     Where to write the output to.
 
+* ``-g``, ``--ngpu`` ``<number of gpus>``
+    Number of GPUs to use. This option is only available when :program:`Bramble`
+    is compiled with the CUDA module. If more GPUs are requested than are
+    available, the number is automatically lowered to match the number of GPUs
+    available. The default value is 1, so on multi-GPU systems the user needs
+    to set this value manually.
+
 *Example*: ``./bramble -p ../patterns/patterns.json -i ../src/test/data/co_np.geo -o result.txt``
 
 Typical output looks as follows::

From 1dfeb18b133506ade91e6793a1ceab17009cc405 Mon Sep 17 00:00:00 2001
From: ifilot <ivo@ivofilot.nl>
Date: Tue, 5 Sep 2023 12:44:50 +0200
Subject: [PATCH 7/9] Adding computational times

---
 docs/examples.rst | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/examples.rst b/docs/examples.rst
index 99fbf2d..392ee66 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -73,13 +73,13 @@ acceleration, we refer to the Table as seen below.
 
    * - System
      - Execution time (averaged)
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads)
+   * - Intel(R) Core(TM) i9-10900K (20 threads)
      - 172.58s
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
+   * - Intel(R) Core(TM) i9-10900K (20 threads) + RTX 4090
      - 90.18s
-   * - Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz (6 threads)
+   * - Intel(R) Core(TM) i5-8400 (6 threads)
      - 311.84s
-   * - Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz (6 threads) + RTX 2070
+   * - Intel(R) Core(TM) i5-8400 (6 threads) + RTX 2070
      - 125.08s
 
 Co HCP 11-21
@@ -196,7 +196,15 @@ acceleration, we refer to the Table as seen below.
 
    * - System
      - Execution time (averaged)
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads)
+   * - Intel(R) Core(TM) i9-10900K (20 threads)
      - 5207.93s (1h26m47s)
-   * - Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz (20 threads) + RTX 4090
+   * - Intel(R) Core(TM) i9-10900K (20 threads) + RTX 4090
      - 2368.63s (39m28s)
+   * - Intel(R) Core(TM) i5-8400 (6 threads) + RTX 2070
+     - 2986.00s (49m46s)
+   * - Intel(R) Xeon(R) Gold 6234 (16 threads) + A5000
+     - 3912.19s (1h05m12s)
+   * - Intel(R) Core(TM) i5-12400F (12 threads) + 1x GTX 1080 Ti
+     - 2759.49s (45m59s)
+   * - Intel(R) Core(TM) i5-12400F (12 threads) + 2x GTX 1080 Ti
+     - 2067.24s (34m27s)

From 61d3bab4ca5544fce8b5c37b4651d166cc60f18b Mon Sep 17 00:00:00 2001
From: ifilot <ivo@ivofilot.nl>
Date: Thu, 7 Sep 2023 08:41:01 +0200
Subject: [PATCH 8/9] Adding easybuild

---
 docs/installation.rst        | 14 ++++++++++----
 src/test/test_similarity.cpp | 12 +++++++++---
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/docs/installation.rst b/docs/installation.rst
index 59278ea..e337c82 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -55,11 +55,9 @@ The similarity analysis functionality of :program:`Bramble` can
 benefit from the availability of a graphical card. To compile :program:`Bramble`
 with CUDA support, run CMake with::
 
-    cmake ../src -DMOD_CUDA=1 -DCUDA_ARCH=<ARCH>
+    cmake ../src -DMOD_CUDA=1
 
-wherein `<ARCH>` is replaced with the architecture of your graphical card. For
-example, if you use an RTX 4090, you would use ``-DCUDA_ARCH=sm_89``. To
-test that :program:`Bramble` can use your GPU, you can run the ``bramblecuda``
+To test that :program:`Bramble` can use your GPU, you can run the ``bramblecuda``
 tool whose sole function is to test for the availability of a GPU on the system::
 
     ./bramblecuda
@@ -127,3 +125,11 @@ Typical output should look as follows::
     100% tests passed, 0 tests failed out of 9
 
     Total Test time (real) =   1.73 sec
+
+EasyBuild Installation
+----------------------
+
+For HPC infrastructure, there is also the option to install :program:`Bramble` using EasyBuild.
+Make a copy of ``bramble-1.1.0.eb`` and run::
+
+    eb bramble-1.1.0.eb --minimal-toolchains --add-system-to-minimal-toolchains --robot
diff --git a/src/test/test_similarity.cpp b/src/test/test_similarity.cpp
index e9a6782..9882c07 100644
--- a/src/test/test_similarity.cpp
+++ b/src/test/test_similarity.cpp
@@ -22,6 +22,9 @@
 #include <boost/test/unit_test.hpp>
 
 #include "similarity_analysis.h"
+#ifdef MOD_CUDA
+#include "card_manager.h"
+#endif
 
 // check that we can read .geo files
 BOOST_AUTO_TEST_CASE(test_similarity) {
@@ -57,9 +60,12 @@ BOOST_AUTO_TEST_CASE(test_similarity) {
     BOOST_TEST(ans2 == ans3, boost::test_tools::tolerance(1e-7));
 
     #ifdef MOD_CUDA
-    float ans4 = sa.calculate_distance_metric_cuda(dm3, dm4, &permvec[0]);
-    BOOST_TEST(ans2 == ans4, boost::test_tools::tolerance(1e-7));
-    BOOST_TEST(ans3 == ans4, boost::test_tools::tolerance(1e-7));
+    CardManager cm;
+    if(cm.get_num_gpus() > 0) {
+        float ans4 = sa.calculate_distance_metric_cuda(dm3, dm4, &permvec[0]);
+        BOOST_TEST(ans2 == ans4, boost::test_tools::tolerance(1e-7));
+        BOOST_TEST(ans3 == ans4, boost::test_tools::tolerance(1e-7));
+    }
     #endif // MOD_CUDA
 
     //-------------------------------------------------------------------------

From 3d03a820bdd6507b1bc394981512cc981315af3d Mon Sep 17 00:00:00 2001
From: ifilot <ivo@ivofilot.nl>
Date: Thu, 7 Sep 2023 09:07:14 +0200
Subject: [PATCH 9/9] Expanding on documentation

---
 docs/execution_model.rst | 6 ++++++
 docs/installation.rst    | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/docs/execution_model.rst b/docs/execution_model.rst
index f065dc2..7cc4274 100644
--- a/docs/execution_model.rst
+++ b/docs/execution_model.rst
@@ -9,6 +9,12 @@ acceleration to speed up the execution. This is especially beneficial when
 performing a similarity analysis. :program:`Bramble` supports multi-GPU
 setups, so one can use multiple GPUs if more than one GPU is available.
 
+.. warning::
+    :program:`Bramble` requires a GPU with at least 8GB of memory. :program:`Bramble`
+    checks whether the GPU supports the calculation prior to execution and throws
+    an error when the GPU is not supported. You can also check the memory available
+    on your GPU by running ``bramblecuda``.
+
 When performing the similarity analysis, an inventory of all the jobs is made.
 ``N+1`` OpenMP threads are being spawned where ``N`` equals the number of GPUs.
 Each GPU gets assigned a CPU thread and jobs are relayed to the GPU via the CPU
diff --git a/docs/installation.rst b/docs/installation.rst
index e337c82..54090e2 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -23,7 +23,8 @@ On Debian-based operating systems, one can run the following::
    The compilation instructions below can be readily used.
 
 .. warning::
-   In order to compile for GPU using CUDA, one needs Eigen3 version **3.4.0** or higher.
+   * In order to compile for GPU using CUDA, one needs Eigen3 version **3.4.0** or higher.
+   * Your GPU needs at least 8GB of memory in order to use :program:`Bramble`.
 
 Compilation
 -----------