pyiron-workshop · jan-janssen · Sep 28, 2023
diff --git a/day1.ipynb b/day1.ipynb
diff --git a/day2.ipynb b/day2.ipynb
diff --git a/day3.ipynb b/day3.ipynb
@@ -1 +1 @@
-{"cells":[{"metadata":{},"id":"chronic-dubai","cell_type":"markdown","source":"# Day 3 - Beyond atomistics\nAs illustrated by the first two days pyiron was originally designed for atomistic simulation or ab intio thermodynamics to be more specific. Still since the [publication of pyiron](https://www.sciencedirect.com/science/article/pii/S0927025618304786) and the release of pyiron as opensource software package on [Github](https://github.com/pyiron) pyiron has been extended beyond atomistics. To foster these activities pyiron just like AiiDA joined [numfocus](https://numfocus.org) as affiliated project to promote open science and open code in the materials science community. So the content of the third day is focused on how to use pyiron beyond atomistics and our general activities in the materials science community to promote open code. "},{"metadata":{},"id":"residential-excitement","cell_type":"markdown","source":"## Implement your own class\nOne very direct way to use pyiron beyond atomistics is by implementing your own job class, either as a `ScriptJob` as demonstrated yesterday or as a pyiron job class. In this example the pyiron `TemplateJob` class is used to derive a `ToyJob` class which basically just copies the input to the output. The input consists of a single entry `input_energy` which is set to $100$ and copied to `job[\"output/generic/energy_tot\"]`. The important part is the simplicity of implementing new classes, all that is required is bascially an `write_input()` and a `collect_output()` function. The `write_input()` function takes the generic input defined in `job.input` as a python dictionary and writes the corresponding input files for a given simulation code. Afterwards the executable is called, in this simple case it is just a shell command to copy the input in the file `input` to the output file `output` using `cat input > output`. 
Finally the `collect_output()` function reads the output file, parses the output variables, in this example just a single one and stores the output in the HDF5 file using the `self.project_hdf5` interface, which basically accepts any kind of dictionary. So the taks of the `collect_output()` function can be summarized as parsing the output and returing a python dictionary for pyiron to store. "},{"metadata":{"trusted":true},"id":"primary-gardening","cell_type":"code","source":"from os.path import join\nfrom pyiron_base import TemplateJob, Project","execution_count":1,"outputs":[]},{"metadata":{"trusted":true},"id":"mineral-faith","cell_type":"code","source":"class ToyJob(TemplateJob):\n    def __init__(self, project, job_name):\n        super().__init__(project, job_name) \n        self.input['input_energy'] = 100\n        self.executable = \"cat input > output\"\n\n    def write_input(self): \n        self.input.write_file( \n            file_name=\"input\",\n            cwd=self.working_directory\n        )\n\n    def collect_output(self):\n        file = join(self.working_directory, \"output\") \n        with open(file) as f:\n            line = f.readlines()[0]\n        energy = float(line.split()[1]) \n        with self.project_hdf5.open(\"output/generic\") as h5out: \n            h5out[\"energy_tot\"] = energy","execution_count":2,"outputs":[]},{"metadata":{},"id":"thick-biotechnology","cell_type":"markdown","source":"After the `ToyJob` class is defined it can be used like any other pyiron class: "},{"metadata":{"trusted":true},"id":"opposed-purse","cell_type":"code","source":"pr = Project('test')","execution_count":3,"outputs":[]},{"metadata":{},"id":"phantom-carnival","cell_type":"markdown","source":"Only the creation of the job is slightly different, instead of selecting the `job_type` from `pr.job_type.*` the `ToyJob` class is set directly: "},{"metadata":{"trusted":true},"id":"catholic-information","cell_type":"code","source":"job = 
pr.create_job(job_type=ToyJob, job_name=\"toy\")","execution_count":4,"outputs":[]},{"metadata":{},"id":"dressed-vacation","cell_type":"markdown","source":"Then the job can be executed like any other job - at least when the jupyter notebook is executed inline. For the case of submitting a custom job class defined in a Jupyter notebook to a remote computing cluser it is recommended to just submit the whole jupyter notebook as `ScriptJob`. "},{"metadata":{"trusted":true},"id":"falling-nightlife","cell_type":"code","source":"job.run()","execution_count":5,"outputs":[{"output_type":"stream","text":"The job toy was saved and received the ID: 68\n","name":"stdout"}]},{"metadata":{},"id":"greater-cookie","cell_type":"markdown","source":"Finally the output can be accessed in the same way as already demonstrated on day one and two: "},{"metadata":{"trusted":true},"id":"heavy-collective","cell_type":"code","source":"job['output/generic/energy_tot']","execution_count":6,"outputs":[{"output_type":"execute_result","execution_count":6,"data":{"text/plain":"100.0"},"metadata":{}}]},{"metadata":{},"id":"synthetic-programmer","cell_type":"markdown","source":"So while the number of simulation codes currently implemented in pyiron is restricted to the ones we primarly use, it is easy to add new simulation codes. Two examples would be: \n* [pyiron-cp2k](https://github.com/jan-janssen/pyiron-cp2k)\n* [pyiron-quantum-espresso](https://github.com/jan-janssen/pyiron-quantum-espresso)\n\nStill both of these interfaces are currently in a highly experimental state and we are still looking for experienced users who would be interested to use pyiron for their research and can help us with their code specific knowledge. 
Apart from this pyiron can in principle used to implement any kind of calculation which benefits form a unified storage layer and the interface to HPC infrastructure."},{"metadata":{},"id":"sunset-fitting","cell_type":"markdown","source":"## Publish your workflow\nIn the same direction pyiron not only supports developers by providing them with a platform to integrate their plugins but also scientists in general with a framework to publish scientific workflows developed with pyiron. The [pyiron-publication-template](https://github.com/pyiron/pyiron-publication-template) is a combination of continous integration on the Github platform, with jupyterbook for a simple website and mybinder for an interactive user experience. \n\nAs a scientist you can upload: \n* your jupyter notebook which defines the physical steps of your method. \n* your conda environment as `environment.yml` file to fix the versions of the executables you used. \n* your resources, like existing pyiron calculation, which can be extracted using `pr.pack()` or specific parameter databases like interatomic potentials for example. \n\nWhile the technology is rather new, there are already a hand ful of examples available which demonstrate the advantage of having such a template to have a unifed way to publish new simulation protocols and workflows:\n* https://github.com/pyiron/pyiron_meltingpoint\n* https://github.com/pyiron/pyiron_generalized_dipole\n* https://github.com/pyiron/pyiron_md_montecarlo\n* https://github.com/materialdigital/pyiron-workflow-TEMImageSegmentation\n* https://github.com/materialdigital/pyiron-workflow-damask"},{"metadata":{},"id":"brutal-procedure","cell_type":"markdown","source":"## Open science \nFinally beyond the `pyiron` project in general the developer in the pyiron project also contribute to other opensource software packages in the materials science community like `ASE` or `pymatgen`. 
Other contributions include: \n* The release of components originally developed for pyiron as standalone packages as they might also be relevant to users outside the pyiron community. For example a [simple interface to HPC queuing systems](https://github.com/pyiron/pysqa) based on the idea that mondern queuing systems offer a lot of different settings but most users stick to a set of predefined templates or a [prallel interface to the lammps library](https://github.com/pyiron/pylammpsmpi) based on mpi4py which is directly accessible from a jupyter notebook which is executed in a serial process. \n* As demonstrated on the first day, installing pyiron is easy because all dependencies are already included in conda-forge. This was not the case a few years ago. Over the last few years the pyiron developers [contributed over 100 packages](https://github.com/jan-janssen/conda-forge-contribution) to the conda-forge community ranging from simple python codes to DFT codes written in Fortran. \n* Finally this workshop as well as our other virtual workshops extensively use jupyterhub in combination with Docker containers to provide [virtual environments](https://hub.docker.com/u/pyiron) for the participants. These virtual environments are constructed using conda packages and Github based continous integration. The build process is also [publicly available](https://github.com/pyiron/docker-stacks) and could be used to automate the creation of virtual environments for other workshops as well. "},{"metadata":{},"id":"quick-belle","cell_type":"markdown","source":"## Summary\nThe third day highlighted:\n* the development of own classes which do not even have to be related to atomistics or materials science in general. \n* the workflow to publish jupyter notebooks created with pyiron as a new way of sharing your work. \n* further contributions of the pyiron developers to the open science community in general. \n\nThank you for your attention. 
"},{"metadata":{"trusted":true},"id":"attractive-skiing","cell_type":"code","source":"","execution_count":null,"outputs":[]}],"metadata":{"kernelspec":{"name":"python3","display_name":"Python 3","language":"python"},"language_info":{"name":"python","version":"3.7.10","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat":4,"nbformat_minor":5}
+{"metadata":{"kernelspec":{"name":"python3","display_name":"Python 3 (ipykernel)","language":"python"},"language_info":{"name":"python","version":"3.11.5","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat_minor":5,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Notebook 3 - Beyond atomistics\nAs illustrated by the first two notebooks, pyiron was originally designed for atomistic simulation, or ab initio thermodynamics to be more specific. Since the [publication of pyiron](https://www.sciencedirect.com/science/article/pii/S0927025618304786) and the release of pyiron as an open-source software package on [GitHub](https://github.com/pyiron), pyiron has been extended beyond atomistics. To foster these activities, pyiron, just like AiiDA, joined [NumFOCUS](https://numfocus.org) as an affiliated project to promote open science and open code in the materials science community. The third notebook therefore focuses on how to use pyiron beyond atomistics and on our general activities in the materials science community to promote open code. ","metadata":{},"id":"chronic-dubai"},{"cell_type":"markdown","source":"## Implement your own class\nOne very direct way to use pyiron beyond atomistics is to implement your own job class, either as a `ScriptJob` as demonstrated previously or as a pyiron job class. In this example the pyiron `TemplateJob` class is used to derive a `ToyJob` class which simply copies the input to the output. The input consists of a single entry `input_energy`, which is set to $100$ and copied to `job[\"output/generic/energy_tot\"]`. The important part is how simple it is to implement new classes: all that is required is basically a `write_input()` and a `collect_output()` function. 
The `write_input()` function takes the generic input defined in `job.input` as a Python dictionary and writes the corresponding input files for a given simulation code. Afterwards the executable is called; in this simple case it is just a shell command which copies the input file `input.yaml` to the output file `output.yaml` using `cat input.yaml > output.yaml`. Finally the `collect_output()` function reads the output file, parses the output variables (in this example just a single one) and stores the output in the HDF5 file using the `self.project_hdf5` interface, which basically accepts any kind of dictionary. So the task of the `collect_output()` function can be summarized as parsing the output and handing the parsed values to pyiron for storage. ","metadata":{},"id":"residential-excitement"},{"cell_type":"code","source":"from os.path import join\nfrom pyiron_base import TemplateJob, Project","metadata":{"trusted":true},"execution_count":1,"outputs":[],"id":"primary-gardening"},{"cell_type":"code","source":"class ToyJob(TemplateJob):\n    def __init__(self, project, job_name):\n        super().__init__(project, job_name) \n        self.input['input_energy'] = 100\n        self.executable = \"cat input.yaml > output.yaml\"\n\n    def write_input(self): \n        self.input.write( \n            file_name=join(self.working_directory, \"input.yaml\")\n        )\n\n    def collect_output(self):\n        file = join(self.working_directory, \"output.yaml\") \n        with open(file) as f:\n            line = f.readlines()[0]\n        energy = float(line.split()[1]) \n        with self.project_hdf5.open(\"output/generic\") as h5out: \n            h5out[\"energy_tot\"] = energy","metadata":{"trusted":true},"execution_count":2,"outputs":[],"id":"mineral-faith"},{"cell_type":"markdown","source":"After the `ToyJob` class is defined, it can be used like any other pyiron job class: ","metadata":{},"id":"thick-biotechnology"},{"cell_type":"code","source":"pr = 
Project('test')","metadata":{"trusted":true},"execution_count":3,"outputs":[],"id":"opposed-purse"},{"cell_type":"markdown","source":"Only the creation of the job is slightly different: instead of selecting the `job_type` from `pr.job_type.*`, the `ToyJob` class is passed directly: ","metadata":{},"id":"phantom-carnival"},{"cell_type":"code","source":"job = pr.create_job(job_type=ToyJob, job_name=\"toy\")","metadata":{"trusted":true},"execution_count":4,"outputs":[],"id":"catholic-information"},{"cell_type":"markdown","source":"Then the job can be executed like any other job - at least when the Jupyter notebook is executed inline. To submit a custom job class defined in a Jupyter notebook to a remote computing cluster, it is recommended to submit the whole Jupyter notebook as a `ScriptJob`. ","metadata":{},"id":"dressed-vacation"},{"cell_type":"code","source":"job.run()","metadata":{"trusted":true},"execution_count":5,"outputs":[{"name":"stdout","text":"The job toy was saved and received the ID: 1\n","output_type":"stream"}],"id":"falling-nightlife"},{"cell_type":"markdown","source":"Finally the output can be accessed in the same way as already demonstrated in the first two notebooks: ","metadata":{},"id":"greater-cookie"},{"cell_type":"code","source":"job['output/generic/energy_tot']","metadata":{"trusted":true},"execution_count":6,"outputs":[{"execution_count":6,"output_type":"execute_result","data":{"text/plain":"100.0"},"metadata":{}}],"id":"heavy-collective"},{"cell_type":"markdown","source":"So while the number of simulation codes currently implemented in pyiron is restricted to the ones we primarily use, it is easy to add new simulation codes. 
Two examples would be: \n* [pyiron-cp2k](https://github.com/jan-janssen/pyiron-cp2k)\n* [pyiron-quantum-espresso](https://github.com/jan-janssen/pyiron-quantum-espresso)\n\nBoth of these interfaces are still in a highly experimental state, and we are looking for experienced users who are interested in using pyiron for their research and can help us with their code-specific knowledge. Apart from this, pyiron can in principle be used to implement any kind of calculation which benefits from a unified storage layer and the interface to HPC infrastructure.","metadata":{},"id":"synthetic-programmer"},{"cell_type":"markdown","source":"## Publish your workflow\nIn the same spirit, pyiron not only supports developers by providing a platform to integrate their plugins, but also provides scientists in general with a framework to publish scientific workflows developed with pyiron. The [pyiron-publication-template](https://github.com/pyiron/pyiron-publication-template) combines continuous integration on the GitHub platform with Jupyter Book for a simple website and mybinder for an interactive user experience. \n\nAs a scientist you can upload: \n* your Jupyter notebook, which defines the physical steps of your method. \n* your conda environment as an `environment.yml` file to fix the versions of the executables you used. \n* your resources, like existing pyiron calculations, which can be extracted using `pr.pack()`, or specific parameter databases, for example interatomic potentials. 
\n\nWhile the technology is rather new, there are already a handful of examples available which demonstrate the advantage of having such a template as a unified way to publish new simulation protocols and workflows:\n* https://github.com/pyiron/pyiron_meltingpoint\n* https://github.com/pyiron/pyiron_generalized_dipole\n* https://github.com/pyiron/pyiron_md_montecarlo\n* https://github.com/materialdigital/pyiron-workflow-TEMImageSegmentation\n* https://github.com/materialdigital/pyiron-workflow-damask","metadata":{},"id":"sunset-fitting"},{"cell_type":"markdown","source":"## Open science \nFinally, beyond the `pyiron` project itself, the developers of the pyiron project also contribute to other open-source software packages in the materials science community like `ASE` or `pymatgen`. Other contributions include: \n* The release of components originally developed for pyiron as standalone packages, as they might also be relevant to users outside the pyiron community. Examples are a [simple interface to HPC queuing systems](https://github.com/pyiron/pysqa), based on the idea that modern queuing systems offer a lot of different settings but most users stick to a set of predefined templates, and a [parallel interface to the LAMMPS library](https://github.com/pyiron/pylammpsmpi) based on mpi4py, which is directly accessible from a Jupyter notebook executed in a serial process. \n* As demonstrated in the first notebook, installing pyiron is easy because all dependencies are already included in conda-forge. This was not the case a few years ago. Over the last few years the pyiron developers [contributed over 100 packages](https://github.com/jan-janssen/conda-forge-contribution) to the conda-forge community, ranging from simple Python codes to DFT codes written in Fortran. 
\n* Finally, this workshop as well as our other virtual workshops extensively use JupyterHub in combination with Docker containers to provide [virtual environments](https://hub.docker.com/u/pyiron) for the participants. These virtual environments are constructed using conda packages and GitHub-based continuous integration. The build process is also [publicly available](https://github.com/pyiron/docker-stacks) and could be used to automate the creation of virtual environments for other workshops as well. ","metadata":{},"id":"brutal-procedure"},{"cell_type":"markdown","source":"## Summary\nThe third notebook highlighted:\n* the development of your own job classes, which do not even have to be related to atomistics or materials science in general. \n* the workflow to publish Jupyter notebooks created with pyiron as a new way of sharing your work. \n* further contributions of the pyiron developers to the open science community in general. \n\nThank you for your attention. ","metadata":{},"id":"quick-belle"},{"cell_type":"code","source":"","metadata":{},"execution_count":null,"outputs":[],"id":"attractive-skiing"}]}
diff --git a/environment.yml b/environment.yml
@@ -2,9 +2,9 @@ channels:
 - conda-forge
 dependencies:
 - python
-- pyiron =0.4.3
-- pyiron_atomistics =0.2.9
-- lammps =2021.02.10=*openmpi*_4
-- gpaw =21.1.0
-- sphinxdft =2.7.0
-- nglview =2.7.7
+- pyiron =0.5.0
+- pyiron_atomistics =0.3.2
+- lammps =2023.08.02
+- gpaw =23.9.1
+- sphinxdft =3.0.7
+- nglview =3.0.8