The pynodelauncher tool launches pleasantly parallel tasks on multiple nodes of an HPC cluster.
It uses MPI through the mpi4py package, which in turn uses the system MPI library. A task can be any command, typically a call of a script with a series of parameters.
A series of independent, serial tasks can be executed one after another, but they don't have to be. pynodelauncher starts each individual task on a separate core, possibly spread over multiple nodes, using MPI according to the job submission on the HPC system. Each individual task can additionally be parallelized within itself when the MPI command or job submission accounts for that.
If you use this software, please cite it using the DOI, the information provided on Zenodo, or the content of the CITATION file. Each version (release) has its own DOI, while DOI 10.5281/zenodo.4779075 refers to all versions.
The pynodelauncher tool can be installed using pip from the Git repository.
Alternatively, the pynodelauncher.py script can be used directly if needed.
Use pip to install pynodelauncher:
python -m pip install git+https://github.com/ncsu-landscape-dynamics/pynodelauncher.git
This will also install the mpi4py package (again using pip) if it is not already installed (using pip or in any other way).
Here we assume a basic Ubuntu 20.04 installation with almost no software installed.
sudo apt update
sudo apt install python3 python3-pip git libopenmpi-dev
python3 -m pip install git+https://github.com/ncsu-landscape-dynamics/pynodelauncher.git
To execute, use python3 instead of just python unless you also set the default Python to be Python 3.
Otherwise, the general usage (below) applies.
module load conda
conda activate /path/to/env
module load PrgEnv-intel
pip install git+https://github.com/ncsu-landscape-dynamics/pynodelauncher.git
conda deactivate
Replace /path/to/env with the path to the conda environment you are using, or replace the whole module load conda and conda activate ... with a module load which sets up a conda environment.
Note that the mpi4py and pynodelauncher installations will go to your home directory. Here, this is desired because 1) you are not using mpi4py through conda and 2) you can use any conda environment for your actual work. However, you need to have enough space for the installation in your home directory. That should not be an issue unless you are using the home directory for things which should be kept outside of it, such as data, conda environments, or cache.
See also the official documentation for installing mpi4py.
If you are using R with module load R (and not with conda), you need to load the R module before loading PrgEnv-intel, because R has a conflict warning for PrgEnv-intel but not the other way around.
Typical usage consists of two steps:
- Preparing a text file with a list of tasks (commands) to execute.
- Executing the script using MPI.
Prepare the file tasks.txt:
echo "Hello from $HOSTNAME"
echo "Hello from $HOSTNAME"
echo "Hello from $HOSTNAME"
echo "Hello from $HOSTNAME"
echo "Hello from $HOSTNAME"
Each row in the file is a task. A row can contain one or more commands (separated by ;). The syntax is that of your shell, e.g., Bash.
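When the tasks are calls of one script with varying parameters, the task file can be generated rather than written by hand. The following is a minimal sketch, not part of pynodelauncher: the helper names build_tasks and write_tasks, the script my_model.py, and the --seed and --rate options are all hypothetical and only illustrate the one-task-per-row layout.

```python
from itertools import product
from pathlib import Path


def build_tasks(command, parameter_grid):
    """Return one shell command line per combination of parameter values.

    The --name=value option style is an assumption; adapt it to whatever
    your own script expects.
    """
    names = list(parameter_grid)
    lines = []
    for values in product(*(parameter_grid[name] for name in names)):
        options = " ".join(
            f"--{name}={value}" for name, value in zip(names, values)
        )
        lines.append(f"{command} {options}")
    return lines


def write_tasks(path, lines):
    """Write the tasks to a text file, one task per row."""
    Path(path).write_text("\n".join(lines) + "\n")
```

For example, write_tasks("tasks.txt", build_tasks("python my_model.py", {"seed": [1, 2, 3], "rate": [0.1, 0.5]})) would write six rows, one for each combination of seed and rate.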
Run from the command line:
mpiexec -n 4 python -m mpi4py -m pynodelauncher tasks.txt
Notice the python -m mpi4py -m part, which is needed to avoid deadlocks (and thus stalled processes) in certain cases.
See the mpi4py.run documentation for details.
The submission script needs to include the LSF parameters (especially -n 5 for the number of MPI processes), the MPI library setup, the conda setup, and the mpiexec call. For example:
#!/bin/tcsh
#BSUB -n 5 # number of MPI processes
#BSUB -W 00:10 # maximum time
#BSUB -oo tasks_out
#BSUB -eo tasks_err
#BSUB -J tasks # job name
module load PrgEnv-intel
module load conda
conda activate /path/to/env
mpiexec python -m mpi4py -m pynodelauncher tasks.txt
The module load...conda activate... part should be modified as needed in the same way as for the installation (see above).
Assuming the file above is called submit_job.csh, call bsub:
bsub < submit_job.csh
This section discusses common issues with installation and usage.
After a successful installation, you may get the following warning saying that the directory containing the (executable) script is not in the PATH environment variable:
WARNING: The script pynodelauncher is installed in '.../.local/bin' which is not on PATH.
This means that you won't be able to execute pynodelauncher directly as a command (without specifying the path, including the filename extension). However, the MPI execution is done using module execution (with -m), not through an executable on PATH.
If you want to do things like calling pynodelauncher as a command to get usage and help, then follow the further instructions in this section.
The PATH variable is an environment variable which the operating system uses to find executable files (scripts or binaries). It contains the paths (directories) where the executable files are located, separated by a platform-dependent separator (usually : or ;).
Hence, the message either means you should install the package in some other way on your system, or, more likely, that you should add the path .../.local/bin to your PATH variable. Replace the ... part (or the whole .../.local/bin) with what appears in the warning message (... will likely be your home directory).
Here is how the PATH variable is modified in Bash (either on the command line or in the .bashrc file):
export PATH="$HOME/.local/bin:$PATH"
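To check whether a directory already is on PATH before or after changing it, the variable can be inspected from Python. This is a minimal sketch under the assumption that comparing normalized directory names is sufficient (symbolic links are not resolved); the function name is_on_path is made up for this example.

```python
import os
from pathlib import Path


def is_on_path(directory, path_value=None):
    """Return True if directory is one of the entries of a PATH value.

    When path_value is not given, the PATH of the current process is used.
    Entries are split using the platform-dependent separator (os.pathsep).
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


if __name__ == "__main__":
    # ~/.local/bin is the usual location for pip user installs on Linux.
    local_bin = str(Path.home() / ".local" / "bin")
    print(local_bin, "is on PATH:", is_on_path(local_bin))
```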
The python -m mpi4py piece assumes you installed pynodelauncher with pip and accesses pynodelauncher as a module (with -m).
If you just have the pynodelauncher.py script somewhere, you need to pass it as a file to mpi4py, so:
... python -m mpi4py /path/to/pynodelauncher.py
Executing with python -m mpi4py never uses executable files (commands) on PATH, but you can use a combination of the which command and command substitution to achieve similar behavior if you want to test your shell skills.
If you get a Python traceback with ImportError saying it can't find a shared object file like the one below and you are on HPC, you likely forgot to load the appropriate MPI module or loaded a wrong one.
Traceback (most recent call last):
File ".../.local/bin/pynodelauncher", line 5, in <module>
from pynodelauncher import main
File ".../.local/lib/python3.7/site-packages/pynodelauncher.py", line 4, in <module>
from mpi4py import MPI
ImportError: libimf.so: cannot open shared object file: No such file or directory
If you get a traceback like this on your local machine, you need to investigate how to install and test mpi4py properly on your system.
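A quick way to test mpi4py separately from pynodelauncher is to import it in a small script. The sketch below is only a diagnostic aid under the assumption that a failed import is the problem, as in the traceback above; the function name check_mpi4py is made up for this example.

```python
import importlib


def check_mpi4py():
    """Try to import mpi4py.MPI and return (ok, message)."""
    try:
        mpi = importlib.import_module("mpi4py.MPI")
    except ImportError as error:
        # Raised both when mpi4py is not installed and when a shared
        # library it links to (such as libimf.so) cannot be found.
        return False, f"mpi4py.MPI could not be imported: {error}"
    major, minor = mpi.Get_version()
    return True, f"mpi4py works, MPI standard version {major}.{minor}"


if __name__ == "__main__":
    ok, message = check_mpi4py()
    print("OK:" if ok else "FAILED:", message)
```

On HPC, run it after loading the same modules as in your submission script, so that the import is tested in the environment the job will actually use.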
If one of your tasks takes an hour and your other tasks take 5 minutes each, many cores will be idle while waiting for the long-running task to finish. You need to either ask for far fewer cores (e.g., 3) or submit the long-running task as a separate job. Having many allocated cores idle is not acceptable use on many HPC systems, including NC State's Henry2. So, you need to plan your allocation well and monitor the job, especially when task run times vary.
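The effect of uneven run times can be estimated before submitting a job. The following sketch simulates a simple greedy assignment in which each task goes to whichever core becomes free first. This is an assumption made for the estimate and not a description of how pynodelauncher schedules tasks (for example, it ignores any process which only coordinates the others). The task list of one 60-minute task and 23 five-minute tasks is hypothetical.

```python
import heapq


def simulate(durations, workers):
    """Greedy simulation: each task goes to the worker that frees up first.

    Returns (makespan, idle_fraction), where idle_fraction is the share of
    the allocated worker time during which workers have nothing to run.
    """
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    makespan = max(free_at)
    idle_fraction = 1 - sum(durations) / (makespan * workers)
    return makespan, idle_fraction


if __name__ == "__main__":
    # Hypothetical run times in minutes: one long task and 23 short ones.
    durations = [60] + [5] * 23
    for workers in (16, 3):
        makespan, idle = simulate(durations, workers)
        print(f"{workers} cores: {makespan:.0f} min, {idle:.0%} idle")
```

Under these assumptions, both allocations finish in 60 minutes, but 16 cores sit idle for roughly 82% of the allocated time while 3 cores are idle for under 3%.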
There is no error checking of execution of individual commands as of yet.
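Until such checking exists, one option is to try the tasks (or a few representative ones) serially on a small input before submitting the whole job. The sketch below is a separate helper and not part of pynodelauncher; the function name failed_tasks is made up, and it assumes the tasks are safe to run locally. Note that subprocess with shell=True uses /bin/sh on POSIX systems, which may differ from your login shell.

```python
import subprocess


def failed_tasks(lines):
    """Run each task serially in a shell and collect the failures.

    Returns a list of (task, return code) pairs for the tasks which
    exited with a non-zero return code. Empty rows are skipped.
    """
    failures = []
    for line in lines:
        task = line.strip()
        if not task:
            continue
        # shell=True so that a row can contain several commands
        # separated by ; just as in the task file.
        result = subprocess.run(task, shell=True)
        if result.returncode != 0:
            failures.append((task, result.returncode))
    return failures
```

An open task file can be passed directly, for example failed_tasks(open("tasks.txt")), since iterating over a file yields its rows.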
This is experimental software, so check existing issues and please give us feedback, e.g., by opening new issues.
The software does not check your data integrity or compliance with HPC usage policies, so be mindful of that. In other words, use of this software is at your own risk.
- Vaclav Petras, NC State University, Center for Geospatial Analytics
- Lisa L. Lowe, NC State University, Office of Information Technology, Advanced Computing
Copyright (C) 2021 The Authors
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.