We run the AutoMorph batch workflow on Yale's grace-next cluster, using the Slurm scheduler and a batching utility called Dead Simple Queue (dSQ).
```
ssh -Y <netid>@grace-next.hpc.yale.edu
```
- Slurm Documentation
- dSQ Documentation (including tips on how to monitor your tasks and identify failed tasks)
- Add the following to your `.bashrc` file on Grace:

  ```
  . /home/geo/hull/ph269/software/etc/hull_bashrc
  module load Tools/dSQ
  ```
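  These additions take effect at your next login; to pick them up in the current shell, reload your `.bashrc`:

  ```
  source ~/.bashrc
  ```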
- Copy the ZereneStacker license, etc. into your home directory:

  ```
  cp -r /home/geo/hull/ph269/.ZereneStacker ~/
  ```
- Create a directory in Grace's scratch60. This will be `raw_images_root` below.

  ```
  mkdir /gpfs/scratch60/geo/hull/data/<your_project>
  ```
- Copy your stacks to your directory, either with Globus or rsync on Omega's data transfer node (email Kaylea if you have questions about this step).
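  If you use rsync, a minimal sketch (the source path here is hypothetical and the exact transfer hostname may differ; adjust to your setup):

  ```
  # push the stacks into raw_images_root on Grace
  rsync -av ~/my_stacks/ <netid>@grace-next.hpc.yale.edu:/gpfs/scratch60/geo/hull/data/<your_project>/
  ```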
- Create a directory in your project space for the segmented output files. This will be `output_root` below.

  ```
  mkdir $HOME/project/<your_project>
  ```
- Clone this repository into a new directory in your home directory on Grace.
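  For example (substitute this repository's actual clone URL; the target directory name is arbitrary):

  ```
  cd ~
  git clone <repository-url> automorph-batch
  ```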
- Edit and run `list_dirs.py`:
  - specify `raw_images_root`
  - specify `output_root`
  - specify
- Edit and run `write_segment_settings.py`:
  - specify `output_root` (see above)
  - set `mode = 'sample'`
  - set `threshold_range` to the sample threshold range
  - run `write_segment_settings.py`
  - specify
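  These edits amount to setting a few variables in the script; a hypothetical sketch using the names referenced above (the real file may arrange or format these settings differently):

  ```
  output_root = '/home/<netid>/project/<your_project>'   # directory created above
  mode = 'sample'                                        # trial run across several thresholds
  threshold_range = [0.05, 0.10, 0.15, 0.20]             # hypothetical sample thresholds
  ```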
- Run `list_dirs.py`. This will create three files:
  - `dirs_stacks.txt`: file listing all the input directories
  - `dirs_stacks.csv`: same contents as `dirs_stacks.txt`, but fields should be added for when segment is run in `final` mode
  - `dirs_segmented.txt`: file listing the output directories that will contain the settings file and output images from segment

  ```
  python list_dirs.py presegment
  ```
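  Each of these is a plain text file with one directory path per line; for example, a line of `dirs_stacks.txt` might look like:

  ```
  /gpfs/scratch60/geo/hull/data/porosity/CH82_150-250um_1-102_sp
  ```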
- Create the taskfile for dSQ:

  ```
  python build_taskfile.py segment
  ```
- Run dSQ to create the submission script:

  ```
  dSQ --taskfile taskfile_segment.txt > submit_segment.sh
  ```
- Optional: It may be necessary to increase the memory that the submission script requests per task, since Segment can be very memory hungry with bigtiff files. If so, edit `submit_segment.sh` and increase `--mem-per-cpu`. For example:

  ```
  #SBATCH --mem-per-cpu=40G
  ```
- Submit the submission script to Slurm:

  ```
  sbatch submit_segment.sh
  ```
- See dSQ Documentation for instructions on checking task status.
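  Independent of the dSQ tooling, standard Slurm commands are also handy for a quick check (the job ID is printed by `sbatch`):

  ```
  squeue -u $USER                                        # tasks still queued or running
  sacct -j <jobid> --format=JobID,State,Elapsed,MaxRSS   # per-task status after submission
  ```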
- Edit `dirs_stacks.csv` to add final threshold values. For example:

  ```
  /gpfs/scratch60/geo/hull/data/porosity/CH82_150-250um_1-102_sp,0.1
  ```
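  Each row is one stack directory followed by the final threshold to use for it; with several samples the file might look like this (the second path is hypothetical):

  ```
  /gpfs/scratch60/geo/hull/data/porosity/CH82_150-250um_1-102_sp,0.1
  /gpfs/scratch60/geo/hull/data/porosity/<another_sample>,0.15
  ```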
- Edit and run `write_segment_settings.py`:
  - specify `output_root` (if not already set, see above)
  - set `mode = 'final'`
  - run `write_segment_settings.py`
  - specify
- Create the taskfile for dSQ:

  ```
  python build_taskfile.py segment
  ```
- Run dSQ to create the submission script:

  ```
  dSQ --taskfile taskfile_segment.txt > submit_segment.sh
  ```
- Optional: It may be necessary to increase the memory that the submission script requests per task, since Segment can be very memory hungry with bigtiff files. If so, edit `submit_segment.sh` and increase `--mem-per-cpu`. For example:

  ```
  #SBATCH --mem-per-cpu=40G
  ```
- Submit the submission script to Slurm:

  ```
  sbatch submit_segment.sh
  ```
- See dSQ Documentation for instructions on checking task status.
- If this is your first step with the dataset, run `list_dirs.py`. This will create:
  - `dirs_segmented.txt`: file listing the output directories that contain the output images from segment

  ```
  python list_dirs.py segmented
  ```
- Create the taskfile for SimpleQueue:

  ```
  python build_taskfile.py focus
  ```
- Run sqCreateScript to create the submission script. Set num_workers equal to a whole number that is about 10-25% of the number of tasks in your taskfile.

  ```
  sqCreateScript -w 24:00:00 -n 5 taskfile_focus.txt > submit_focus.sh
  ```
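  To see how many tasks you have when picking num_workers (each line of the taskfile should correspond to one task):

  ```
  wc -l taskfile_focus.txt
  ```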
- Edit `submit_focus.sh` and add the additional directive below to ensure one task per node. Put it with the similar-looking lines; otherwise the order doesn't matter.

  ```
  #SBATCH --ntasks-per-node=1
  ```
- Submit the submission script to Slurm:

  ```
  sbatch submit_focus.sh
  ```
For 3dmorph, just substitute `3dmorph` for `2dmorph` in the following instructions.
- Create a list of successfully focused directories (this will create `dirs_focused.txt`):

  ```
  python list_dirs.py focused
  ```
- Open `write_2dmorph_settings.py` and configure the settings. There is also a variable at the top named `twodmorph_run_name`. You can change this variable for different 2dmorph settings. It will then create a directory list called `dirs_<twodmorph_run_name>.txt` and set the output to a directory with that name in `output_root` as configured above.
- Run `write_2dmorph_settings.py`:

  ```
  python write_2dmorph_settings.py
  ```
- Create a task list for 2dmorph:

  ```
  python build_taskfile.py 2dmorph -d dirs_<twodmorph_run_name>.txt
  ```
Alternatively, if different directories need different settings, you can supply per-directory settings through a csv:

- Select a name for this run (this will be used below as `twodmorph_run_name`).
- Copy or move `dirs_focused.txt` to `dirs_<twodmorph_run_name>.csv` and add fields for the settings you need to edit.
- Open `write_2dmorph_settings.py` and configure the variable at the top named `twodmorph_run_name`. You can change this variable for different 2dmorph settings so they output in unique locations. It will also create a directory list called `dirs_<twodmorph_run_name>.txt` and create output directories with that name in `output_root` as configured above.
- Modify `write_2dmorph_settings.py` to read in the settings you added to your csv, `dirs_<twodmorph_run_name>.csv`. Also configure the global settings.
- Run `write_2dmorph_settings.py`:

  ```
  python write_2dmorph_settings.py
  ```
- Create a task list for 2dmorph:

  ```
  python build_taskfile.py 2dmorph -d dirs_<twodmorph_run_name>.csv
  ```
- Use Dead Simple Queue to create the submission script and submit. For 3dmorph, also add the `-t 24:00:00` flag, since 3dmorph can take a very long time to run.

  2dmorph:

  ```
  dSQ --taskfile taskfile_2dmorph.txt > submit_2dmorph.sh
  sbatch submit_2dmorph.sh
  ```

  3dmorph:

  ```
  dSQ -t 24:00:00 --taskfile taskfile_3dmorph.txt > submit_3dmorph.sh
  sbatch submit_3dmorph.sh
  ```