Is your feature request related to a problem? Please describe.
Does Parsl (more specifically HighThroughputExecutor + SlurmProvider) provide some built-in method to notify running tasks when their job allocation is about to hit walltime? Task runtimes are not always predictable, and an option to gracefully shut a task down (close/checkpoint files, prepare for a restart) could prevent losing workflow progress. I know about the drain option for HTEx, but if I understood correctly it does not affect tasks that are already running.
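For context, this is roughly how I understand the drain option is enabled - a minimal sketch, assuming HTEx's drain_period parameter (name and semantics from memory, so treat the details as assumptions):

```python
from parsl.executors import HighThroughputExecutor

# Sketch only: drain_period (assumed name, in seconds) makes a manager stop
# accepting new tasks that long after it starts, so it is typically set a bit
# below the batch job walltime. Tasks that are already running are not touched.
htex = HighThroughputExecutor(
    label='hpc_htex',
    drain_period=6600,  # e.g. 1 h 50 min for a 2 h walltime
)
```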
Describe alternatives you've considered
In Slurm, you can use the --signal flag to have a signal sent some time before walltime; however, I have not found an easy way to propagate that signal to tasks running through the workers.
You could wrap bash apps with timeout (e.g., timeout 60m python myscript.py), but that does not really work for tasks started halfway through the job allocation (you don't know how much walltime will be left when any given task starts).
Job allocation details can be found in the shell environment (with Slurm, anyway), so every app could track the remaining time and decide by itself when to shut down (see the sketch right after this list). I would argue this is not the responsibility of apps, though.
You could play it safe and always checkpoint periodically. Brute-forcing it should work in most scenarios, but feels somewhat inelegant.
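To make the environment-based alternative above concrete: a minimal sketch of a task pacing itself against the remaining walltime, assuming SLURM_JOB_END_TIME is present in the task's environment and that do_chunk is a hypothetical callable that processes one unit of work and writes its own checkpoint:

```python
import os
import time
from typing import Callable


def remaining_walltime() -> float:
    """Seconds left in the surrounding Slurm allocation (0 if unknown)."""
    end = float(os.environ.get('SLURM_JOB_END_TIME', 0))
    return max(0.0, end - time.time()) if end else 0.0


def run_chunks(do_chunk: Callable[[int], None], n_chunks: int,
               chunk_estimate: float, margin: float = 60.0) -> int:
    """Run work chunk by chunk; stop early when the allocation is about to end."""
    done = 0
    for i in range(n_chunks):
        if remaining_walltime() < chunk_estimate + margin:
            print(f'Only {remaining_walltime():.0f} s of walltime left; stopping after {done} chunks')
            break
        do_chunk(i)  # hypothetical unit of work that checkpoints itself
        done += 1
    return done
```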
Currently, my hacky workaround is to launch a simple background Python script - before starting the process_worker_pool - which sleeps until right before the job allocation ends and then signals any (sub)processes created by the workers (see below; how it is wired into the provider is sketched right after the script). This approach seems to work fine, but it is bound to fail under some circumstances. There must be a better/cleaner way.
graceful_shutdown.py
"""
Set a shutdown timer and kill everything
TODO: make exit_window variable
"""
import os
import signal
import psutil
import datetime
EXIT_SIGNAL = signal.SIGUSR1
EXIT_WINDOW = 30 # seconds
EXIT_CODE = 69
def signal_handler(signum, frame):
""""""
pass
def signal_handler_noop(signum, frame):
"""Dummy handler that does not do anything."""
print(f'Received signal {signum} at frame {frame}')
print('What do I do with this omg..')
exit(EXIT_CODE)
def find_processes() -> list[psutil.Process]:
    """Find the task processes spawned by the parsl workers on this node."""
    pkill = psutil.Process()   # this watchdog script
    pmain = pkill.parent()     # the batch-job shell that launched it
    job_id = pmain.environ().get('SLURM_JOBID')
    node_id = pmain.environ().get('SLURM_NODELIST')
    # only consider processes originating from this Slurm job
    procs_job, procs_denied = [], []
    for p in psutil.process_iter():
        try:
            if p.environ().get('SLURM_JOBID') == job_id:
                procs_job.append(p)
        except psutil.AccessDenied:
            procs_denied.append(p)
    # process names are truncated (15 characters on Linux),
    # so 'process_worker_' matches parsl's process_worker_pool
    pwork = [p for p in procs_job if p.name() == 'process_worker_']
    pworker = sum([p.children() for p in pwork], [])   # worker processes
    ptasks = sum([p.children() for p in pworker], [])  # task (sub)processes started by the workers
    print(
        f'Job processes (job_id={job_id}, node_id={node_id}):', *procs_job,
        'Main process:', pmain, 'Kill process:', pkill,
        'Workers:', *pworker, 'Running tasks:', *ptasks,
        sep='\n'
    )
    return ptasks

def main():
    """Sleep until just before the allocation ends, then signal all task processes."""
    time_start = datetime.datetime.fromtimestamp(float(os.environ.get('SLURM_JOB_START_TIME', 0)))
    time_stop = datetime.datetime.fromtimestamp(float(os.environ.get('SLURM_JOB_END_TIME', 0)))
    duration = time_stop - time_start
    print(f'Job allocation (start|stop|duration): {time_start} | {time_stop} | {duration}')
    print('Awaiting the app-ocalypse..')
    # register the no-op handler first, otherwise the default SIGALRM action
    # would kill this script; then arm the alarm and wait for it
    signal.signal(signal.SIGALRM, signal_handler)
    signal.alarm(max(1, int((time_stop - datetime.datetime.now()).total_seconds()) - EXIT_WINDOW))
    signal.pause()
    print(f'Received signal {signal.SIGALRM.name} at {datetime.datetime.now()}')
    for p in find_processes():
        print(f'Sending {EXIT_SIGNAL.name} to process {p.pid}..')
        os.kill(p.pid, EXIT_SIGNAL)
    sys.exit(EXIT_CODE)


if __name__ == "__main__":
    main()
```
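For completeness, this is roughly how the script above is wired in - a sketch, assuming the usual HTEx + SlurmProvider setup and that worker_init commands run inside the batch job before the process_worker_pool is launched (paths and labels are placeholders):

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import SlurmProvider

config = Config(
    executors=[
        HighThroughputExecutor(
            label='hpc_htex',
            provider=SlurmProvider(
                walltime='02:00:00',
                nodes_per_block=1,
                # worker_init runs in the job script before the worker pool starts,
                # so the watchdog ends up alongside the process_worker_pool
                worker_init='nohup python /path/to/graceful_shutdown.py > graceful_shutdown.log 2>&1 &',
            ),
        )
    ]
)
```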
job script/logs
The originating Python script controlling Parsl uses some custom code, but all of that is irrelevant. Essentially, we launch bash apps that sleep indefinitely until they catch a signal.
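The test apps boil down to something like this - a sketch of a bash_app that traps the EXIT_SIGNAL (SIGUSR1) sent by graceful_shutdown.py and exits cleanly; a real app would checkpoint in the trap body instead:

```python
from parsl import bash_app


@bash_app
def sleepy_task(stdout='task.out', stderr='task.err'):
    # trap SIGUSR1, then wait indefinitely; the trap body is where a real
    # app would close/checkpoint files before exiting
    return (
        "trap 'echo caught SIGUSR1, shutting down; exit 0' USR1; "
        "sleep infinity & wait $!"
    )
```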
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** JOB 50069807 ON node3506.doduo.os CANCELLED AT 2024-11-04T15:09:57 ***
0: slurmstepd: error: *** STEP 50069807.0 ON node3506.doduo.os CANCELLED AT 2024-11-04T15:09:57 ***
> You could play it safe and always checkpoint periodically. Brute-forcing it should work in most scenarios, but feels somewhat inelegant.
This is pretty much the traditional approach that parsl's worker model has had, but in recent times we've been pushing more towards managing the end of things a bit better, mostly with things like the drain time and trying to avoid placing tasks on soon-to-end workers (see also #3323).
Having the worker pool send unix signals to launched bash apps is probably an interesting thing to implement - triggered by either the external batch system or by knowledge of the environment (drain style)
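Not Parsl code, just to make that concrete: a generic sketch of the forwarding pattern, where whatever launches the app (a stand-in for a worker here) catches a signal and passes it on to the app's process group:

```python
import os
import signal
import subprocess
import sys

# usage: python forward.py "some long-running command"
# Start the command in its own process group and forward SIGUSR1/SIGTERM to it.
proc = subprocess.Popen(['bash', '-c', sys.argv[1]], start_new_session=True)


def forward(signum, frame):
    os.killpg(os.getpgid(proc.pid), signum)


for sig in (signal.SIGUSR1, signal.SIGTERM):
    signal.signal(sig, forward)

sys.exit(proc.wait())
```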