STNS01 20 Release BEETaskManager
Develop BEETaskManager.
The BEETaskManager daemon runs on the HPC cluster login node. It accepts tasks from the BEEWorkflowManager, turns each task into an HPC resource manager job (e.g. a Slurm job script), and submits that job to the cluster resource manager. The BEETaskManager then tracks the status of the job (pending, running, complete) and reports it to the BEEWorkflowManager. The BEETaskManager also cancels a queued or running job when commanded to do so by the BEEWorkflowManager. The first release of the BEETaskManager will support the Slurm resource manager and the Charliecloud Linux container runtime.
This milestone will be complete when the BEETaskManager can successfully perform the following functions on a production Slurm HPC cluster at LANL.
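As an illustration of the status tracking described above, the following is a minimal sketch of how Slurm job states might be reduced to the three statuses named in this milestone. The function name `to_bee_status`, the mapping table, and the `"unknown"` fallback are hypothetical and not taken from the BEE code base; only the Slurm state names themselves are real.

```python
# Hypothetical mapping from Slurm job states to the statuses named above.
SLURM_TO_BEE = {
    "PENDING": "pending",
    "CONFIGURING": "pending",
    "RUNNING": "running",
    "COMPLETING": "running",
    "COMPLETED": "complete",
}


def to_bee_status(slurm_state: str) -> str:
    """Reduce a Slurm state string to pending, running, complete, or unknown."""
    tokens = slurm_state.strip().split()
    if not tokens:
        # Empty output, e.g. the job is no longer listed in the queue.
        return "unknown"
    # Accounting output can look like "CANCELLED by 1000"; keep the first word.
    return SLURM_TO_BEE.get(tokens[0].upper(), "unknown")
```

States outside the three named statuses (for example `FAILED`, `CANCELLED`, or `TIMEOUT`) fall through to `"unknown"` in this sketch; how a real implementation should report them is left open here.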
- Accept a task from the BEEWorkflowManager
- Format the accepted task as a Slurm job script
- Use the Charliecloud Linux container runtime to execute the task in the Slurm job
- Submit the Slurm job to the HPC cluster
- Report back to the BEEWorkflowManager the status of the submitted job
- Cancel a submitted but not yet completed job when commanded to do so by the BEEWorkflowManager
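The steps listed above can be sketched roughly as follows. This is an illustrative outline only, not the BEETaskManager implementation: the `Task` class, its fields, and all function names are assumptions made for the example. It builds the job script and the command lines as plain data, so a daemon could pass them to `subprocess.run` on a login node. The `sbatch --parsable`, `squeue -h -j ... -o %T`, `scancel`, and `ch-run <image> -- <command>` invocations are standard Slurm and Charliecloud usage.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """Hypothetical task record as it might arrive from the BEEWorkflowManager."""
    name: str
    command: List[str]
    image_dir: str  # assumed path to an unpacked Charliecloud image


def format_job_script(task: Task, nodes: int = 1, time_limit: str = "00:10:00") -> str:
    """Render the task as a Slurm batch script that runs it under ch-run."""
    cmd = " ".join(shlex.quote(arg) for arg in task.command)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={task.name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
        f"ch-run {shlex.quote(task.image_dir)} -- {cmd}",
    ]
    return "\n".join(lines) + "\n"


def submit_command(script_path: str) -> List[str]:
    # --parsable makes sbatch print "jobid" or "jobid;cluster" only.
    return ["sbatch", "--parsable", script_path]


def status_command(job_id: int) -> List[str]:
    # -h drops the header line; %T prints the job state in long form.
    return ["squeue", "-h", "-j", str(job_id), "-o", "%T"]


def cancel_command(job_id: int) -> List[str]:
    return ["scancel", str(job_id)]


def parse_job_id(sbatch_output: str) -> int:
    """Extract the numeric job ID from `sbatch --parsable` output."""
    return int(sbatch_output.strip().split(";")[0])
```

For example, `format_job_script(Task("hello", ["echo", "hello world"], "/images/alpine"))` yields a script whose last line is `ch-run /images/alpine -- echo 'hello world'`. Keeping command construction separate from execution makes this logic testable without access to a Slurm cluster.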
March 31, 2020