plugin: enable multiple plugins for the same hook
CRIU provides two plugins for checkpoint/restore of GPU applications:
amdgpu and cuda. Both plugins use the `RESUME_DEVICES_LATE` hook to
enable restore:

    CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, amdgpu_plugin_resume_devices_late)
    CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, cuda_plugin_resume_devices_late)

However, CRIU stops at the first plugin whose hook returns anything
other than `-ENOTSUP`, so two plugins cannot currently share this hook.
As a result, when both plugins are installed, the resume function for
CUDA applications is not executed. To fix this, make both
`*_plugin_resume_devices_late()` functions return `-ENOTSUP` when
restore is not supported, so that the next registered plugin gets a
chance to run.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
rst0git authored and avagin committed Aug 7, 2024
1 parent 7a27427 commit c4ba553
Showing 2 changed files with 3 additions and 2 deletions.
3 changes: 2 additions & 1 deletion plugins/amdgpu/amdgpu_plugin.c
@@ -1809,7 +1809,7 @@ int amdgpu_plugin_resume_devices_late(int target_pid)
 	fd = open(AMDGPU_KFD_DEVICE, O_RDWR | O_CLOEXEC);
 	if (fd < 0) {
 		pr_perror("failed to open kfd in plugin");
-		return -1;
+		return -ENOTSUP;
 	}

 	args.pid = target_pid;
@@ -1818,6 +1818,7 @@ int amdgpu_plugin_resume_devices_late(int target_pid)
 	if (kmtIoctl(fd, AMDKFD_IOC_CRIU_OP, &args) == -1) {
 		if (errno == ESRCH) {
 			pr_info("Pid %d has no kfd process info\n", target_pid);
+			exit_code = -ENOTSUP;
 		} else {
 			pr_perror("restore late ioctl failed");
 			exit_code = -1;
2 changes: 1 addition & 1 deletion plugins/cuda/cuda_plugin.c
@@ -408,7 +408,7 @@ int resume_device(int pid, int checkpointed)
 int cuda_plugin_resume_devices_late(int pid)
 {
 	if (plugin_disabled) {
-		return 0;
+		return -ENOTSUP;
 	}

 	return resume_device(pid, 1);
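
The reasoning in the commit message can be illustrated with a small, standalone sketch. It assumes the dispatch rule this change relies on: hooks are called in registration order, and the chain moves on to the next plugin only when a hook returns `-ENOTSUP`. All names below (`resume_hook_t`, `demo_run_hooks`, the `demo_*` hooks, and the two pids) are hypothetical stand-ins, not CRIU's actual API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook type; CRIU's real hook table is not reproduced here. */
typedef int (*resume_hook_t)(int pid);

/* Made-up pids standing in for "an AMD GPU process" and "a CUDA process". */
enum { DEMO_AMDGPU_PID = 100, DEMO_CUDA_PID = 200 };

/* Counts how often the CUDA stand-in was actually called. */
int demo_cuda_calls;

/* Pre-fix behaviour: a hard error for a process this plugin does not own. */
int demo_amdgpu_resume_old(int pid)
{
	return pid == DEMO_AMDGPU_PID ? 0 : -1;
}

/* Post-fix behaviour: -ENOTSUP means "not mine, ask the next plugin". */
int demo_amdgpu_resume_new(int pid)
{
	return pid == DEMO_AMDGPU_PID ? 0 : -ENOTSUP;
}

/* CUDA stand-in: handles only its own pid and declines everything else. */
int demo_cuda_resume(int pid)
{
	demo_cuda_calls++;
	return pid == DEMO_CUDA_PID ? 0 : -ENOTSUP;
}

/*
 * Assumed dispatch rule: call hooks in registration order and stop at the
 * first one that returns anything other than -ENOTSUP.
 */
int demo_run_hooks(const resume_hook_t *hooks, size_t n, int pid)
{
	int ret = -ENOTSUP;

	assert(hooks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		ret = hooks[i](pid);
		if (ret != -ENOTSUP)
			break;
	}
	return ret;
}
```

With the chain `{ demo_amdgpu_resume_old, demo_cuda_resume }` and `DEMO_CUDA_PID`, the first hook's `-1` ends the chain and the CUDA stand-in is never called. Swapping in `demo_amdgpu_resume_new` lets the call fall through to `demo_cuda_resume`, which mirrors the effect described in the commit message.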