Describe the problem
Currently the only option to inform Determined of the number of GPUs a node has is the maxSlotsPerPod field in the values.yaml file. However, some GPUs that NVIDIA offers can partition their memory (MIG) for smaller workloads that don't saturate the GPU's memory. In addition, Nebuly offers a way to dynamically partition GPUs. Therefore, in order to better utilize resources, a nice feature would be the ability to specify MIG resources in something like maxSlotsPerPod.
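For reference, the current mechanism only counts whole GPUs; a rough example of today's values.yaml (the number shown is arbitrary):

```yaml
# values.yaml today: slots are counted in whole GPUs, with no way to express MIG slices.
maxSlotsPerPod: 4   # e.g. a node exposing four full GPUs
```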
Describe the solution you'd like
Possible YAML to describe two 40 GB A100s while requesting eight MIG slices of 10 GB each, spread across the two GPUs:
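A rough sketch of what that could look like, assuming a hypothetical slotResources field next to maxSlotsPerPod (neither the field name nor its semantics exist in the chart today):

```yaml
# Hypothetical sketch only: slotResources is an invented field name used for illustration.
# Mirrors the scenario above: 8 slots, each backed by one 10 GB MIG slice on a 40 GB A100.
maxSlotsPerPod: 8
slotResources:
  nvidia.com/mig-2g.10gb: 1   # MIG resource name as exposed by the NVIDIA k8s device plugin (mixed strategy)
```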
Describe alternatives you've considered
No response
Additional context
No response
We are planning to address this by allowing arbitrary GPU resource requests to be specified per slot at the resource pool level (e.g. multiple MIG slices, nvidia.com/mig-1g.5gb: 3, instead of a fixed nvidia.com/gpu: 1).
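If that lands, a per-pool configuration might look something like the sketch below; the slot_resource_requests key and its placement under resourcePools are assumptions for illustration, not a shipped option:

```yaml
# Illustrative sketch only: slot_resource_requests is a hypothetical field; only the idea
# from the comment above (arbitrary per-slot GPU resources per resource pool) is real.
resourcePools:
  - pool_name: mig_small
    slot_resource_requests:
      nvidia.com/mig-1g.5gb: 3   # each slot asks for three 5 GB MIG slices instead of nvidia.com/gpu: 1
```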