💡[feat] NVIDIA Multi-Instance GPU (MIG) Support #6204

TrevorM15 · 2023-03-10T18:45:22Z

Describe the problem

Currently, the only way to tell Determined how many GPUs a node has is the maxSlotsPerPod field in the values.yaml file. However, some NVIDIA GPUs can be partitioned with Multi-Instance GPU (MIG) for smaller workloads that don't saturate the GPU memory. In addition, Nebuly offers a way to partition GPUs dynamically. To make better use of resources, a nice feature would be the ability to specify MIG resources in something like maxSlotsPerPod.

Describe the solution you'd like

Possible YAML: the first fragment (values.yaml) describes two 40GB A100s, and the second (experiment config) requests 8 MIG instances of 10GB each, spread across the two GPUs.

maxSlotsPerPod:
  value: 2
  mig:
    - 1g.5gb: 7
    - 2g.10gb: 4
    - 3g.20gb: 2

resources:
  slots_per_trial:
    2g.10gb: 8

Describe alternatives you've considered

No response

Additional context

No response


ioga · 2023-03-13T16:45:56Z

Thank you for your suggestion.

We are planning to address this by allowing arbitrary GPU resource requests to be specified per slot at the resource pool level (e.g. multiple MIG slices, nvidia.com/mig-1g.5gb: 3, instead of the fixed nvidia.com/gpu: 1).
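For context, here is a minimal sketch of what such a per-slot request looks like at the plain Kubernetes level, independent of Determined. It assumes the cluster runs the NVIDIA device plugin with the "mixed" MIG strategy, which exposes each MIG profile as its own extended resource (nvidia.com/mig-<profile>). The pod name, container name, and image are hypothetical placeholders, and this is not Determined's resource pool syntax, which had not been defined at the time of this comment.

```yaml
# Sketch only: a bare Kubernetes pod requesting MIG slices directly.
# Assumes the NVIDIA device plugin is deployed with the "mixed" MIG strategy.
apiVersion: v1
kind: Pod
metadata:
  name: mig-slot-example            # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: trial                   # hypothetical container name
      image: example.com/trial-image:latest   # placeholder image
      resources:
        limits:
          # Three 1g.5gb MIG slices instead of one whole GPU
          # (the whole-GPU equivalent would be nvidia.com/gpu: 1).
          nvidia.com/mig-1g.5gb: 3
```

Extended resources such as these are specified under limits; Kubernetes treats the request as equal to the limit when only the limit is given.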

Wildshire · 2023-12-04T11:53:00Z

Hello!

Do we have any news on this topic?

Best

ioga · 2023-12-04T19:19:22Z

No progress on this yet, sorry.

TrevorM15 added the feature (Feature requests) label on Mar 10, 2023
