💡[feat] NVIDIA Multi-Instance GPU (MIG) Support #6204

Open
TrevorM15 opened this issue Mar 10, 2023 · 3 comments
Labels
feature Feature requests

@TrevorM15

Describe the problem

Currently, the only way to tell Determined how many GPUs a node has is the maxSlotsPerPod field in the values.yaml file. However, some NVIDIA GPUs support Multi-Instance GPU (MIG), which partitions a GPU into smaller instances for workloads that do not saturate its memory. In addition, Nebuly offers a way to partition GPUs dynamically. To make better use of resources, it would be useful to be able to specify MIG resources in something like maxSlotsPerPod.

Describe the solution you'd like

A possible YAML configuration describing two 40GB A100s, with a trial requesting eight 10GB MIG instances spread across the two GPUs:

# values.yaml (proposed syntax)
maxSlotsPerPod:
  value: 2
  mig:
    - 1g.5gb: 7
    - 2g.10gb: 4
    - 3g.20gb: 2
# experiment config (proposed syntax)
resources:
  slots_per_trial:
    2g.10gb: 8

Describe alternatives you've considered

No response

Additional context

No response

TrevorM15 added the feature label on Mar 10, 2023
@ioga
Contributor

ioga commented Mar 13, 2023

Thank you for your suggestion.

We are planning to address this by allowing arbitrary GPU resource requests to be specified per slot at the resource pool level (e.g. multiple MIG slices, nvidia.com/mig-1g.5gb: 3, instead of the fixed nvidia.com/gpu: 1).
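
For illustration, this is roughly what such a request looks like at the plain Kubernetes level, outside of Determined. It is a minimal sketch, assuming the NVIDIA device plugin runs with the mixed MIG strategy so that slices are advertised as nvidia.com/mig-<profile> resources. The pod name, container name, and image are placeholders, and nothing here is an existing Determined setting.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mig-example                      # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: trainer                      # placeholder name
      image: example.com/trainer:latest  # placeholder image
      resources:
        limits:
          # Three 1g.5gb MIG slices instead of one whole GPU
          # (which would be requested as nvidia.com/gpu: 1).
          nvidia.com/mig-1g.5gb: 3
```

A per-slot setting at the resource pool level would presumably end up producing a limits entry like this one for each slot a task requests.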

@Wildshire

Hello!

Do we have any news on this topic?

Best

@ioga
Contributor

ioga commented Dec 4, 2023

No progress on this yet, sorry.
