
Resource Reclaim Between Different Queues #3842

Open
barrycheng05 opened this issue Nov 26, 2024 · 4 comments
Labels
kind/question Categorizes issue related to a new question

Comments

@barrycheng05

Please describe your problem in detail

I am trying to test the effect of a queue's deserved field together with the reclaim action, but job-b remains in the Pending state.

The queue and job YAML configurations were modified based on this [Issue](#3729).

Here is part of the volcano-scheduler log. Could you please help me understand why the reclaim process is not triggered?

I1126 09:42:58.038748       1 reclaim.go:40] Enter Reclaim ...
I1126 09:42:58.038752       1 reclaim.go:49] There are <2> Jobs and <3> Queues in total for scheduling.
I1126 09:42:58.038757       1 reclaim.go:67] Added Queue <first> for Job <default/job-a-c035b82a-7643-4d96-ad70-476de2489dd6>
I1126 09:42:58.038763       1 capacity.go:255] Queue <first> can not reclaim, deserved <cpu 20000.00, memory 262144000.00>, allocated <cpu 40000.00, memory 262144000.00, pods 5.00>, share <2>
I1126 09:42:58.038775       1 reclaim.go:99] Queue <first> can not reclaim by preempt others, ignore it.
I1126 09:42:58.038783       1 reclaim.go:220] Leaving Reclaim ...
I1126 09:42:58.038790       1 backfill.go:44] Enter Backfill ...
I1126 09:42:58.038796       1 backfill.go:110] Leaving Backfill ...
I1126 09:42:58.038927       1 session.go:214] Queue <first> allocated resource keeps equal, no need to update queue status <map[cpu:{{40000 -3} {<nil>}  DecimalSI} memory:{{262144000 0} {<nil>}  BinarySI} pods:{{5 0} {<nil>}  DecimalSI}]>.
I1126 09:42:58.038966       1 session.go:214] Queue <second> allocated resource keeps equal, no need to update queue status <map[cpu:{{0 -3} {<nil>}  DecimalSI} memory:{{0 0} {<nil>}  BinarySI}]>.
I1126 09:42:58.038975       1 session.go:214] Queue <default> allocated resource keeps equal, no need to update queue status <map[cpu:{{0 -3} {<nil>}  DecimalSI} memory:{{0 0} {<nil>}  BinarySI}]>.
I1126 09:42:58.038982       1 session.go:244] Close Session fc38d2d3-695d-4f12-ad1b-bd1c3b50e5c6
I1126 09:42:58.038992       1 scheduler.go:129] End scheduling ...

Below are the related YAML configurations. If additional information is required, I can provide it.

Thank you.


scheduler-config.yaml

apiVersion: v1
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, reclaim, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
        enablePreemptable: false
      - name: conformance
    - plugins:
      - name: overcommit
      - name: drf
        enablePreemptable: false
      - name: predicates
      - name: capacity
      - name: nodeorder
      - name: binpack
kind: ConfigMap
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system

queue.yaml

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: first
spec:
  reclaimable: true
  deserved:
    cpu: 20
    memory: 2Gi
  capability:
    cpu: 40
    memory: 2Gi
---
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: second
spec:
  reclaimable: true
  deserved:
    cpu: 20
    memory: 2Gi
  capability:
    cpu: 40
    memory: 2Gi

job-a.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job-a
spec:
  schedulerName: volcano
  queue: first
  minAvailable: 2
  tasks:
    - replicas: 5
      name: "master"
      template:
        metadata:
          annotations:
            volcano.sh/preemptable: "true"
        spec:
          containers:
            - image: nginx:1.14.2
              name: nginx
              resources:
                requests:
                  cpu: "8"
                  memory: "50Mi"
          restartPolicy: OnFailure

job-b.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job-b
spec:
  schedulerName: volcano
  queue: second
  minAvailable: 2
  tasks:
    - replicas: 5
      name: "worker"
      template:
        metadata:
          annotations:
            volcano.sh/preemptable: "true"
        spec:
          containers:
            - image: nginx:1.14.2
              name: nginx
              resources:
                requests:
                  cpu: "8"
                  memory: "50Mi"
          restartPolicy: OnFailure

Current Status

The cluster has approximately 48 cores, and other running Pods are using around 5 cores. Each task requests 8 CPU per replica, so job-a alone consumes 40 cores once its 5 pods are running.

$ kubectl apply -f queue.yaml
queue.scheduling.volcano.sh/first created
queue.scheduling.volcano.sh/second created

$ kubectl apply -f job-a.yaml
job.batch.volcano.sh/job-a created

$ kubectl apply -f job-b.yaml
job.batch.volcano.sh/job-b created

$ kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
job-a-master-0   1/1     Running   0          25s
job-a-master-1   1/1     Running   0          25s
job-a-master-2   1/1     Running   0          25s
job-a-master-3   1/1     Running   0          25s
job-a-master-4   1/1     Running   0          25s

$ kubectl get vcjob
NAME    STATUS    MINAVAILABLE   RUNNINGS   AGE
job-a   Running   2              5          42s
job-b   Pending   2                         38s

Any other relevant information

No response

@barrycheng05 added the kind/question label Nov 26, 2024
@PigNatovsky
Contributor

PigNatovsky commented Nov 26, 2024

I think there is something more going on - it looks like job-b has other issues.
Look at the share of the first queue - it's 2, which means it consumes twice as much as it deserves (40 CPU allocated vs. 20 CPU deserved).
Have you checked the events?
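For example (the exact commands are just a suggestion; the events on the job usually say why it is stuck):

$ kubectl describe vcjob job-b
$ kubectl get events -n default --sort-by=.lastTimestamp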

@PigNatovsky
Contributor

I mean that the job is in Pending status - there is no pod scheduled and waiting for resources. I think it's more related to the controller, not to the scheduler.
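If it helps, you could also look at the PodGroup that the controller created for job-b - its phase and conditions should show whether it ever gets past Pending. The object name below is just a placeholder; the real one follows the <job-name>-<uid> pattern visible in the scheduler log.

$ kubectl get podgroups.scheduling.volcano.sh -n default
$ kubectl describe podgroups.scheduling.volcano.sh -n default <job-b-podgroup-name>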

@barrycheng05
Author

Do you mean the events of job-b?

$ kubectl describe vcjob job-b
......
Status:
  Conditions:
    Last Transition Time:  2024-11-26T09:39:23Z
    Status:                Pending
  Min Available:           2
  State:
    Last Transition Time:  2024-11-26T09:39:23Z
    Phase:                 Pending
Events:
  Type     Reason           Age   From                   Message
  ----     ------           ----  ----                   -------
  Warning  PodGroupPending  15h   vc-controller-manager  PodGroup default:job-b unschedule,reason: 2/0 tasks in gang unschedulable: pod group is not ready, 2 minAvailable

When I delete job-a, job-b can be scheduled successfully.

$ kubectl get vcjob
NAME    STATUS    MINAVAILABLE   RUNNINGS   AGE
job-a   Running   2              5          15h
job-b   Pending   2                         15h

$ kubectl delete vcjob job-a
job.batch.volcano.sh "job-a" deleted

$ kubectl get vcjob
NAME    STATUS    MINAVAILABLE   RUNNINGS   AGE
job-b   Pending   2                         15h

$ kubectl get vcjob
NAME    STATUS    MINAVAILABLE   RUNNINGS   AGE
job-b   Running   2              5          15h

By the way, I'm using Volcano Scheduler version 1.9.0.
Thanks for your reply.

@barrycheng05
Author

I noticed this log later, and it seems the overcommit plugin is rejecting job-b at the enqueue stage. My understanding is that because the job never gets past the Pending state, it is not considered as a candidate to trigger preemption/reclaim. After I removed the overcommit plugin, job-b was allocated normally.

I1127 08:26:56.828424       1 enqueue.go:45] Enter Enqueue ...
I1127 08:26:56.828429       1 enqueue.go:63] Added Queue <second> for Job <default/job-b-7a171232-8367-4d99-b301-233e98264f25>
I1127 08:26:56.828438       1 enqueue.go:74] Added Job <default/job-b-7a171232-8367-4d99-b301-233e98264f25> into Queue <second>
I1127 08:26:56.828442       1 enqueue.go:63] Added Queue <first> for Job <default/job-a-b60497e4-2892-4687-929d-5284e94a8871>
I1127 08:26:56.828449       1 enqueue.go:79] Try to enqueue PodGroup to 1 Queues
I1127 08:26:56.828459       1 overcommit.go:128] Resource in cluster is overused, reject job <default/job-b-7a171232-8367-4d99-b301-233e98264f25> to be inqueue
I1127 08:26:56.828483       1 enqueue.go:104] Leaving Enqueue ...
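For reference, this is roughly how I removed the plugin - deleting the "- name: overcommit" line from the second tier of the ConfigMap shown above and then restarting the scheduler (assuming the default deployment name volcano-scheduler):

$ kubectl -n volcano-system edit configmap volcano-scheduler-configmap
$ kubectl -n volcano-system rollout restart deployment volcano-scheduler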
