This repository has been archived by the owner on Dec 13, 2023. It is now read-only.
Had a native thread creation error while using Redis Cluster backend with about 4,000 tasks and 10,000 threads per pod #3761
Unanswered
paxsonyang asked this question in Q&A
Replies: 0 comments
Hi all,
We're using Conductor (3.9.2, with the redis_cluster backend) as a central workflow engine supporting about 4,000 tasks for company-wide IT automation. Recently, we hit an OOM (Out of Memory) error caused by a native thread creation failure (java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached).
After troubleshooting, I found that we are hitting an internal constraint of our K8S setup, which allows only 10,000 threads per pod. Reading the source code of RedisDynoQueue, QueueMonitor, and StatsMonitor, I found that each task queue uses one or more threads, so it is easy to hit the 10,000-thread limit because every pod loads all 4,000 tasks at once.
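To make the arithmetic concrete, here is a minimal sketch of the thread budget, plus a helper that groups the live threads of a JVM by name prefix to see which component owns most of them. The threads-per-queue values and the baseline of 200 other JVM threads are assumptions for illustration, not numbers measured from Conductor; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadBudget {

    // Rough estimate: every queue owns a fixed number of threads,
    // plus a baseline for everything else running in the JVM.
    static long estimateThreads(int queues, int threadsPerQueue, int baselineThreads) {
        return (long) queues * threadsPerQueue + baselineThreads;
    }

    // Counts the live threads of this JVM, grouped by thread name with any
    // trailing digits, dashes, underscores, or '#' stripped
    // (for example "pool-1-thread-7" is counted under "pool-1-thread").
    static Map<String, Integer> liveThreadsByPrefix() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String prefix = t.getName().replaceAll("[-_#\\d]+$", "");
            counts.merge(prefix, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int podLimit = 10_000;   // per-pod limit from the post
        int queues = 4_000;      // one queue per task, as described in the post
        int baseline = 200;      // assumed number of non-queue threads

        for (int perQueue = 1; perQueue <= 3; perQueue++) {
            long estimate = estimateThreads(queues, perQueue, baseline);
            System.out.printf("%d thread(s) per queue: about %d threads, over limit: %b%n",
                    perQueue, estimate, estimate > podLimit);
        }

        // Run inside the affected JVM to see which thread families dominate.
        liveThreadsByPrefix().forEach((prefix, count) ->
                System.out.println(prefix + " -> " + count));
    }
}
```

Under these assumptions, three threads per queue is already enough to exceed the limit, and the grouped counts show whether the queue-related threads really are the ones piling up.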
I don't know whether 10,000 threads per pod is a reasonable limit, but our K8S team refuses to change it for reasons of central management. So what we can do is move from K8S to a VM-based containerized environment without this 10,000-thread limit. However, given the implementation of QueueMonitor, I don't think the use of QueueMonitor (servo) is 100% scalable, even though it uses a thread pool with pool size = 1.
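For comparison, this is a minimal sketch of the alternative I have in mind: many periodic monitor tasks sharing one small scheduler instead of each queue owning its own pool. It uses only java.util.concurrent and is not Conductor or dyno-queues code; the class name, the 100 ms period, and the pool size of 2 are assumptions for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedMonitorSketch {

    // Schedules one periodic "monitor" task per queue on a single shared pool,
    // lets it run for runMillis, and returns how many monitor ticks executed.
    static int runSharedMonitors(int queues, int poolSize, long runMillis) {
        ScheduledExecutorService shared = Executors.newScheduledThreadPool(poolSize);
        AtomicInteger ticks = new AtomicInteger();
        try {
            for (int i = 0; i < queues; i++) {
                // Stand-in for a per-queue size or stats check.
                shared.scheduleAtFixedRate(ticks::incrementAndGet, 0, 100, TimeUnit.MILLISECONDS);
            }
            Thread.sleep(runMillis);
            shared.shutdownNow();
            shared.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            shared.shutdownNow();
        }
        return ticks.get();
    }

    public static void main(String[] args) {
        // 4,000 queues monitored by 2 threads instead of 4,000 single-thread pools.
        int ticks = runSharedMonitors(4_000, 2, 500);
        System.out.println("monitor ticks executed: " + ticks);
    }
}
```

With this shape the thread count depends on the pool size rather than on the number of queues. The trade-off is that a slow monitor task can delay the others, so the pool size and period would need tuning.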
One of the other teams uses Conductor 3.4 with mysql-persistence and does not run into this thread limit, because, as far as I can see, mysql-persistence has no QueueMonitor. It also no longer exists in versions after Conductor 3.5, so I would have to bring it into my version (3.9.2) to use it.
Do you have any suggestions for this kind of issue? Thank you very much.
Paxson