Pollers can not poll any tasks from Cadence server even if there're tasks available #6320

Open
tuannh982 opened this issue Oct 1, 2024 · 3 comments


@tuannh982

Version of Cadence server, and client(which language)
This is very important to root cause bugs.

  • Server version: v1.2.13
  • Client version: 2.7.8
  • Client language: Java

Describe the bug
Sometimes the pollers (both activity task pollers and decision task pollers) cannot poll any tasks from the Cadence server, even though the workflow worker instances and the Cadence servers are both still running normally.

After restarting the worker instances, everything goes back to normal.

To Reproduce
Is the issue reproducible?

  • No

Steps to reproduce the behavior:
A clear and concise description of the reproduce steps.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context

  • Metrics show that the task pollers are still working but receive no tasks from the Cadence server.
  • Calling http://${CADENCE_WEB_HOST}/api/domains/${DOMAIN}/task-lists/${TASK_QUEUE}/pollers to get the poller list returns an empty list, whereas normally the worker name appears in this list.
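
The poller-list check in the last bullet could be scripted roughly as follows. This is a minimal sketch, not something taken from the Cadence tooling: the class name `PollerListCheck` and its helper methods are hypothetical, the URL path is the one quoted above, and the response body is printed as-is because its schema is not described in this thread.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper, not part of the Cadence client or cadence-web.
public class PollerListCheck {

    // Builds the pollers URL quoted in the issue. Path segments are
    // percent-encoded so unusual task list names do not break the path.
    public static String pollersUrl(String webHost, String domain, String taskList) {
        return "http://" + webHost
                + "/api/domains/" + encode(domain)
                + "/task-lists/" + encode(taskList)
                + "/pollers";
    }

    // URLEncoder produces form encoding, so spaces come out as '+';
    // map them to %20, which is the form expected inside a URL path.
    public static String encode(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Fetches the raw response body; no assumption is made about its schema.
    public static String fetchPollers(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String host = System.getenv("CADENCE_WEB_HOST");
        String domain = System.getenv("DOMAIN");
        String taskList = System.getenv("TASK_QUEUE");
        if (host == null || domain == null || taskList == null) {
            System.out.println("Set CADENCE_WEB_HOST, DOMAIN and TASK_QUEUE to run the check.");
            return;
        }
        // An empty poller list here matches the symptom described above.
        System.out.println(fetchPollers(pollersUrl(host, domain, taskList)));
    }
}
```

Running this periodically next to the worker metrics would show whether the poller list empties out at the same moment tasks stop arriving.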
@natemort
Member

natemort commented Oct 3, 2024

I'm not aware of any recent changes to the server that could cause this, nor have we seen this issue. The version of the Java client that you're using is quite old (2.7.8, released 2020-10-02). Could you check whether the issue still occurs with a newer version?

Additionally, could you provide more information on your configuration? Do you have partitioning or any custom configuration enabled for the task list that the workers are polling? On the client side, what do you have configured for your WorkerOptions and WorkerFactoryOptions?

@ibarrajo ibarrajo added the bug label Oct 31, 2024
@ibarrajo
Contributor

@tuannh982 If you could share the WorkerOptions and in particular PollerOptions it would be great to help debug what is happening.

I suspect that if you set pollOnlyIfExecutorHasCapacity to true, the worker would stop polling under high load. Another possibility is a server configuration issue that closes the open port used for long polling.
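
To illustrate the first hypothesis, here is a toy model of a capacity-gated poll loop. It is only a sketch of the general idea and not the Cadence client's implementation: the class `CapacityGatedPoller` and its methods are hypothetical, and only the option name `pollOnlyIfExecutorHasCapacity` comes from the comment above.

```java
import java.util.concurrent.Semaphore;

// Toy model (hypothetical): a poller that skips polling while every
// executor slot is occupied. Not the Cadence client's actual code.
public class CapacityGatedPoller {
    private final Semaphore slots;
    private final boolean pollOnlyIfExecutorHasCapacity;
    private int pollsIssued = 0;
    private int pollsSkipped = 0;

    public CapacityGatedPoller(int executorSlots, boolean pollOnlyIfExecutorHasCapacity) {
        this.slots = new Semaphore(executorSlots);
        this.pollOnlyIfExecutorHasCapacity = pollOnlyIfExecutorHasCapacity;
    }

    // One iteration of the poll loop. Returns true if a poll request
    // would have been sent to the server, false if it was skipped.
    public boolean pollOnce() {
        if (pollOnlyIfExecutorHasCapacity && slots.availablePermits() == 0) {
            pollsSkipped++;
            return false;
        }
        pollsIssued++;
        return true;
    }

    // Occupies one executor slot; returns false if none is free.
    public boolean startTask() {
        return slots.tryAcquire();
    }

    // Frees one executor slot after a task completes.
    public void finishTask() {
        slots.release();
    }

    public int getPollsIssued() { return pollsIssued; }
    public int getPollsSkipped() { return pollsSkipped; }
}
```

In this model, if tasks never finish (for example, because they are stuck), `pollOnce()` keeps returning false and no poll request is sent at all. A skipped poll would therefore not show up as an empty poll response, which is one way to tell this scenario apart from a server that answers polls with no task.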

@tuannh982
Author

tuannh982 commented Nov 1, 2024

@ibarrajo here's our WorkerOptions settings:

  • maxConcurrentWorkflowExecutionSize: 40
  • maxConcurrentActivityExecutionSize: 40
  • maxConcurrentWorkflowTaskPollers: 2
  • maxConcurrentActivityTaskPollers: 2

I don't think the problem is that the worker stops polling: the activity-poll-no-task metric is still being emitted, which indicates that the poller did send requests to the Cadence server but received no tasks.
