
Limiting incoming requests queue #212

Open
kinnalru opened this issue Oct 12, 2023 · 5 comments

Comments

@kinnalru

Is it possible to limit incoming request queue by queue length or timeout?
Something like max_request_queue_size and max_request_queue_time from Phusion Passenger?
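Falcon has no built-in equivalent today, but a rough sketch of what such a limit could look like at the Rack layer (the `RequestLimiter` name and `limit:` option are hypothetical, not Falcon API) is a middleware that sheds load once too many requests are in flight, similar in spirit to Passenger's `max_request_queue_size`:

```ruby
# Hypothetical sketch: a Rack middleware that rejects requests with 503
# once more than `limit` are in flight, instead of queueing indefinitely.
# Names are illustrative; this is not part of Falcon's API.
class RequestLimiter
  def initialize(app, limit: 64)
    @app = app
    @limit = limit
    @mutex = Mutex.new
    @in_flight = 0
  end

  def call(env)
    admitted = @mutex.synchronize do
      if @in_flight < @limit
        @in_flight += 1
        true
      else
        false
      end
    end

    # Shed load immediately when over the cap.
    return [503, {"content-type" => "text/plain"}, ["Server busy\n"]] unless admitted

    begin
      @app.call(env)
    ensure
      @mutex.synchronize { @in_flight -= 1 }
    end
  end
end
```

Usage in `config.ru` would be the usual `use RequestLimiter, limit: 64`. A timeout-based variant (like `max_request_queue_time`) would additionally need to park waiters and expire them, which is where server support would help.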

@ioquatix
Member

At this time, there is no such feature, and I agree it's a deficiency. I am planning to address this (configurable load balancing, rate limiting, etc.), but it may be a paid feature. How do you feel about that?

@kinnalru
Author

@ioquatix

It would be great if Falcon started bringing you money, but maintaining both versions (enterprise and free) is difficult.

@slewsys

slewsys commented Jan 7, 2024

> At this time, there is no such feature and I agree it's a deficiency. I am planning to address this (configurable load balancing, rate limiting, etc), but it may be a paid feature. How do you feel about that?

Nginx offers rate limiting and load balancing. You might have more success offering a hosting service, rather than charging for tooling.

@platbr

platbr commented Oct 9, 2024

> Is it possible to limit incoming request queue by queue length or timeout? Something like max_request_queue_size and max_request_queue_time from Phusion Passenger?

HAProxy offers a queue feature via maxconn. NGINX open source doesn't have a queue feature; only the Plus version does.
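For reference, a minimal HAProxy sketch of that maxconn-based queue (backend and server names are placeholders): each server accepts at most `maxconn` concurrent connections, excess requests wait in HAProxy's queue, and `timeout queue` bounds how long they wait before a 503 is returned.

```
backend falcon_app
    timeout queue 5s                         # give up on queued requests after 5s
    server app1 127.0.0.1:9292 maxconn 32    # requests beyond 32 in flight are queued
```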

@platbr

platbr commented Oct 11, 2024

Thoughts on Queueing Requests/Scaling

We moved from Puma to Falcon last week because it is much easier to handle scaling without needing to worry about the number of necessary threads or how CPU resources are allocated. We have five applications working together to provide all the functionality of our website (we sell tickets).

Before Falcon, we scaled based on Puma's backlog and CPU usage, but now we scale solely based on CPU. This change allows us to ensure that CPU resources are fully utilized. However, we had to drastically increase the amount of memory, which I assume is because Falcon handles more requests simultaneously.

Since Falcon doesn't have a request queue, I’ve been considering placing it behind HAProxy. However, this might reintroduce the same issue we faced with Puma, where we couldn’t fully utilize available CPU. On the other hand, it could help limit the number of simultaneous requests.

I believe a good approach for queueing requests could be to base it on available CPU. Is it possible to control the queue this way? Can available CPU be monitored effectively in a shared cloud environment (Kubernetes, inside a container, etc.)?
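On the "available CPU inside a container" question: one way to estimate it is to read the cgroup v2 quota. This is only a sketch under stated assumptions (the path `/sys/fs/cgroup/cpu.max` is the cgroup v2 layout; cgroup v1 uses different files, and the helper names here are made up):

```ruby
require "etc"

# /sys/fs/cgroup/cpu.max contains "<quota> <period>" in microseconds,
# or "max <period>" when the container has no CPU quota.
def parse_cpu_max(text)
  quota, period = text.split
  return nil if quota == "max"    # unlimited: caller should fall back
  quota.to_f / period.to_f        # effective core count, e.g. 200000/100000 = 2.0
end

# Estimate cores available to this process: cgroup quota if present,
# otherwise the host's processor count.
def available_cpus(path: "/sys/fs/cgroup/cpu.max")
  if File.readable?(path)
    parse_cpu_max(File.read(path)) || Etc.nprocessors
  else
    Etc.nprocessors
  end
end
```

A queue limit could then be derived from that number (e.g. some multiple of effective cores), though utilization-based feedback would need sampling over time rather than a one-off read.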

@ioquatix , Do you think this would be a good approach?
