Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throttle get word graph #489

Open
johnml1135 opened this issue Sep 12, 2024 · 10 comments
Open

Throttle get word graph #489

johnml1135 opened this issue Sep 12, 2024 · 10 comments
Assignees
Labels
bug Something isn't working

Comments

@johnml1135
Copy link
Collaborator

So, if a person is just moving their mouse around, they may have a few word graph requests per second - leading to a barrage of requests that are each being handled simultaneously. Can we do anything else?
image

If too many come, it will get backed up and then lead to Timeouts:
image

Can we make a queue of one where one one word graph or translation is handled at a time and if the waiting queue has two or more items, only the last item is kept and the others are discarded (before they are processed by thot)?

@johnml1135 johnml1135 added the bug Something isn't working label Sep 12, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in Serval Sep 12, 2024
@ddaspit
Copy link
Contributor

ddaspit commented Sep 12, 2024

Serval can't really do this, because it doesn't know which user is making the request. Scripture Forge would need to do this kind of throttling. Serval could put a limit on the number of inferencing requests that it will handle within a certain amount of time, but Scripture Forge would still need to deal with it.

@johnml1135
Copy link
Collaborator Author

I see - SF could implement a similar strategy, if an existing request is already out to Serval, it could wait until the request is back until sending the next one, or some other means.

@Nateowami, would this be a possibility?

@Nateowami
Copy link
Collaborator

@johnml1135 Do you allow canceling requests?

@johnml1135
Copy link
Collaborator Author

I don't believe we can do that right now for two reasons - we do not supply an end point for cancelling and the C++ code cannot be cancelled itself. It is not a huge need, only that if the user moves the cursor around a lot, suggestions may get very laggy.

@Enkidu93
Copy link
Collaborator

What is actionable here, @johnml1135? Should this just be closed as not planned?

@ddaspit
Copy link
Contributor

ddaspit commented Oct 10, 2024

The Serval client and server support canceling a request using cancellation tokens. Unfortunately, we don't have any way to cancel the request once the call has entered the Thot library, since there is no way to pass the cancellation token into native code. The only thing that is actionable for this issue would be to implement cancellation support in Thot.

@Enkidu93
Copy link
Collaborator

The Serval client and server support canceling a request using cancellation tokens. Unfortunately, we don't have any way to cancel the request once the call has entered the Thot library, since there is no way to pass the cancellation token into native code. The only thing that is actionable for this issue would be to implement cancellation support in Thot.

OK, I see. What priority does this have? I remember that we discussed strategies, but I don't recall what the conclusion was. Is this something that I should take a stab at, @ddaspit, or something it would be better for you to do yourself then?

@johnml1135
Copy link
Collaborator Author

There is one other potential - if a person is pressing up/down on their cursor, they can spin off 5-10 get-word-graphs in a second or two. Assuming that we just don't want to get too hung up, why not queue up the requests? One potential would be to update the RWLock to include a "sequential read" that is really just a write lock underneath. Therefore, the cancellation would work on all requests except the one being done in thot (because they wouldn't get to thot yet).

@ddaspit
Copy link
Contributor

ddaspit commented Oct 14, 2024

@Enkidu93 You will need to work in Thot to do add Eflomal, so this might be a good opportunity to start getting familiar with the codebase.

@ddaspit
Copy link
Contributor

ddaspit commented Oct 14, 2024

@johnml1135 If there are multiple users getting word graphs from the same engine, we don't want to queue up their requests. We would only
want to queue up requests from a single user, but Serval does not know anything about users, so it would be up to SF to perform the queueing.

In any case, SF should cancel the previous request before starting a new one. I believe that was how I implemented it originally, but the code might have changed since then. If we add cancellation support to Thot, it might fix the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: 🆕 New
Development

No branches or pull requests

4 participants