Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable lsp shutdown timeout #5266

Closed
wants to merge 1 commit into from

Conversation

jackos
Copy link

@jackos jackos commented Dec 23, 2022

What

Makes the lsp shutdown timeout configurable but retains the default of 3s,

Why

I use a proxy server: https://github.com/pr2502/ra-multiplex so I can have multiple instances of helix open via a multiplexer using the same instance of rust-analyzer, every time I want to close a window gracefully it takes 3s and is a little painful, this allows me to set it to 0 via the option:

[editor.lsp]
shutdown-timeout = 0

@kirawi kirawi added A-helix-term Area: Helix term improvements S-waiting-on-review Status: Awaiting review from a maintainer. labels Dec 24, 2022
@the-mikedavis
Copy link
Member

This is an interesting use-case. How does ra-multiplex respond to shutdown requests and exit notifications? It ignores them?

@jackos
Copy link
Author

jackos commented Dec 25, 2022

It just drops the connection and doesn't actually send the shutdown response, just saw that this PR fixes it: pr2502/ra-multiplex@3be7a2f

@jackos
Copy link
Author

jackos commented Dec 25, 2022

But even if that's fixed, I don't want any lag at all when I'm closing windows, for example if I closed down ra-multiplex beforehand or it has an error. I'm not sure why Helix needs to wait for shutdown confirmation from the lsp before closing, but I assumed there is a reason hence the configurable option.

@the-mikedavis
Copy link
Member

The LSP spec is not very explicit about this but I believe that waiting for the server is the proper thing to do during shutdown. If you kill the server process without warning it could hypothetically cause some bad effects. For example a server could have some database connection it needs to close during shutdown and killing the server could hypothetically cause some data loss or corruption.

On the other hand, it would be useful to be able to configure this as an escape hatch to deal with servers that ignore shutdown. Ra-multiplex for one until that PR is merged but also taplo (TOML lsp) has a bug that it ignores shutdowns and this would help with that: #5174

So I'm not sure. What do you think @archseer?

Copy link
Member

@dead10ck dead10ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine to me. I think there has been previous discussion about making this configurable. There are people that are really adamant about quick shutdown times, risks or not, so I don't see a problem with letting them make that choice.

@crabdancing

This comment was marked as spam.

@crabdancing

This comment was marked as spam.

@archseer
Copy link
Member

What's the behavior in VSCode/elsewhere?

I'm a bit wary about adding a config option that works around a couple very specific LSPs and allows everyone else to shoot themselves in the foot. It's good practice to let the LSP clean up any temporary state (e.g. I'm sure rust-analyzer uses target/ to some extent), and if the server is correctly implemented the shutdown should still be instant (this is a local connection and the lsp responding to shutdown then exiting shouldn't take more than a millisecond if it doesn't have anything else to do)

Both taplo and ra-multiplex have been fixed by now so I'd really need to see examples of more language servers that require this.

@crabdancing
Copy link

@archseer
I think there are some, but I can't recall off the top of my head. Regardless, it doesn't seem unreasonable to have a configuration option that has a lot of warnings around it, for working around bad behavior from specific LSPs. Even if it turned out there were no LSPs with this issue -- there will be again.

I think in this case the potential for footguns is meager enough to not be a concern. As long as people know that overriding the LSP timeout to be too low, or zero, can cause bad behavior, or even corruption if the LSP is caching things to disk or rewriting its own config files. (Should be noted, I haven't seen any such LSPs thus far, so this is most likely a purely speculative future compliance concern at the moment.)

It should be noted that this feature isn't just useful for working around bad behavior from LSPs -- if the user is doing something niche, like having a LSP running off of the machine, for some reason, rather than connecting over loopback (e.g., on-board embedded development), network congestion or lag issues may prompt them to do this sort of thing to improve usability. Or maybe something else is slowing down the LSP -- who knows?

Configurable timeouts are a commonplace feature whenever multiple pieces of software need to interact -- and it seems like a good idea to add that here too.

Copy link
Member

@the-mikedavis the-mikedavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @archseer here that we don't want to introduce this option unless it has specific use-cases and I don't see one other than a language server breaking spec or having a bug that prevents fast-enough shutdown. IME whenever we introduce config options for timeout values like this they tend to be tuned arbitrarily and irresponsibly and it just makes debugging upstream issues harder, so I'm not enthusiastic about adding a config option for this.

But if we do allow configuring this, I want to see it be per-language-server instead of global and it should be documented with a warning that you almost certainly don't want to use it as it might cause bad side-effects for your language server.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change could probably be reset - we don't see this file checked in very often

serialize_with = "serialize_duration_millis",
deserialize_with = "deserialize_duration_millis"
)]
pub shutdown_timeout: Duration,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a config option per-language-server (here: https://docs.helix-editor.com/languages.html#language-server-configuration) rather than editor-wide

@crabdancing
Copy link

I agree completely with making this configuration per-LSP.

@archseer
Copy link
Member

if the user is doing something niche, like having a LSP running off of the machine, for some reason, rather than connecting over loopback (e.g., on-board embedded development), network congestion or lag issues may prompt them to do this sort of thing to improve usability. Or maybe something else is slowing down the LSP -- who knows?

Such a case isn't actually possible since the language server inspects the code and usually needs direct access to the file tree. The server also runs over a file descriptor rather than the network.

(Should be noted, I haven't seen any such LSPs thus far, so this is most likely a purely speculative future compliance concern at the moment.)

Right, but I also haven't seen any LSPs that would need this workaround (apart from two that are fixed for 6-12 months now) so merging this workaround based on speculation that there could be a usecase eventually doesn't seem right either.
I'm willing to merge this if a practical need emerges, but it seems redundant to me right now. There's all sorts of timeouts that could actually use a config value (e.g. timeouts that occur when you actually use the LSP for completion etc.) and we haven't even seen demand for those.

IME whenever we introduce config options for timeout values like this they tend to be tuned arbitrarily and irresponsibly and it just makes debugging upstream issues harder, so I'm not enthusiastic about adding a config option for this.

Exactly, now any time there's an issue we have to ask if they set the timeout to 0 by any chance. We already had enough random issues pop up from users setting the idle timeout to 0, even though the docs say not to do that in bold.

@crabdancing
Copy link

Such a case isn't actually possible since the language server inspects the code and usually needs direct access to the file tree. The server also runs over a file descriptor rather than the network.

Actually, it's totally possible! I've done setups similar to this seamlessly combining remote and local development with SSHFS.

@crabdancing
Copy link

crabdancing commented Nov 28, 2023

Exactly, now any time there's an issue we have to ask if they set the timeout to 0 by any chance. We already had enough random issues pop up from users setting the idle timeout to 0, even though the docs say not to do that in bold.

Honestly, IMO that's what bots & issue templating were made for. Denying the user choice because of what stupidity they might get up to kind of rubs me the wrong way.

In my opinion, either 1) the issue template should request that the user post all their config files, or 2) the config file contents should be printed in the debug log. Finding robust ways of dealing with incomplete debug reports is going to become critical for the Helix team sooner or later, and this much concern over this issue suggests to me that the current approach is not scaling.

EDIT:

I honestly wasn't trying to do this, but I just found yet another example of yet another LSP taking too long to shut down -- namely, nu.

@dead10ck
Copy link
Member

Finding robust ways of dealing with incomplete debug reports is going to become critical for the Helix team sooner or later, and this much concern over this issue suggests to me that the current approach is not scaling.

Dealing with bad user bug reports is the bread and butter of any open source project. People ignore warnings, they ignore templates, they don't make any debugging attempts and just drop in and expect you to fix it for them, etc, etc. It's a people issue, not a process issue. There isn't a process you could come up with that would fix that.

I honestly wasn't trying to do this, but I just found yet another example of yet another LSP taking too long to shut down -- namely, nu.

Yeah I've tried to use that myself — it's hilariously incomplete; I'm shocked that they saw fit to "release" it in the state that it's in. Insofar as it affects Helix, they're simply not listening for the shutdown request.

@crabdancing
Copy link

. There isn't a process you could come up with that would fix that.

Fix no. Ameliorate? Yes.

Yeah I've tried to use that myself — it's hilariously incomplete; I'm shocked that they saw fit to "release" it in the state that it's in.

Nushell is MVP-grade software. I don't blame them for shipping incomplete features, when they're not even at 1.0, and this feature is strictly opt-in and doesn't affect normal usage.

@pascalkuthe
Copy link
Member

Closing this as this PR has gone stale and we are not looking to add this kind of config. Thank you for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-helix-term Area: Helix term improvements S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants