Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Optimize] graceful shutdown issues in some special cases. #47

Open
EdmondFrank opened this issue Nov 12, 2024 · 2 comments
Open

[Optimize] graceful shutdown issues in some special cases. #47

EdmondFrank opened this issue Nov 12, 2024 · 2 comments

Comments

@EdmondFrank
Copy link

Hi akash-aky, First of all, thank you for creating Exile, it's a very amazing library! I recently ran into some problems using it to execute parallel this application, here are my debug result:

~$ parallel --tagstring "{}|" --line-buffer tail -F {} ::: compass-web-service/log/{slow_queries,bullet}.log
service/log/bullet.log|	Call stack
service/log/bullet.log|	  /home/git/service/app/models/lab_model.rb:61:in `default_version'
service/log/bullet.log|	
service/log/bullet.log|	2024-10-23 14:12:21[WARN] user: git
service/log/bullet.log|	POST /api/graphql
service/log/bullet.log|	AVOID eager loading detected
service/log/bullet.log|	  LabModelVersion => [:lab_dataset]
service/log/bullet.log|	  Remove from your query: .includes([:lab_dataset])
service/log/bullet.log|	Call stack
service/log/bullet.log|	


parallel: SIGTERM received. No new jobs will be started.
parallel: Waiting for these 2 jobs to finish. Send SIGTERM again to stop now.
parallel: tail -F service/log/bullet.log
parallel: tail -F service/log/slow_queries.log

As you can see above, while running parallel with Exile.stream!, I closed my application, but the external processes opened by Exile were not cleaned up correctly.

I think the reason may be that these particular processes need to send the SIGTERM signal multiple times for a graceful shutdown.

I tried to look up the code and found that Exile uses SIGKILL to force the process to terminate directly after sending a SIGTERM timeout, resulting in the tail process opened by parallel not being properly shut down even though the parallel process is force-terminated.

So I tried to make a simple and rough patch to deal with this scenario: trying to send SIGTERM 3 times at first, if it can't be closed gracefully then send SIGKILL to force kill this external process.

I'll make a pull request if you'd like to merge in this patch.

@akash-akya
Copy link
Owner

Hey @EdmondFrank, thanks for reporting and the patch.

So you are spawning parallel using Exile correct? and you are trying to shutdown parallel (and linked child processes) gracefully? From a cursory glace, the patch seems to be one simple way of solving this, but I have look on how such scenarios are usually handled elsewhere.

If possible can you share the Elixir code for the same? if you have it at hand :)

@EdmondFrank
Copy link
Author

EdmondFrank commented Nov 14, 2024

Hey @EdmondFrank, thanks for reporting and the patch.

So you are spawning parallel using Exile correct? and you are trying to shutdown parallel (and linked child processes) gracefully? From a cursory glace, the patch seems to be one simple way of solving this, but I have look on how such scenarios are usually handled elsewhere.

If possible can you share the Elixir code for the same? if you have it at hand :)

Sure, I created a minimal test repository for you based on my use case: https://github.com/EdmondFrank/exile_log_collector

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants