Container does not respond to sigterm #61

Open
samgurtman-zz opened this issue Jan 18, 2018 · 5 comments

Comments

@samgurtman-zz

samgurtman-zz commented Jan 18, 2018

On SIGTERM (signal 15), the Docker container does not seem to exit. Instead, Docker times out and sends a SIGKILL.

Might this be due to the -t option?

exec /opt/SumoCollector/collector console -- -t

@bin3377

bin3377 commented Jan 23, 2018

-t should not be related here. It means the collector registration on the backend is ephemeral (i.e. it will be deleted after the collector has been offline for a certain period of time).

What scenario do you want SIGTERM to work for? In most cases, we assume that "container running == collector running", so you should pause/stop/remove the container if you want to stop the collector.

@samgurtman-zz
Author

It causes shutdown to take much longer, and it hampers error detection. Docker best practice is for the root process to respond to SIGTERM. The SIGKILL that's sent after the timeout is a workaround for processes that don't shut down gracefully.
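
To make the stop sequence concrete, here is a minimal sketch of the "TERM, wait, then KILL" pattern described above. It is only an illustration of the behaviour, not Docker's implementation; `stop_with_grace` and its arguments are hypothetical names made up for this example.

```shell
#!/bin/sh
# Sketch of a "TERM, wait, then KILL" stop sequence.
# stop_with_grace is a hypothetical helper, not a Docker or Sumo Logic command.
#   $1 = PID to stop, $2 = grace period in whole seconds
# Returns 0 if the process exited after TERM, 1 if KILL was needed,
# 2 if the PID could not be signalled at all.
stop_with_grace() {
    pid=$1
    grace=$2

    # Ask politely first.
    kill -TERM "$pid" 2>/dev/null || return 2

    elapsed=0
    while [ "$elapsed" -lt "$grace" ]; do
        # kill -0 only tests whether the PID can still be signalled.
        # It also succeeds for a zombie that has not been reaped yet,
        # so this assumes the parent reaps its children.
        if ! kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done

    # One last check before escalating.
    if ! kill -0 "$pid" 2>/dev/null; then
        return 0
    fi

    # Grace period expired: force termination.
    kill -KILL "$pid" 2>/dev/null || true
    return 1
}
```

A process that honours TERM makes this return almost immediately, while one that ignores it costs the full grace period and is killed without a chance to clean up. With `docker stop`, the grace period is the `--time` option (10 seconds by default).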

@bin3377

bin3377 commented Jan 24, 2018

I did some investigation on it. It looks like the collector handles SIGTERM, but when running docker stop the signal is somehow not passed to the collector process. The evidence is that the signal is received when using kill <pid> from another console attached inside the container:

$ docker exec -it 04a47d8088259b0f2fe98ccb525615298513d1f075991dc7361d5c96742415de /bin/bash
root@04a47d808825:/# tail -f /opt/SumoCollector/logs/collector.out.log
INFO   | jvm 1    | 2018/01/24 00:18:45 | `+.|=|`+. |  | |  | |  |  | |  | | |  | |  |
INFO   | jvm 1    | 2018/01/24 00:18:45 | .    |  | |  | |  | |  |  | |  | | |  | |  |
INFO   | jvm 1    | 2018/01/24 00:18:45 | |`+. |  | |  | |  | |  |  | |  | | |  | |  |
INFO   | jvm 1    | 2018/01/24 00:18:45 | `+.|=|.+' `+.|=|.+' `+.|  |.|  |+' `+.|=|.+'
INFO   | jvm 1    | 2018/01/24 00:18:45 | Sumo Logic Collector Version 19.209-25
INFO   | jvm 1    | 2018/01/24 00:18:45 | Sumo Logic Build Hash a76f595
INFO   | jvm 1    | 2018/01/24 00:18:45 | current folder:/opt/SumoCollector
INFO   | jvm 1    | 2018/01/24 00:18:45 |   * See /opt/SumoCollector/./logs for more details.
INFO   | jvm 1    | 2018/01/24 00:18:45 |   * Connecting to https://nite-events.sumologic.net.
INFO   | jvm 1    | 2018/01/24 00:18:48 |   * Retrieved configuration from service.
STATUS | wrapper  | 2018/01/24 00:23:00 | TERM trapped.  Shutting down.

The last line indicates that a TERM signal was handled properly, but that line is missing when using docker stop. In that case I was simply kicked out because the container stopped:

$ docker exec -it 04a47d8088259b0f2fe98ccb525615298513d1f075991dc7361d5c96742415de /bin/bash
root@04a47d808825:/# tail -f /opt/SumoCollector/logs/collector.out.log
INFO   | jvm 1    | 2018/01/24 00:17:57 | `+.|=|`+. |  | |  | |  |  | |  | | |  | |  |
INFO   | jvm 1    | 2018/01/24 00:17:57 | .    |  | |  | |  | |  |  | |  | | |  | |  |
INFO   | jvm 1    | 2018/01/24 00:17:57 | |`+. |  | |  | |  | |  |  | |  | | |  | |  |
INFO   | jvm 1    | 2018/01/24 00:17:57 | `+.|=|.+' `+.|=|.+' `+.|  |.|  |+' `+.|=|.+'
INFO   | jvm 1    | 2018/01/24 00:17:57 | Sumo Logic Collector Version 19.209-25
INFO   | jvm 1    | 2018/01/24 00:17:57 | Sumo Logic Build Hash a76f595
INFO   | jvm 1    | 2018/01/24 00:17:57 | current folder:/opt/SumoCollector
INFO   | jvm 1    | 2018/01/24 00:17:57 |   * See /opt/SumoCollector/./logs for more details.
INFO   | jvm 1    | 2018/01/24 00:17:57 |   * Connecting to https://nite-events.sumologic.net.
INFO   | jvm 1    | 2018/01/24 00:17:59 |   * Retrieved configuration from service.
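
One plausible explanation, going by the process listing later in this thread (PID 1 is `/bin/sh /opt/SumoCollector/collector console`, with the wrapper as its child), is that the signal reaches the shell at PID 1 and is never relayed to the wrapper. The following is a rough sketch of a launcher that forwards TERM to its child. `run_with_forwarding` is a hypothetical name, and this is not the image's actual entrypoint.

```shell
#!/bin/sh
# Sketch of a launcher that relays TERM to the command it starts.
# run_with_forwarding is a hypothetical helper for illustration only.
# Usage (assumed): run_with_forwarding some-command arg1 arg2
run_with_forwarding() {
    child=""

    # Install the handler before starting the child so an early TERM
    # is not lost; it is a no-op until $child is set.
    trap 'if [ -n "$child" ]; then kill -TERM "$child" 2>/dev/null; fi' TERM

    "$@" &
    child=$!

    # A trapped signal interrupts wait, so the first wait may return
    # before the child has actually exited.
    status=0
    wait "$child" || status=$?

    # Keep waiting while the child can still be signalled.
    while kill -0 "$child" 2>/dev/null; do
        status=0
        wait "$child" || status=$?
    done

    trap - TERM
    # Best effort: the last status observed by wait.
    return "$status"
}
```

The other common option is to `exec` the long-running process so that it becomes PID 1 itself and receives the signal directly. Which of the two fits the collector's startup script is something only the maintainers can judge.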

@samgurtman-zz
Author

samgurtman-zz commented Jan 24, 2018 via email

@colinbjohnson

Here is a snapshot of the processes running within the container, along with my attempts to run kill 1 and kill -9 1 (kill process ID 1):

colinjohnson@cjohnson07 sumologic_docker_file % docker exec -it a9791bf5cdcc  /bin/bash
root@a9791bf5cdcc:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4904  1920 ?        Ss   03:42   0:00 /bin/sh /opt/SumoCollector/collector console
root        59  0.0  0.0  21800  3640 ?        Sl   03:42   0:00 /opt/SumoCollector/./wrapper /opt/SumoCollector/./config/wrapper.conf wrapper.syslog.ident=collector wrapper.pidfile=/opt/SumoCollector/./collector.pid wrapper.name=collector wrapper.displayname=SumoLogic 
root        61  2.3  2.4 4520024 298912 ?      Sl   03:42   0:16 /opt/SumoCollector/jre/bin/java -XX:+UseParallelGC -server -Djava.security.egd=file:/dev/./urandom -Xms64m -Xmx128m -Djava.library.path=./19.319-4/bin/native/lib -classpath ./19.319-4/lib/HikariCP-java7-2.
root       114  1.0  0.0  18516  3388 pts/0    Ss   03:54   0:00 /bin/bash
root       124  0.0  0.0  34412  2792 pts/0    R+   03:54   0:00 ps aux
root@a9791bf5cdcc:/# kill 1
root@a9791bf5cdcc:/# kill -9 1
root@a9791bf5cdcc:/# 

For the above container, if I run kill -9 59 the container exits.

root@a9791bf5cdcc:/# kill -9 59
root@a9791bf5cdcc:/#
colinjohnson@cjohnson07 sumologic_docker_file %
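
The lack of any reaction to `kill 1` and `kill -9 1` is consistent with how Linux treats PID 1 inside a PID namespace: signals sent from within the namespace are only delivered if PID 1 has installed a handler for them, and that includes SIGKILL. As a gentler alternative to `kill -9 59`, the sketch below sends TERM to the PID recorded in a pidfile. It assumes the file named by `wrapper.pidfile` in the listing above (`/opt/SumoCollector/./collector.pid`) holds the wrapper's PID, which is not confirmed in this thread. `signal_from_pidfile` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: send a signal to the PID stored in a pidfile.
# signal_from_pidfile is a hypothetical helper for illustration only.
#   $1 = path to the pidfile, $2 = signal name (defaults to TERM)
# Returns 2 if the pidfile is unreadable or does not contain a plain number,
# otherwise the exit status of kill.
signal_from_pidfile() {
    pidfile=$1
    sig=${2:-TERM}

    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile" >&2
        return 2
    fi

    pid=$(cat "$pidfile")

    # Refuse anything that is not a plain decimal number.
    case "$pid" in
        ''|*[!0-9]*)
            echo "no valid PID in $pidfile" >&2
            return 2
            ;;
    esac

    kill -s "$sig" "$pid"
}

# Example (assumed path, taken from the ps output above):
#   signal_from_pidfile /opt/SumoCollector/collector.pid TERM
```

If the assumption about the pidfile holds, this should produce the same "TERM trapped.  Shutting down." line in collector.out.log that kill <pid> produced earlier in the thread.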
