Sometimes the app crashes and monit is not able to restart or recover it successfully. This may be caused by the time the app needs to start, because it also has to build a complete new index at start time.
Too long; didn't analyze: we should set up this app like the others, using the ES cluster.
@quaoar1:$ lsof -i:7200
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
[...]
java 3034314 sol 325u IPv4 51257987 0t0 TCP *:7200 (LISTEN)
curl 3102272 sol 5u IPv4 52433188 0t0 TCP localhost:38266->localhost:7200 (ESTABLISHED)
curl 3135530 sol 5u IPv4 53006460 0t0 TCP localhost:57358->localhost:7200 (ESTABLISHED)
curl 3164024 sol 5u IPv4 53554123 0t0 TCP localhost:46688->localhost:7200 (ESTABLISHED)
curl 3191906 sol 5u IPv4 54111679 0t0 TCP localhost:35858->localhost:7200 (ESTABLISHED)
curl 3221519 sol 5u IPv4 54682867 0t0 TCP localhost:50984->localhost:7200 (ESTABLISHED)
curl 3253338 sol 5u IPv4 55253727 0t0 TCP localhost:40050->localhost:7200 (ESTABLISHED)
curl 3298502 sol 5u IPv4 55841758 0t0 TCP localhost:58684->localhost:7200 (ESTABLISHED)
curl 3330911 sol 5u IPv4 56408940 0t0 TCP localhost:46594->localhost:7200 (ESTABLISHED)
curl 3361849 sol 5u IPv4 56966024 0t0 TCP localhost:51888->localhost:7200 (ESTABLISHED)
curl 3392987 sol 5u IPv4 57527590 0t0 TCP localhost:39482->localhost:7200 (ESTABLISHED)
curl 3420683 sol 5u IPv4 58070162 0t0 TCP localhost:59582->localhost:7200 (ESTABLISHED)
curl 3456898 sol 5u IPv4 58651768 0t0 TCP localhost:50824->localhost:7200 (ESTABLISHED)
sol@quaoar1:$ ps -ef|grep 3456898
sol 3456898 3456897 0 05:00 ? 00:00:24 curl --verbose -XPOST http://localhost:7200/organisations/transform
... and the other curls are also hanging transform processes, dating back about two weeks to when I (supposedly) cleaned them up manually (as I will do now with killall curl).
Although the process was running (see PID 3034314), the app wasn't serving anything. This led to a timeout in Apache, which did the HA proxying to quaoar2. I had to kill -9 3034314. The app restarted automatically and is serving fine.
Added a check to monit that port 7200 is not only up but also serving:
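One way to keep hanging transform requests from piling up is to give each call a hard time limit. The following is a minimal sketch, assuming the transform is triggered from a script; the helper name `run_with_limit` and the 3600-second limit are hypothetical, and `timeout` is the GNU coreutils command.

```shell
# Hypothetical wrapper: run a command, but give up after a fixed number of seconds.
# Relies on timeout(1) from GNU coreutils, which exits with status 124 on timeout.
run_with_limit() {
  local limit_seconds="$1"
  shift
  timeout "$limit_seconds" "$@"
}

# Example call (not executed here); the 3600-second limit is an assumed value:
# run_with_limit 3600 curl --verbose -XPOST http://localhost:7200/organisations/transform
```

curl's own `--max-time` option would have a similar effect for the request itself; the wrapper just makes the limit independent of the command being run.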
check host lobid.org/organisations with address lobid.org every 6 cycles
stop program = "/home/sol/git/lobid-organisations/monit_restart.sh lobid-organisations stop 7200"
as uid sol and gid sol
if failed url http://localhost:7200/organisations/search?q=
content = "Library"
then exec "/home/sol/git/lobid-organisations/restart.sh lobid-organisations start 7200"
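The same "up and actually serving" test can be run by hand when debugging. This is only a sketch of the idea behind the monit rule, not the script monit calls; the helper names `body_contains` and `is_serving` and the 30-second limit are assumptions.

```shell
# Hypothetical helpers mirroring the monit rule: a listening port is not enough,
# the response body must also contain the expected text.
body_contains() {
  # Reads a response body on stdin; succeeds only if it contains the given string.
  grep -q -F -- "$1"
}

is_serving() {
  local url="$1" needle="$2"
  # --fail: treat HTTP errors as failure; --max-time: never hang on a stuck app.
  curl --silent --fail --max-time 30 "$url" | body_contains "$needle"
}

# Example call (not executed here):
# is_serving 'http://localhost:7200/organisations/search?q=' Library || echo "not serving"
```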
Also added force killing to the start scripts.
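Force killing in a start script could look roughly like this: ask the old process to stop, wait a little, and only then send SIGKILL. This is a sketch under assumptions, not the content of the actual scripts; the function name `stop_or_kill` and the default grace period of 10 seconds are hypothetical.

```shell
# Hypothetical helper: stop a process politely, then force-kill it if it is still alive.
stop_or_kill() {
  local pid="$1" grace="${2:-10}"
  # If the process is already gone there is nothing to do.
  kill -TERM "$pid" 2>/dev/null || return 0
  local waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$grace" ]; then
      # Grace period is over: this is the "kill -9" step.
      kill -KILL "$pid" 2>/dev/null
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example call (not executed here), using the PID reported by lsof for port 7200:
# stop_or_kill 3034314 10
```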
Increased the number of cycles for the lobid-organisations check. lobid-organisations needs around 20 minutes to start, and during this time the daemon is not serving on port 7200. So we needed a bigger time frame when testing whether the app started successfully. Let's see if this helps.
Also increased the Java heap space by adding -Xms1024m,-Xmx8192m, because there were errors in the application.log:
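How many cycles are enough depends on the length of one monit cycle, which is not stated here. As a rough sketch, the required count is the startup time divided by the cycle length, rounded up; the helper name `cycles_needed` and the cycle lengths used below are assumptions for illustration only.

```shell
# Hypothetical helper: smallest number of monit cycles that covers a startup time.
# Both arguments are in seconds; the result is rounded up.
cycles_needed() {
  local startup_seconds="$1" cycle_seconds="$2"
  echo $(( (startup_seconds + cycle_seconds - 1) / cycle_seconds ))
}

# 20 minutes of startup with an assumed 120-second cycle:
cycles_needed 1200 120   # prints 10
```

With an assumed 90-second cycle the same call gives 14, so the check interval should be chosen with the real cycle length in mind.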
java.lang.OutOfMemoryError: GC overhead limit exceeded
As long as #409 isn't resolved we have to live with the situation.
Moved here from #409.
#409 (comment):