-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Docker healthcheck for /10.0 #115
base: master
Are you sure you want to change the base?
Conversation
That URL (as well as /) cause Odoo to create a session, which is stored in the sessions store in data_dir. There are two problems in continuosly checking the URL (and originating sessions):
We need to indentify (or create) a good target for healthcheck interrogations. |
Good catch. In hind-sight, I also think it needs a URL with a much smaller return body. This is one option:
Also, its static so I don't expect it to grow. I understand that in certain health checks (AWS ELB for example), only the status code is checked and the body is essentially thrown away, however, I think the request should still be for a small static asset, in order to be as efficient as possible. I only mention AWS ELB, because I think if someone happens to put their containers behind a load balancer that checks the return status of the health check endpoint .. they might use the same endpoint used for the Docker health check. So how about that one? |
Doesn't work either. The static files do not reflect odoo's status. You can shutdown the database it will still think that odoo is fine. I tried this command it worked fine: HEALTHCHECK CMD curl --fail -b /tmp/cookies --cookie-jar /tmp/cookies -s -H "Content-Type: application/json" -X POST -d '{"jsonrpc":"2.0","method":"call","params":{"method":"list","service":"db","args":{}}}' 0.0.0.0:8069/jsonrpc || exit 1 This command will execute a jsonrpc call to list available databases. This forces odoo to check the database connection and call the connection as well. What also means that if the disk is full this status should go to unhealthy as well. Sadly I could not avoid odoo from creating a session, however I gave curl a cookie-jar so odoo doesn't keep creating sessions. |
That's a great idea. Perhaps at this point its important to define what the objective is of the health-check. Should it be comprehensive or simple? Is it just check the basic health of the container or the system as a whole. For example, you could have another situation where the instance can connect to the DB and create cookies, however, due to running out of RAM, less assets don't compile so the UI is hosed. Is that something that should be accounted for as well? |
Another thing to consider is compliance with deployment recommendations. So for example, @ https://www.odoo.com/documentation/10.0/setup/deploy.html#security its recommended to "set list_db configuration option to False, to prevent listing databases entirely". I know that there are some people who follow this recommendation religiously ... so in that case, listing the DBs would not work. |
Yes, it is true. |
Camptocamp seems to have found a solution to this problem by expiring the session immediately: https://github.com/camptocamp/odoo-cloud-platform/blob/11.0/monitoring_status/controllers/main.py |
I am current using my proposal from above to get a static file, namely: Features:
An implementation: https://github.com/idazco/odoo-docker-ez/blob/master/10/Dockerfile#L115 @thomas15v , regarding What we should define first, is what status are we trying to check .. or at least .. what is reasonable to check for a stock deployment:
or
So the basic advocated objective of the |
Great Idea, but it has performance issue when you have to list hundreds of databases. |
@ysinjab why do you have 100 databases on a single Odoo container? Wouldn't it be better to scale those databases to different odoo containers and use an HTTP proxy to sort these out with domain names? |
The system we have requires separate tenant for every user. so I have one postgresql server and a database for every user. I have RESTful api as a proxy for odoo where every user is connected to his own database. |
@ysinjab Container health and Odoo system status / database status are different things. The point of the container health check is just to give you the basic running health status of the container itself ... not all parts of the application in that container. The basic functionality behind Odoo is the Werkzeug web server. If thats down then you have a fundamental problem with the health of the container. Everything beyond that .. such as database access ... file system access etc, is the subject of a separate instrumention solution. So based on what the actual purpose of a health check is, it works perfectly. |
Folks you need to keep one thing in mind, the container starts a Python process with a PID attached to PID 1. The point of the container health check is to make sure that the app attached to PID 1 in that container is not hung. That's it. Don't make this more complicated based on your own use cases. |
what if the server is started with |
We have our own Dockerfile that wraps the default odoo docker image and adds some scripts. One of the scripts is for health checks (or liveness probe as it's called in kubernetes terms). It prevents the continuously growing number of sessions quite well. (Of course it won't work with Maybe some of you will benefit from this: #!/bin/bash
RESTORE_LOCK_FILE="$HOME/.restore.lock"
LIVENESS_PROBE_URL="localhost:8069"
LIVENESS_PROBE_FILE="$HOME/.liveness-probe-session"
# Succeed if a restore process is in progress to prevent interruption
[ -f "$RESTORE_LOCK_FILE" ] && exit 0
ensure_odoo_is_running()
{
if [ $? -gt 0 ]; then
# Odoo is down or does not respond
echo "ERROR: No connection to \"http://${LIVENESS_PROBE_URL}\"!"
exit 1
fi
}
# Use this method to reliably get a session cookie
initialize_session()
{
SESSION_COOKIE="$1"
[ -n "$SESSION_COOKIE" ] && CURL_SESSION_ARG=(--header "cookie: session_id=$SESSION_COOKIE")
SESSION_COOKIE="$(
curl "http://${LIVENESS_PROBE_URL}/web/database/selector" \
"${CURL_SESSION_ARG[@]}" \
--silent \
--cookie-jar - \
--output /dev/null \
| grep 'session_id' \
| sed -e 's/^.*session_id.//g' \
| sed -e 's/[ \t].*$//g'
)"
ensure_odoo_is_running
if [ -z "$SESSION_COOKIE" ]; then
echo "ERROR: Could not get session cookie!"
exit 1
else
echo -n "$SESSION_COOKIE" > "$LIVENESS_PROBE_FILE"
exit 0
fi
}
# If there is already a liveness probe session use that one
if [ -f "$LIVENESS_PROBE_FILE" ]; then
# Get existing session
SESSION_COOKIE="$( cat $LIVENESS_PROBE_FILE )"
RESPONSE="$(
curl "http://${LIVENESS_PROBE_URL}/web/login" \
--silent \
--head \
--header "cookie: session_id=$SESSION_COOKIE"
)"
ensure_odoo_is_running
HTTP_STATUS_CODE="$( echo "${RESPONSE[@]}" | head -1 | awk '{print $2}' )"
RESPONSE_SESSION_COOKIE="$(
echo "${RESPONSE[@]}" \
| grep "session_id=" \
| sed -e 's/^.*session_id=//g' \
| sed -e 's/;.*$//g'
)"
if [ -n "$RESPONSE_SESSION_COOKIE" ] && [ "$RESPONSE_SESSION_COOKIE" != "$SESSION_COOKIE" ]; then
echo "$RESPONSE_SESSION_COOKIE" > "$LIVENESS_PROBE_FILE"
export SESSION_COOKIE="$RESPONSE_SESSION_COOKIE"
fi
if [ $HTTP_STATUS_CODE -ge 200 ] && [ $HTTP_STATUS_CODE -le 299 ]; then
# That's what we wanna see!
exit 0
elif [ $HTTP_STATUS_CODE -ge 300 ] && [ $HTTP_STATUS_CODE -le 399 ]; then
# Get the URL where we are redirected to
REDIRECTION_URL="$(
echo "${RESPONSE[@]}" \
| grep "Location:" \
| sed -e "s/^.*${LIVENESS_PROBE_URL}//g"
)"
# If there's no database in the system yet, we're getting forwarded to the database selector
if [ "$REDIRECTION_URL" = "/web/database/selector" ]; then
initialize_session "$SESSION_COOKIE"
fi
elif [ $HTTP_STATUS_CODE -ge 400 ] && [ $HTTP_STATUS_CODE -le 499 ]; then
initialize_session "$SESSION_COOKIE"
else
# It seems there is an internal server error (5XX)
echo "ERROR: Liveness probe failed!"
echo "HTTP status code: $HTTP_STATUS_CODE"
echo "HTTP session cookie: $SESSION_COOKIE"
echo "HTTP headers:"
echo "${RESPONSE[@]}"
exit 1
fi
else
initialize_session
fi |
I think you're all over thinking it. Healthcheck is pretty much an implementation detail. The other thing is that an healthy worker can be different things for different people. If I have a worker for which the response time increase to more than let say 50% slower than the average request time... I'd say the worker is unhealthy. Or may be the response time of the average request is increasing or maybe the amount of memory is way too high... The issue is that a health check doesn't mean that the server is down, may be it's just slow and would tell you to scale the service up by adding more replicas. |
Is it a good idea to use XML-RPC instead? I uses this in my own customized Dockerfile. HEALTHCHECK CMD curl -H 'Content-Type: text/xml' \
-d "<?xml version='1.0'?><methodCall><methodName>version</methodName></methodCall>" \
-X POST \
--fail \
http://localhost:${ODOO_PORT}/xmlrpc/2/common EDIT: Apparently this doesn't work with odoo systems with multiple databases |
@lemniskett That is indeed much more elegant! 👍 Thanks for sharing! The only problem would be if the Odoo API would response but not the web application. |
Odoo now has (since version 15.0) an endpoint |
Is it really a health check? It doesn't seem to check anything. Also in Odoo 15, |
@lemniskett It's still better than nothing. It will at least not create useless sessions and fill your session store folder after each seconds. The downside is that it's not backward compatible with earlier versions. |
Explanation:
Added Docker
HEALTHCHECK
command. Pointing to/web/database/selector
for the health check because it is the first functional thing that should work and be available for a typical install. For custom deployments where admins wish to disable access to/web/database/selector
they could use a different URL, or they could just prevent access to that path at the proxy server, in which case nothing changes in within the Docker environment for this to work. (See Docker's HEALTHCHECK documentation for more)Purpose:
Provide rudimentary Odoo server state info to Docker service so that Docker knows if the service in the container is healthy or not. Contributes to monitoring and swarm management.