Added Docker healthcheck for /10.0 #115

sylnsr · 2017-07-17T17:51:13Z

Explanation:
Added Docker HEALTHCHECK command. Pointing to /web/database/selector for the health check because it is the first functional thing that should work and be available for a typical install. For custom deployments where admins wish to disable access to /web/database/selector they could use a different URL, or they could just prevent access to that path at the proxy server, in which case nothing changes in within the Docker environment for this to work. (See Docker's HEALTHCHECK documentation for more)

Purpose:
Provide rudimentary Odoo server state info to Docker service so that Docker knows if the service in the container is healthy or not. Contributes to monitoring and swarm management.

linkit-odoo · 2018-07-16T12:40:51Z

That URL (as well as /) cause Odoo to create a session, which is stored in the sessions store in data_dir. There are two problems in continuosly checking the URL (and originating sessions):

sessions keep accumulating -- one or more per minute, in a typical case
worse, those open sessions interfere with operations like new database creation, where the operation oftern stops with "cuncurrent update" error. I double cheched and, in my case, the healtchecks were indeed the culprit

We need to indentify (or create) a good target for healthcheck interrogations.

sylnsr · 2018-07-17T19:33:38Z

Good catch. In hind-sight, I also think it needs a URL with a much smaller return body. This is one option:
/web_editor/static/src/xml/ace.xml because:

it does not generate a session and
the return body it smaller than most.

Also, its static so I don't expect it to grow. I understand that in certain health checks (AWS ELB for example), only the status code is checked and the body is essentially thrown away, however, I think the request should still be for a small static asset, in order to be as efficient as possible. I only mention AWS ELB, because I think if someone happens to put their containers behind a load balancer that checks the return status of the health check endpoint .. they might use the same endpoint used for the Docker health check.

So how about that one?

thomas15v · 2018-07-18T13:21:28Z

Doesn't work either. The static files do not reflect odoo's status. You can shutdown the database it will still think that odoo is fine. I tried this command it worked fine:

HEALTHCHECK CMD curl --fail -b /tmp/cookies --cookie-jar /tmp/cookies -s -H "Content-Type: application/json" -X POST -d '{"jsonrpc":"2.0","method":"call","params":{"method":"list","service":"db","args":{}}}' 0.0.0.0:8069/jsonrpc || exit 1

This command will execute a jsonrpc call to list available databases. This forces odoo to check the database connection and call the connection as well. What also means that if the disk is full this status should go to unhealthy as well. Sadly I could not avoid odoo from creating a session, however I gave curl a cookie-jar so odoo doesn't keep creating sessions.

sylnsr · 2018-07-19T18:10:37Z

That's a great idea. Perhaps at this point its important to define what the objective is of the health-check. Should it be comprehensive or simple? Is it just check the basic health of the container or the system as a whole.

For example, you could have another situation where the instance can connect to the DB and create cookies, however, due to running out of RAM, less assets don't compile so the UI is hosed. Is that something that should be accounted for as well?

sylnsr · 2018-07-19T18:29:59Z

Another thing to consider is compliance with deployment recommendations. So for example, @ https://www.odoo.com/documentation/10.0/setup/deploy.html#security its recommended to "set list_db configuration option to False, to prevent listing databases entirely". I know that there are some people who follow this recommendation religiously ... so in that case, listing the DBs would not work.

SalahAdDin · 2018-08-27T12:39:06Z

Yes, it is true.

mlaitinen · 2018-12-31T12:33:51Z

Camptocamp seems to have found a solution to this problem by expiring the session immediately: https://github.com/camptocamp/odoo-cloud-platform/blob/11.0/monitoring_status/controllers/main.py

sylnsr · 2018-12-31T13:31:40Z

I am current using my proposal from above to get a static file, namely:
/web_editor/static/src/xml/ace.xml

Features:

does not create a cookie
uses stock code (no extra modules to install)
the file is really small and overhead of getting it is negligible
is very generic and can easily be changed by admins they need something more custom

An implementation: https://github.com/idazco/odoo-docker-ez/blob/master/10/Dockerfile#L115
For professional deployments we use something like Prometheus to check many parts of each deployment. 👍

@thomas15v , regarding The static files do not reflect odoo's status, I do not agree that the Docker health check should encompass the status of Odoo itself. Can we concretely define what "Odoo's status" even means? Does it encompass all unit tests passing with a stock install?

What we should define first, is what status are we trying to check .. or at least .. what is reasonable to check for a stock deployment:

Odoo is alive and can serve data (basic/simple; leaves responsibility of deep checking to other tools)

or

My Odoo implementation with all my chosen modules is OK. (impractical since each implementation is unique, and may include unreliable modules)

From the Docker documentation

The HEALTHCHECK instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running.

So the basic advocated objective of the HEALTHCHECK is very high level, just to see if the container is working or if its "unable to handle new connections". The Docker authors never advocated it to be a thorough, deep check to see if the application is functioning according to design. Since they mention checking a "web server", we should be thinking in terms of simply checking Werkzeug, that it is listening and can respond.

ysinjab · 2019-12-13T15:37:26Z

Doesn't work either. The static files do not reflect odoo's status. You can shutdown the database it will still think that odoo is fine. I tried this command it worked fine:
HEALTHCHECK CMD curl --fail -b /tmp/cookies --cookie-jar /tmp/cookies -s -H "Content-Type: application/json" -X POST -d '{"jsonrpc":"2.0","method":"call","params":{"method":"list","service":"db","args":{}}}' 0.0.0.0:8069/jsonrpc || exit 1
This command will execute a jsonrpc call to list available databases. This forces odoo to check the database connection and call the connection as well. What also means that if the disk is full this status should go to unhealthy as well. Sadly I could not avoid odoo from creating a session, however I gave curl a cookie-jar so odoo doesn't keep creating sessions.

Great Idea, but it has performance issue when you have to list hundreds of databases.

thomas15v · 2019-12-13T15:50:12Z

@ysinjab why do you have 100 databases on a single Odoo container? Wouldn't it be better to scale those databases to different odoo containers and use an HTTP proxy to sort these out with domain names?

ysinjab · 2019-12-13T15:54:35Z

@ysinjab why do you have 100 databases on a single Odoo container? Wouldn't it be better to scale those databases to different odoo containers and use an HTTP proxy to sort these out with domain names?

The system we have requires separate tenant for every user. so I have one postgresql server and a database for every user. I have RESTful api as a proxy for odoo where every user is connected to his own database.

sylnsr · 2019-12-14T00:08:43Z

@ysinjab Container health and Odoo system status / database status are different things. The point of the container health check is just to give you the basic running health status of the container itself ... not all parts of the application in that container. The basic functionality behind Odoo is the Werkzeug web server. If thats down then you have a fundamental problem with the health of the container. Everything beyond that .. such as database access ... file system access etc, is the subject of a separate instrumention solution. So based on what the actual purpose of a health check is, it works perfectly.

sylnsr · 2019-12-14T00:13:10Z

Folks you need to keep one thing in mind, the container starts a Python process with a PID attached to PID 1. The point of the container health check is to make sure that the app attached to PID 1 in that container is not hung. That's it. Don't make this more complicated based on your own use cases.

d-fence · 2020-01-22T14:08:15Z

what if the server is started with --no-http for one reason or another ?

mamiu · 2020-11-28T04:03:58Z

We have our own Dockerfile that wraps the default odoo docker image and adds some scripts. One of the scripts is for health checks (or liveness probe as it's called in kubernetes terms). It prevents the continuously growing number of sessions quite well. (Of course it won't work with --no-http as mentioned by @d-fence in the comment above.)

Maybe some of you will benefit from this:

#!/bin/bash

RESTORE_LOCK_FILE="$HOME/.restore.lock"
LIVENESS_PROBE_URL="localhost:8069"
LIVENESS_PROBE_FILE="$HOME/.liveness-probe-session"

# Succeed if a restore process is in progress to prevent interruption
[ -f "$RESTORE_LOCK_FILE" ] && exit 0

ensure_odoo_is_running()
{
    if [ $? -gt 0 ]; then
        # Odoo is down or does not respond
        echo "ERROR: No connection to \"http://${LIVENESS_PROBE_URL}\"!"
        exit 1
    fi
}

# Use this method to reliably get a session cookie
initialize_session()
{
    SESSION_COOKIE="$1"

    [ -n "$SESSION_COOKIE" ] && CURL_SESSION_ARG=(--header "cookie: session_id=$SESSION_COOKIE")

    SESSION_COOKIE="$(
        curl "http://${LIVENESS_PROBE_URL}/web/database/selector" \
            "${CURL_SESSION_ARG[@]}" \
            --silent \
            --cookie-jar - \
            --output /dev/null \
            | grep 'session_id' \
            | sed -e 's/^.*session_id.//g' \
            | sed -e 's/[ \t].*$//g'
    )"

    ensure_odoo_is_running

    if [ -z "$SESSION_COOKIE" ]; then
        echo "ERROR: Could not get session cookie!"
        exit 1
    else
        echo -n "$SESSION_COOKIE" > "$LIVENESS_PROBE_FILE"
        exit 0
    fi
}

# If there is already a liveness probe session use that one
if [ -f "$LIVENESS_PROBE_FILE" ]; then
    # Get existing session
    SESSION_COOKIE="$( cat $LIVENESS_PROBE_FILE )"

    RESPONSE="$(
        curl "http://${LIVENESS_PROBE_URL}/web/login" \
            --silent \
            --head \
            --header "cookie: session_id=$SESSION_COOKIE"
    )"

    ensure_odoo_is_running

    HTTP_STATUS_CODE="$( echo "${RESPONSE[@]}" | head -1 | awk '{print $2}' )"
    RESPONSE_SESSION_COOKIE="$(
        echo "${RESPONSE[@]}" \
        | grep "session_id=" \
        | sed -e 's/^.*session_id=//g' \
        | sed -e 's/;.*$//g'
    )"

    if [ -n "$RESPONSE_SESSION_COOKIE" ] && [ "$RESPONSE_SESSION_COOKIE" != "$SESSION_COOKIE" ]; then
        echo "$RESPONSE_SESSION_COOKIE" > "$LIVENESS_PROBE_FILE"
        export SESSION_COOKIE="$RESPONSE_SESSION_COOKIE"
    fi

    if [ $HTTP_STATUS_CODE -ge 200 ] && [ $HTTP_STATUS_CODE -le 299 ]; then
        # That's what we wanna see!
        exit 0
    elif [ $HTTP_STATUS_CODE -ge 300 ] && [ $HTTP_STATUS_CODE -le 399 ]; then
        # Get the URL where we are redirected to
        REDIRECTION_URL="$(
            echo "${RESPONSE[@]}" \
            | grep "Location:" \
            | sed -e "s/^.*${LIVENESS_PROBE_URL}//g"
        )"

        # If there's no database in the system yet, we're getting forwarded to the database selector
        if [ "$REDIRECTION_URL" = "/web/database/selector" ]; then
            initialize_session "$SESSION_COOKIE"
        fi
    elif [ $HTTP_STATUS_CODE -ge 400 ] && [ $HTTP_STATUS_CODE -le 499 ]; then
        initialize_session "$SESSION_COOKIE"
    else
        # It seems there is an internal server error (5XX)
        echo "ERROR: Liveness probe failed!"
        echo "HTTP status code: $HTTP_STATUS_CODE"
        echo "HTTP session cookie: $SESSION_COOKIE"
        echo "HTTP headers:"
        echo "${RESPONSE[@]}"

        exit 1
    fi
else
    initialize_session
fi

llacroix · 2021-02-04T00:08:46Z

I think you're all over thinking it. Healthcheck is pretty much an implementation detail.
If you use 0 workers, the healthcheck will need to be different as it's not using the same architecture as with workers > 1
With workers > 1 you're effectively spawning multiple forks of the preforkserver and you will also have multiple port to listen to. So you could have the longpolling up but xmlrcp_port down...
If you have cron workers they can't be exactly tested much...
The prefork server by itself would be the one managing its children anyway so you shouldn't even have to worry about it...

The other thing is that an healthy worker can be different things for different people. If I have a worker for which the response time increase to more than let say 50% slower than the average request time... I'd say the worker is unhealthy. Or may be the response time of the average request is increasing or maybe the amount of memory is way too high...

The issue is that a health check doesn't mean that the server is down, may be it's just slow and would tell you to scale the service up by adding more replicas.

lemniskett · 2022-08-23T06:33:45Z

Is it a good idea to use XML-RPC instead? I uses this in my own customized Dockerfile.

HEALTHCHECK CMD curl -H 'Content-Type: text/xml' \
    -d "<?xml version='1.0'?><methodCall><methodName>version</methodName></methodCall>" \
    -X POST \
    --fail \
    http://localhost:${ODOO_PORT}/xmlrpc/2/common

EDIT: Apparently this doesn't work with odoo systems with multiple databases

mamiu · 2022-08-23T06:54:19Z

@lemniskett That is indeed much more elegant! 👍 Thanks for sharing!

The only problem would be if the Odoo API would response but not the web application.
But I think that's an edge case most people can ignore.

mlaitinen · 2022-08-23T09:46:02Z

Odoo now has (since version 15.0) an endpoint /web/health that can be used for health checks. See: https://github.com/odoo/odoo/blob/05abba7e58d5a49a42ace9338d785b90fd5f7671/addons/web/controllers/main.py#L911

lemniskett · 2022-08-23T10:46:30Z

Odoo now has (since version 15.0) an endpoint /web/health that can be used for health checks. See: https://github.com/odoo/odoo/blob/05abba7e58d5a49a42ace9338d785b90fd5f7671/addons/web/controllers/main.py#L911

Is it really a health check? It doesn't seem to check anything.

Also in Odoo 15, /web/database/selector doesn't even return internal error when it's not connected to PostgreSQL.

llacroix · 2022-08-24T19:11:47Z

@lemniskett It's still better than nothing. It will at least not create useless sessions and fill your session store folder after each seconds.

The downside is that it's not backward compatible with earlier versions.

Added Docker healthcheck for /10.0

6086962

amh-mw mentioned this pull request Aug 25, 2020

[ADD] Health checks for load balancers. odoo/odoo#56522

Closed

amh-mw mentioned this pull request Nov 2, 2022

[IMP] Odoo 15.0-17.0: Add support for docker healthcheck. #425

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added Docker healthcheck for /10.0 #115

Added Docker healthcheck for /10.0 #115

sylnsr commented Jul 17, 2017

linkit-odoo commented Jul 16, 2018

sylnsr commented Jul 17, 2018 •

edited

Loading

thomas15v commented Jul 18, 2018 •

edited

Loading

sylnsr commented Jul 19, 2018 •

edited

Loading

sylnsr commented Jul 19, 2018

SalahAdDin commented Aug 27, 2018

mlaitinen commented Dec 31, 2018

sylnsr commented Dec 31, 2018 •

edited

Loading

ysinjab commented Dec 13, 2019

thomas15v commented Dec 13, 2019

ysinjab commented Dec 13, 2019 •

edited

Loading

sylnsr commented Dec 14, 2019 •

edited

Loading

sylnsr commented Dec 14, 2019 •

edited

Loading

d-fence commented Jan 22, 2020

mamiu commented Nov 28, 2020 •

edited

Loading

llacroix commented Feb 4, 2021

lemniskett commented Aug 23, 2022 •

edited

Loading

mamiu commented Aug 23, 2022 •

edited

Loading

mlaitinen commented Aug 23, 2022

lemniskett commented Aug 23, 2022

llacroix commented Aug 24, 2022

Added Docker healthcheck for /10.0 #115

Are you sure you want to change the base?

Added Docker healthcheck for /10.0 #115

Conversation

sylnsr commented Jul 17, 2017

linkit-odoo commented Jul 16, 2018

sylnsr commented Jul 17, 2018 • edited Loading

thomas15v commented Jul 18, 2018 • edited Loading

sylnsr commented Jul 19, 2018 • edited Loading

sylnsr commented Jul 19, 2018

SalahAdDin commented Aug 27, 2018

mlaitinen commented Dec 31, 2018

sylnsr commented Dec 31, 2018 • edited Loading

ysinjab commented Dec 13, 2019

thomas15v commented Dec 13, 2019

ysinjab commented Dec 13, 2019 • edited Loading

sylnsr commented Dec 14, 2019 • edited Loading

sylnsr commented Dec 14, 2019 • edited Loading

d-fence commented Jan 22, 2020

mamiu commented Nov 28, 2020 • edited Loading

llacroix commented Feb 4, 2021

lemniskett commented Aug 23, 2022 • edited Loading

mamiu commented Aug 23, 2022 • edited Loading

mlaitinen commented Aug 23, 2022

lemniskett commented Aug 23, 2022

llacroix commented Aug 24, 2022

sylnsr commented Jul 17, 2018 •

edited

Loading

thomas15v commented Jul 18, 2018 •

edited

Loading

sylnsr commented Jul 19, 2018 •

edited

Loading

sylnsr commented Dec 31, 2018 •

edited

Loading

ysinjab commented Dec 13, 2019 •

edited

Loading

sylnsr commented Dec 14, 2019 •

edited

Loading

sylnsr commented Dec 14, 2019 •

edited

Loading

mamiu commented Nov 28, 2020 •

edited

Loading

lemniskett commented Aug 23, 2022 •

edited

Loading

mamiu commented Aug 23, 2022 •

edited

Loading