Dashengine provides boilerplate code for quickly setting up Plotly Dash applications on the Google Cloud Platform.
Such boilerplate includes:
- A mechanism for automatically loading and indexing new pages from a `pages` directory.
- A simple API for running standard and parametrised queries against BigQuery.
- A caching mechanism for the BigQuery results.
- A standard profiling page where cached queries can be analysed.
- A GitHub Actions workflow for deployment to Google Cloud Run.
See the `demo` subdirectory for an example of a complete Dashengine application, including multiple pages, (parametrised) queries and profiling.
Originally Dashengine was developed with Google App Engine (GAE) in mind (hence the name), although deploying on Google Cloud Run may provide a better fit. The core of the project is provided by a Docker container (built by the root Dockerfile), which should be extended with application pages. The resulting container can naturally be deployed to any infrastructure that supports containers, but Dashengine assumes that default GCP authentication is available.
This repository is automatically deployed to Cloud Run via GitHub Actions.
To build the container image:

```bash
docker build . -t dashengine
```
This can take some time (building the various Python dependencies). To run the container for local testing:
```bash
docker run -p 8050:8050 -v "/Users/<username>/.config:/root/.config" dashengine
```
where the path `/Users/<username>/.config` points to the config directory used by Google's default authentication credentials.
Dashboard pages are loaded automatically from the `pages` directory, which is provided by the container extending Dashengine. For examples see the `demo` directory. If no `pages` directory is provided, the demo application is used.
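For illustration, a page module might look like the following minimal sketch. The exact attributes Dashengine reads from a page module (here `ROUTE`, `LINKNAME` and `layout`) are assumptions based on common multi-page Dash conventions; the modules in the `demo` directory show the authoritative shape.

```python
# pages/hello.py -- a hypothetical minimal page module.
# ROUTE and LINKNAME are assumed metadata names; check the demo
# pages for the attributes Dashengine actually indexes.
from dash import dcc, html

ROUTE = "/hello"      # URL path the page is registered under (assumed)
LINKNAME = "Hello"    # label used in navigation (assumed)

layout = html.Div([
    html.H1("Hello from Dashengine"),
    dcc.Markdown("This page was loaded automatically from `pages/`."),
])
```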
A core part of the Dashengine infrastructure is the querying system, which has a number of features:
- Queries are stored and versioned independently of the overall code.
- Queries are parametrisable, through the use of BQ Parameterized Queries (a sketch follows this list). Parametrisation at the level of projects is not supported; however, queries can be scoped to the active project simply by omitting the project ID from the query, which is useful when working with multiple project environments (DEV, PROD).
- Queries can be performed asynchronously for performance.
- Query results can be cached per dashboard page-view for performance.
- Queries have their performance metrics (time, data use) recorded for analysis.
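Dashengine's own query wrapper is not reproduced here, but the underlying pattern is that of the google-cloud-bigquery client with named query parameters. A minimal sketch (the table and parameter names are illustrative only):

```python
# Sketch of a parameterised BigQuery query using the underlying
# google-cloud-bigquery client; Dashengine wraps this pattern behind
# its own query API (query IDs referencing stored SQL files).
from google.cloud import bigquery

client = bigquery.Client()  # uses default GCP credentials

# Note: no project ID in the table reference, so the query resolves
# against the active project (useful for DEV/PROD environments).
sql = """
    SELECT name, total
    FROM `mydataset.mytable`
    WHERE region = @region
    LIMIT 100
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter("region", "STRING", "EU"),
    ]
)
results = client.query(sql, job_config=job_config).to_dataframe()
```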
Query results are cached via flask-caching. Note that with the default in-memory caching, the developer must be careful to keep memory usage within the application's memory limits; the general principle is that any heavy lifting should be done in the SQL queries rather than on the application instance. Furthermore, as this cache lives in instance memory, it is not shared across instances. This can easily be changed by using an external cache, e.g. Redis, for which support is built in.
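As a sketch of what this memoisation looks like with flask-caching (the decorator placement inside Dashengine may differ; the commented-out lines show the built-in external-cache alternative):

```python
# Sketch: memoising a query function with flask-caching.
# Swapping SimpleCache for RedisCache moves results out of instance
# memory so they are shared between (and survive across) instances.
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(app, config={
    "CACHE_TYPE": "SimpleCache",          # default: in-instance memory
    # "CACHE_TYPE": "RedisCache",         # external cache alternative
    # "CACHE_REDIS_URL": "redis://localhost:6379/0",
})

@cache.memoize(timeout=300)  # cache keyed on the function arguments
def run_query(query_id: str, **params):
    ...  # execute the BigQuery query and return the result
```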
The query profiler provides summary information on the performance of cached queries. The profiler works (although perhaps not perfectly) even in a multi-threaded environment and even with a simple (in-memory) cache: queries are referenced in the profiler by a query ID string and parameters only, so if the query has not been cached in a given thread, that thread can re-run the query to display profiling information.
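The recorded metrics correspond to attributes of the finished BigQuery job. A sketch using the google-cloud-bigquery QueryJob API (how Dashengine stores and displays these values is not shown here):

```python
# Sketch: the kind of metrics a finished BigQuery job exposes.
# These QueryJob attributes are part of google-cloud-bigquery;
# Dashengine's internal storage of them is not shown here.
from google.cloud import bigquery

client = bigquery.Client()
job = client.query("SELECT 1 AS x")
job.result()  # block until the query completes

metrics = {
    "duration_s": (job.ended - job.started).total_seconds(),
    "bytes_processed": job.total_bytes_processed,
    "bytes_billed": job.total_bytes_billed,
    "cache_hit": job.cache_hit,
}
```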
Credentials are obtained through `google.auth.default()`. For how to set these credentials when working locally with a project, see the documentation here.
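For reference, fetching the application default credentials is a single call:

```python
# Fetch application default credentials and the associated project ID;
# raises DefaultCredentialsError if no credentials are configured.
import google.auth

credentials, project_id = google.auth.default()
```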