This repository has been archived by the owner on Sep 21, 2023. It is now read-only.

Google Cloud Platform App Engine application in Python 3.7 runtime for text detection with Vision API.

julzerinos/googlecloud-text-detect-appengine

Processing images using Cloud Functions in the Google App Engine environment

Introduction

This is an example application deployed on Google App Engine (Python 3.7 runtime) that makes use of several other Google Cloud services. While the project is running, the app may be accessed at https://project-ii-gae.appspot.com/.

After logging in with a valid Google account, a user may upload an image. The text detected in the image is then emailed to the address the user specified.

The Project Overview

Intent

The following technologies and skills are used:

  • Developing an application running on Google App Engine in Python 3.7 runtime
  • Using Google Source Repositories as source control system
  • Using Google Cloud Build for building container images and deployment automation
  • Using Google-managed database technologies for data persistence, i.e. Cloud Datastore and Cloud Storage
  • Using Google Cloud Functions to act in response to events
  • Orchestrating Continuous Integration system to build application artefacts
  • Preparing proper deployment instrumentation

User Flow

The user flow defined for the project is as follows:

  1. As a user I can navigate to a website where I can log in using GCP account credentials (users without valid GCP credentials cannot log into the system).
  2. After successful login I can upload an image in JPG or PNG format and this image is stored in Cloud Storage bucket (see Note 2)
  3. I have a Cloud Function GCF#1 with a Cloud Storage trigger that discovers that an image was uploaded to Storage bucket#1. Cloud Function scales down the picture and stores the result in Cloud Storage bucket#2.
  4. There is another Cloud Function (GCF#2) that processes images from bucket#2. GCF#2 discovers transformed images stored in bucket#2 and puts a message in the Pub/Sub channel that is a trigger for GCF#3.
  5. GCF#3 makes a call to Vision API and discovers text visible in the image. The result of the Vision API call is stored in the Datastore.
  6. Finally, GCF#3 should send an email with the link to the image to the GCP user. The email should inform the user whether the image transformation succeeded or failed. It should contain signed URLs to the original and transformed images, as well as the text discovered for this specific image by Vision API.
  7. I can press 'Logout' button to log out of the service. Once I press 'Logout' I'm navigated back to login/initial page where there is a "Login" button I can use to log back into the system.

The implementation of the above user flow takes certain liberties, but those are explained below.

Project Requirements

Requirement | Fulfilled | Notes
Application works according to user flow | | -
Application is described in a README.md | | This is it
Users can log into the application using GCP accounts (restricted access). HTTPS only is supported. It is possible to log out | | GCP accounts are assumed to be all Google accounts. Authentication is shallow (file upload authentication is done in frontend). HTTPS is secured by built-in GAE mechanics
App stores image files in a Cloud Storage bucket using unique identifiers. KMS is used to create Customer Managed Encryption on uploaded files | | Timestamps are used as identifiers for images
Datastore is used to keep information on image uploader, image name, digital digest, signed URL to original, signed URL to rescaled and Vision API text | | Digital digest (hash) is calculated based on bytes string
Source code is stored in Google Source Repositories | | For sharing purposes, code is also stored on GitHub repositories
Google Cloud Function 1 stores and rescales images | | -
Google Cloud Function 3 is triggered by pub/sub, uses Vision API to find text in image and sends an email to uploader | | -
Cloud Build builds newer version of application when changes are committed to repo | | Solved with Cloud Build Triggers (beta)
Cloud Build deploys newer version of application into GAE | | -
Cloud Build deploys newer version of Cloud Functions on repo git push | | -
Unit tests are implemented for backend components | | A few tests are supplied as an example of how Cloud Build may run them
Code style | | Python code written according to Flake8, clutter removed from repository

Setting Up The Project

Within the repository one may find the setup.sh shell script. More specific assumptions are documented in comments within the files, but in general the script may be run in an empty project to build everything out of the box in a Google Cloud environment. Note that the script does not set up a Git repository or Cloud Build. These are optional and must be created manually.

The script may be run with setup.sh [PROJECT_ID] [CLIENT_ID],

where the required arguments are the ID of an empty project and a Client ID, which may be created after configuring the application consent form (i.e. first create a project, then set up the consent form and finally run the script).

If one chooses to set up the project manually, the script file may serve as a collection of instructions.

The Application Overview

The following overview sections are set in order of the project requirements table.

The Application Flow

Based on the specifications described in the section The Project Overview, the diagram below presents a generalized technical flow of the application. The steps are described in the following subsections.

Flow Diagram

The Application Access Point

Once Google App Engine deploys the application, the following frontend may be accessed by the user.

Frontend Image

The user must log in to be able to upload an image. After signing in with their Google account through OAuth, the upload form becomes available. An image may be uploaded and optionally the recipient email may be changed. After upload, the appropriate response is generated (error/success). Sign out is available or the user may repeat the process.

The page is served through Flask (Python package). A session is created for the logged-in user to authorize uploads. The form is intercepted upon submit and then posted by AJAX. This is relatively unstable, but should suffice for a small project.

If an error appears in file uploading, the user will be redirected to an error page with the appropriate error code.

The app is always served over HTTPS thanks to Google's built-in redirect system, which is enabled in the app.yaml setup file.

The backend for this upload (Flask) first validates the data for correctness. Then the digital digest is calculated (to check whether the image already exists in the database). Afterwards the image is uploaded to the first bucket with a timestamp-based id, and a Datastore entity is initialized.
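
The digest and identifier steps described above could look roughly like the following sketch. It assumes SHA-256 as the hash and a UTC timestamp joined to the original file extension as the object name. The function names image_digest and timestamp_id are illustrative and are not taken from the repository.

```python
import hashlib
from datetime import datetime, timezone


def image_digest(image_bytes: bytes) -> str:
    """Return a hex digest of the raw image bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(image_bytes).hexdigest()


def timestamp_id(filename: str, now: datetime) -> str:
    """Build a timestamp-based object name, keeping the file extension."""
    # Fall back to a generic extension when the upload has none (assumption).
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else "bin"
    return f"{now.strftime('%Y%m%d%H%M%S%f')}.{extension}"


if __name__ == "__main__":
    payload = b"example image bytes"
    print(image_digest(payload))
    print(timestamp_id("photo.PNG", datetime.now(timezone.utc)))
```

Because the digest depends only on the bytes, two uploads of the same file produce the same value, which is what makes the duplicate check possible.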

Further Processing

GCF1 - Rescale

The first GCF is triggered upon an image upload to bucket-1. It downloads the image and rescales it to a width of 512 pixels while preserving the aspect ratio.

The resulting image is uploaded to bucket-2. The related entity in Datastore is updated with both image URLs.
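
The aspect-ratio arithmetic behind the rescale can be sketched as follows. The helper name scaled_size is hypothetical. Only the 512-pixel target width comes from the description above, and the rounding behaviour is an assumption.

```python
from typing import Tuple


def scaled_size(width: int, height: int, target_width: int = 512) -> Tuple[int, int]:
    """Return (target_width, new_height) preserving the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale the height by the same factor as the width; never drop below 1 px.
    new_height = max(1, round(height * target_width / width))
    return target_width, new_height


if __name__ == "__main__":
    print(scaled_size(1024, 768))
```

With Pillow, the resulting tuple could be passed to Image.resize. Note that this sketch also upscales images narrower than 512 pixels, which may or may not match the deployed function.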

GCF2 - Inform

The second GCF is triggered upon an image upload to bucket-2, i.e. after GCF1 runs. The sole purpose of this function is to publish a message to the pub/sub topic 'rescaled-images'. The message carries an attribute with the filename of the uploaded file.
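
A minimal sketch of such a function is shown below, assuming the background-function signature (event, context) of the Python 3.7 runtime and the google-cloud-pubsub client. The attribute key filename, the message body, the project_id default and the injectable publisher argument are illustrative assumptions, not details from the repository.

```python
from typing import Dict, Tuple

TOPIC_NAME = "rescaled-images"  # topic name taken from the text above


def build_message(event: Dict[str, str]) -> Tuple[bytes, Dict[str, str]]:
    """Turn a Cloud Storage event into a Pub/Sub payload and attributes."""
    filename = event["name"]  # Storage events carry the object name here
    # Pub/Sub payloads must be bytes, so the text is encoded explicitly.
    data = f"rescaled image ready: {filename}".encode("utf-8")
    return data, {"filename": filename}


def inform(event, context, publisher=None, project_id="my-project"):
    """Entry point; publisher is injectable so the logic can be tested offline."""
    if publisher is None:
        # Imported lazily so the sketch can be read and tested without the SDK.
        from google.cloud import pubsub_v1
        publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, TOPIC_NAME)
    data, attributes = build_message(event)
    return publisher.publish(topic_path, data, **attributes)
```

Passing the filename as an attribute keeps the payload free-form, so the subscriber can find the image without parsing the message body.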

GCF3 - Vision

This third and final GCF is the bread and butter of the backend functionality. It is triggered upon a message publication to the 'rescaled-images' topic. It loads the related image into memory and uses Google's Vision API to detect visible text in the image. This text is stored in the related Datastore entity.

A simple SMTP client is set up with a dedicated Google/Gmail account to send the results to the uploader/recipient email.
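
The email step could be sketched with the standard library alone, as below. The function names, subject line and message layout are hypothetical, and the Gmail endpoint smtp.gmail.com on port 465 is an assumption about how the dedicated account is reached.

```python
import smtplib
from email.message import EmailMessage


def build_result_email(sender, recipient, detected_text, original_url, rescaled_url):
    """Compose the result email with the detected text and both image links."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Text detection result"  # illustrative subject line
    message.set_content(
        "Detected text:\n{}\n\nOriginal image: {}\nRescaled image: {}\n".format(
            detected_text or "(no text detected)", original_url, rescaled_url
        )
    )
    return message


def send_via_gmail(message, username, password):
    """Send the message over SMTP with implicit TLS (assumed Gmail settings)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(username, password)
        server.send_message(message)
```

Keeping message construction separate from sending makes the first part easy to unit test without network access or credentials.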

Other Functionalities

CME & KMS

Customer-Managed Encryption is available through Google's KMS. A keyring/key is created for the project and permissions are granted to the App Engine service account. As a result, files uploaded to bucket-1 by the service account are always encrypted with this symmetric key.

Cloudbuild

A cloudbuild.yaml file is present in the repository. If the project is integrated with a Google Cloud Source Repository and a trigger is set up in the Cloud Build section, every git push will run cloudbuild.yaml. The file runs unit tests on the GCFs and Flask, and then integration tests on both components. If these succeed, the newest versions of the GCFs and the App Engine application are deployed.

Unit Tests

Very basic unit tests (Python/unittest) are prepared for the GCFs, Flask and the integration of both. They exist mainly to demonstrate how Cloud Build may run such tests on git push.

Below is the list of unit tests.

  • GCF
    • GCF1 - test if image was uploaded to second bucket (test014_image_exists_in_bucket_2)
    • GCF1 - test if image was rescaled properly (test015_rescale_success)
    • (Disabled) GCF2 - test if message was properly published (test110_published_trigger)
      • Disabled due to broken Google pub/sub API. The API randomly fails to pull freshest messages
  • Flask
    • test if correct image file is successfully uploaded (test010_positive_form_post)
    • test if negative image file is unsuccessfully uploaded (test011_negative_form_post)
    • test if oversized image is unsuccessfully uploaded (test012_oversize_form_post)
    • test if undersized image is unsuccessfully uploaded (test013_undersize_form_post)
    • test if user not logged in results in unsuccessful upload (test014_not_logged_in)
    • test if invalid email format results in unsuccessful upload (test015_invalid_email)
    • test if the image has been uploaded to bucket-1 (test020_image_stored_bucket1)
  • Integration
    • test if the datastore entity has populated fields (test010_datastore_entries_exist)
    • test if text has been detected in image (test015_vision_text)
    • test if the storage blob files are public (test021_images_public)
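
For illustration, a test in the spirit of the email-format check listed above might look like the following sketch. The helper is_valid_email, its regular expression and the test class are hypothetical and do not reproduce the repository's tests.

```python
import re
import unittest

# Deliberately loose pattern: something@something.tld without whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(address: str) -> bool:
    """Return True when the address has a plausible user@domain.tld shape."""
    return bool(EMAIL_PATTERN.match(address))


class UploadValidationTest(unittest.TestCase):
    def test_valid_email(self):
        self.assertTrue(is_valid_email("user@example.com"))

    def test_invalid_email(self):
        self.assertFalse(is_valid_email("not-an-email"))


def run_tests() -> bool:
    """Run the suite programmatically and report overall success."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UploadValidationTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    raise SystemExit(0 if run_tests() else 1)
```

A non-zero exit code on failure is what lets a build step stop the pipeline before deployment.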

Final Thoughts

Google App Engine is yet another solution pulled from the never-ending catalogue of Google products, specifically created for producing applications. The idea is simple: backend, frontend, side-end, whatever-end - the solution will host it and strive to run it perfectly. Unfortunately, that is rarely the case. The Google platform often riddles the user with false errors or allows itself a casual self-entitled prank by disabling random components of the Google Cloud platform every now and then. Alas, the terrible contrast between the marketing material and reality is a subject for another time.
