Elasticsearch ingest processors that use Amazon Rekognition for image analysis. All Rekognition detection features are supported via separate processors.
Each field sent through the ingest process results in an AWS Rekognition API call, so this system is not meant for clusters with large ingest workloads. For better performance, your Elasticsearch ingest nodes should not only be hosted in AWS, but should also run in the same region used for the AWS Rekognition API calls (the region is configurable).
Calls to AWS Rekognition are arguably better suited to your ETL pipeline than to a plugin. That said, there are two benefits to running the code within an ingest node:
- Pipelines are configurable, so you can enable/disable processors without changing your ETL code.
- Your language of choice for indexing is not as fast as Java.
AWS Rekognition Pricing (most regions)
Image Analysis Tiers | Price per 1,000 Images Processed |
---|---|
First 1 million images processed* per month | $1.00 |
Next 9 million images processed* per month | $0.80 |
Next 90 million images processed* per month | $0.60 |
Over 100 million images processed* per month | $0.40 |
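To put the tiers in perspective, here is the arithmetic for a hypothetical workload of 2 million images in one month:
1,000,000 images × $1.00 / 1,000 = $1,000
1,000,000 images × $0.80 / 1,000 = $800
Total ≈ $1,800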
There is no downloadable version of the plugin for two reasons:
- It is difficult to release a plugin for each minor version of Elasticsearch. You can only run plugins built for the exact version of Elasticsearch.
- Due to the warning at the very top regarding cost and performance, it is preferred that the plugin be built rather than blindly installed, so that users are aware of the implications.
Only Elasticsearch 5.6+ is supported in order to take advantage of the secure keystore.
Integration tests run only if AWS credentials are added to build.gradle. Results may vary depending on what Rekognition returns at the time the tests run.
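A typical build-and-install flow looks roughly like the following. This is a sketch: the Gradle task and the plugin zip name/path are assumptions that depend on the project's build configuration and your exact Elasticsearch version.
./gradlew clean assemble
bin/elasticsearch-plugin install file:///path/to/ingest-aws-rekognition.zip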
Only basic credentials are supported. The AWS access and secret keys are added to the Elasticsearch keystore before the node is started.
Setting | Description |
---|---|
ingest.aws-rekognition.credentials.access_key | AWS Access Key |
ingest.aws-rekognition.credentials.secret_key | AWS Secret Key |
ingest.aws-rekognition.region | AWS region used for the API call. Defaults to us-east-1 |
AWS credentials are not configured in elasticsearch.yml or in the plugin settings, but in the keystore. The settings must be in place before Elasticsearch is started.
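For example, the credentials can be added with the elasticsearch-keystore tool before the node is started (each add command prompts for the value; paths assume a default installation):
bin/elasticsearch-keystore create
bin/elasticsearch-keystore add ingest.aws-rekognition.credentials.access_key
bin/elasticsearch-keystore add ingest.aws-rekognition.credentials.secret_key
The region is not a credential, so it is presumably set as a regular node setting in elasticsearch.yml, e.g. ingest.aws-rekognition.region: us-west-2.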
Name | Required | Default | Description |
---|---|---|---|
field | yes | - | The field to analyze |
target_field | no | The source field name with a processor-specific suffix appended | The field to assign the analysis results to |
min_score | no | 0 (all returned) | The minimum confidence score threshold of values to be returned |
max_values | no | 0 (all returned) | The maximum number of values to return. If max_values is 1, a single value is returned instead of an array. Not used by the Detect Celebrities processor. |
ignore_missing | no | false | If true and field does not exist or is null, the processor quietly exits without modifying the document |
remove | no | true | If true, removes the source field after processing. Recommended since storing binary data in Elasticsearch is not ideal. |
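As an illustration of the optional settings above (the values are arbitrary, and min_score is assumed to use Rekognition's 0-100 confidence scale), a detect-objects processor that keeps the source field, writes to a custom target field, and returns at most three labels with at least 80% confidence might look like:
PUT _ingest/pipeline/aws-rekognition-pipeline
{
  "description": "Detect objects with tuned settings",
  "processors": [
    {
      "detect-objects": {
        "field": "my_field",
        "target_field": "labels",
        "min_score": 80,
        "max_values": 3,
        "ignore_missing": true,
        "remove": false
      }
    }
  ]
}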
Feature | Processor Name | Default suffix |
---|---|---|
Detecting Objects and Scenes | detect-objects | _objects |
Detecting Celebrities | detect-celebrities | _celebrities |
Detecting Text | detect-text | _text |
Detecting Unsafe Content | detect-unsafe-content | _unsafe |
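Because remove defaults to true, a single pipeline that runs several processors over the same field must set remove to false on all but the last one. A minimal sketch combining all four processors (keep in mind that each processor issues its own Rekognition API call, so this pipeline makes four calls per document):
PUT _ingest/pipeline/aws-rekognition-pipeline
{
  "description": "Run all Rekognition processors on the same image field",
  "processors": [
    { "detect-objects": { "field": "my_field", "remove": false } },
    { "detect-celebrities": { "field": "my_field", "remove": false } },
    { "detect-text": { "field": "my_field", "remove": false } },
    { "detect-unsafe-content": { "field": "my_field" } }
  ]
}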
In the examples below, the same document is indexed after each pipeline is configured.
The Base64 content is too large to display here or to paste into a typical curl/Sense request, so create a JSON file with the required field.
{
"my_field" : "/9j/4gIcSUNDX1BST0.....<insert base64 encoded here, see image.base64.txt>"
}
Add the document
curl -XPUT $ES_HOST:9200/my-index/my-type/1?pipeline=aws-rekognition-pipeline -d @doc.json
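You can also dry-run a pipeline without indexing anything by using the ingest simulate API (the Base64 content is abbreviated again):
POST _ingest/pipeline/aws-rekognition-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "my_field": "/9j/4gIcSUNDX1BST0.....<insert base64 encoded here, see image.base64.txt>"
      }
    }
  ]
}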
Detecting Objects
PUT _ingest/pipeline/aws-rekognition-pipeline
{
"description": "A pipeline to test AWS Rekognition",
"processors": [
{
"detect-objects": {
"field": "my_field"
}
}
]
}
Result
{
"_index": "my-index",
"_type": "my-type",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"my_field_objects": [
"Human",
"People",
"Person",
"Poster",
"Brochure",
"Flyer",
"Paper",
"Collage",
"Art",
"Head"
]
}
}
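Once indexed, the detected labels can be searched like any other field. A minimal query sketch (the exact query type depends on how my_field_objects is mapped):
GET my-index/_search
{
  "query": {
    "match": {
      "my_field_objects": "Person"
    }
  }
}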
Detecting Celebrities
PUT _ingest/pipeline/aws-rekognition-pipeline
{
"description": "A pipeline to test AWS Rekognition",
"processors": [
{
"detect-celebrities": {
"field": "my_field"
}
}
]
}
Result
{
"_index": "my-index",
"_type": "my-type",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"my_field_celebrities": {
"unknownFaces": 1,
"celebrityFaces": [
{
"name": "Elvis Presley",
"id": "tX3Fw0h"
}
]
}
}
}
Detecting Text
PUT _ingest/pipeline/aws-rekognition-pipeline
{
"description": "A pipeline to test AWS Rekognition",
"processors": [
{
"detect-text": {
"field": "my_field"
}
}
]
}
Result
{
"_index": "my-index",
"_type": "my-type",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"my_field_text": [
"PARAMOUNT PRESENTS ELVIS",
"PRESLEY",
"\"HAL WALLIS",
"ING",
"RELE",
"CAROLYN JONES .WALTER MATTHAU DOLORES HART. .DEAN JAGGER-VIC MORROW",
"PAUL STEWART VINCENTE GAZZO",
"DIRECTED BY MICHAEL CURTIZ SCREENPLAY YBY HERBERT RT BAKER AND MICHAEL",
"PARAMOUNT",
"PRESENTS",
"ELVIS",
"PRESLEY",
"\"HAL",
"WALLIS",
"ING",
"RELE",
"CAROLYN",
"JONES",
".WALTER",
"MATTHAU",
"DOLORES",
"HART.",
".DEAN",
"JAGGER-VIC",
"MORROW",
"PAUL STEWART",
"DIRECTED",
"BY MICHAEL",
"CURTIZ",
"SCREENPLAY",
"YBY HERBERT",
"RT BAKER",
"AND",
"MICHAEL",
"VINCENTE GAZZO"
]
}
}
Detecting Unsafe Content
PUT _ingest/pipeline/aws-rekognition-pipeline
{
"description": "A pipeline to test AWS Rekognition",
"processors": [
{
"detect-unsafe-content": {
"field": "my_field"
}
}
]
}
Result
{
"_index": "my-index",
"_type": "my-type",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"my_field_unsafe": []
}
}
Elvis is safe content!
Using another image with known unsafe content
curl -XPUT $ES_HOST:9200/my-index/my-type/2?pipeline=aws-rekognition-pipeline -d @unsafe.json
Result
{
"_index": "my-index",
"_type": "my-type",
"_id": "2",
"_version": 1,
"found": true,
"_source": {
"my_field_unsafe": [
"Explicit Nudity",
"Nudity"
]
}
}