Indra is an efficient library and service that delivers word embeddings and semantic relatedness to real-world applications in machine learning and natural language processing. It offers 60+ pre-built models in 15 languages, built with several model algorithms and corpora.
Indra is powered by spotify-annoy, which provides an efficient approximate nearest neighbors function.
- Efficient approximate nearest neighbors (powered by spotify-annoy);
- 60+ pre-built models in 15 languages;
- Permissive license for commercial use (MIT License);
- Support for translated distributional relatedness;
- Easy deployment: set up the infrastructure in 3 steps;
- Access to the semantic models as a service;
- Supports multiple distributional semantic models and distance measures.
Indra delivers ready-to-use pre-built models covering different algorithms, corpora and languages. For the full list of pre-built models, please check the Wiki.
- Word2Vec (W2V)
- Global Vectors (GloVe)
- Explicit Semantic Analysis (ESA)
- Dependency-Based Word Embeddings
- Latent Semantic Analysis (LSA)
- EN - English
- DE - German
- ES - Spanish
- FR - French
- PT - Portuguese
- IT - Italian
- SV - Swedish
- ZH - Chinese
- NL - Dutch
- RU - Russian
- KO - Korean
- JA - Japanese
- AR - Arabic
- FA - Persian
- EL - Greek
To install, please use the 3-step tool IndraComposed.
This guide provides the basic instructions to get you started with Indra. For further details, including the response format, additional parameters and the list of available models and languages, please check the Wiki.
```json
{
  "corpus": "googlenews",
  "model": "W2V",
  "language": "EN",
  "terms": ["love", "mother", "santa claus"]
}
```
For further details, check the Word Embeddings documentation.
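As a minimal sketch, the same request can be issued from Python using only the standard library (the `/vectors` route is the one shown in the cURL demonstration below; substitute your own deployment URL as needed):

```python
import json
import urllib.request

# Payload mirroring the /vectors example above.
payload = {
    "corpus": "googlenews",
    "model": "W2V",
    "language": "EN",
    "terms": ["love", "mother", "santa claus"],
}

def post_json(url, body):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Against a live server, this would return the word vectors:
# vectors = post_json("http://indra.lambda3.org/vectors", payload)
```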
```json
{
  "corpus": "googlenews",
  "model": "W2V",
  "language": "EN",
  "topk": 10,
  "terms": ["love", "mother", "santa"]
}
```
For further details, check the Nearest Neighbors documentation.
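A sketch of building the same nearest-neighbors request in Python; the URL below is a hypothetical placeholder for illustration only (the source only documents `/vectors` and `/relatedness` routes, so check the Wiki for the exact nearest-neighbors path):

```python
import json
import urllib.request

# Nearest-neighbors payload from the example above; "topk" bounds the
# number of neighbors returned for each term.
payload = {
    "corpus": "googlenews",
    "model": "W2V",
    "language": "EN",
    "topk": 10,
    "terms": ["love", "mother", "santa"],
}

# Hypothetical route -- replace with the path given in the Wiki.
url = "http://indra.lambda3.org/neighbors"

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would perform the POST against a live server.
```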
```json
{
  "corpus": "googlenews",
  "model": "W2V",
  "language": "EN",
  "topk": 10,
  "scoreFunction": "COSINE",
  "terms": ["love", "mother", "santa"]
}
```
For further details, check the Nearest Neighbors documentation.
```json
{
  "corpus": "wiki-2018",
  "model": "W2V",
  "language": "EN",
  "scoreFunction": "COSINE",
  "pairs": [
    {
      "t2": "love",
      "t1": "mother"
    },
    {
      "t2": "love",
      "t1": "santa claus"
    }
  ]
}
```
For further details, check the Semantic Similarity documentation.
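The `COSINE` score function is the standard cosine similarity between the two term vectors. As a sketch with toy 3-dimensional vectors standing in for real embeddings:

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product over the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy vectors, not real Indra embeddings.
v_love = [0.2, 0.8, 0.1]
v_mother = [0.25, 0.7, 0.2]
print(round(cosine(v_love, v_mother), 3))  # prints 0.985
```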
```json
{
  "corpus": "wiki-2018",
  "model": "W2V",
  "language": "EN",
  "scoreFunction": "COSINE",
  "one": "love",
  "many": ["mother", "father", "child"]
}
```
For further details, check the Semantic Similarity documentation.
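The one-to-many form is shorthand for scoring `"one"` against each entry of `"many"`; as a sketch, it expands into the pairwise payload of the previous example:

```python
one = "love"
many = ["mother", "father", "child"]

# Build the explicit pair list equivalent to the one-to-many request.
pairs = [{"t1": term, "t2": one} for term in many]
print(pairs)
# prints [{'t1': 'mother', 't2': 'love'}, {'t1': 'father', 't2': 'love'},
#         {'t1': 'child', 't2': 'love'}]
```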
For translated word embeddings and translated semantic similarity, simply add `"mt": true` to the JSON payload.
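For example, adding the flag to the pairwise relatedness payload above gives:

```json
{
  "corpus": "wiki-2018",
  "model": "W2V",
  "language": "EN",
  "scoreFunction": "COSINE",
  "mt": true,
  "pairs": [
    {
      "t2": "love",
      "t1": "mother"
    }
  ]
}
```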
We host a public endpoint for demonstration purposes only, so you can try Indra right now with cURL from the command line.
```shell
curl -X POST -H "Content-Type: application/json" -d '{
  "corpus": "wiki-2018",
  "model": "W2V",
  "language": "EN",
  "terms": ["love", "mother", "santa claus"]
}' "http://indra.lambda3.org/vectors"
```
```shell
curl -X POST -H "Content-Type: application/json" -d '{
  "corpus": "wiki-2018",
  "model": "W2V",
  "language": "EN",
  "scoreFunction": "COSINE",
  "pairs": [{
    "t2": "love",
    "t1": "mother"
  },
  {
    "t2": "love",
    "t1": "santa claus"
  }]
}' "http://indra.lambda3.org/relatedness"
```
Please cite Indra if you use it in your experiments or projects.
```bibtex
@InProceedings{indra,
  author    = {Sales, Juliano Efson and Souza, Leonardo and Barzegar, Siamak and Davis, Brian and Freitas, Andr{\'e} and Handschuh, Siegfried},
  title     = {Indra: A Word Embedding and Semantic Relatedness Server},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  month     = {May},
  year      = {2018},
  address   = {Miyazaki, Japan},
  publisher = {European Language Resources Association (ELRA)},
}
```
- Andre Freitas
- Brian Davis
- Juliano Sales
- Leonardo Souza
- Siamak Barzegar
- Siegfried Handschuh