Question-Answering over Knowledge Graphs

Preparation

  1. Prepare a Python 3 environment with Conda installed
  2. Clone this repo
    git clone https://github.com/librairy/MuHeQA.git
    
  3. Move into the root directory.
    cd MuHeQA
    
  4. Download the RDF Verbalizer model into the application/summary/kg/nlg/model folder
    wget -O application/summary/kg/nlg/model/pytorch_model.bin https://delicias.dia.fi.upm.es/nextcloud/index.php/s/bRxnH93Df9Psaeo/download
    
  5. Download the answer classifier and unzip it into the project root directory. This creates the resources_dir/ folder.
    wget -O resources.zip https://delicias.dia.fi.upm.es/nextcloud/index.php/s/Jp5FeoBn57c8k4M/download
    unzip resources.zip
    
  6. Install the Wikidata knowledge base used for entity linking:
    python -m spacy_entity_linker "download_knowledge_base"
    
  7. Install the dependencies (on a device with Apple's M1 chip, skip this step and follow the M1 Environments section instead):
    pip install -r requirements.txt
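
Before starting the service, it can help to confirm that the downloads from steps 4 and 5 landed where the application expects them. The following is a minimal sketch, assuming it is run from the repository root; the two paths are taken from the steps above, while the helper name `missing_resources` is hypothetical and not part of the project.

```python
from pathlib import Path

# Paths taken from preparation steps 4 and 5 (relative to the repository root).
EXPECTED = [
    Path("application/summary/kg/nlg/model/pytorch_model.bin"),
    Path("resources_dir"),
]


def missing_resources(root=".", expected=EXPECTED):
    """Return the expected paths (POSIX style) that do not exist under root."""
    base = Path(root)
    return [p.as_posix() for p in expected if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_resources()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All downloaded resources are in place.")
```

An empty list means both the RDF Verbalizer weights and the unzipped resources folder were found.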

M1 Environments (only for Apple M1 devices)

  1. Install the Apple edition of TensorFlow:
    pip install --upgrade --force --no-dependencies tensorflow-macos
    pip install --upgrade --force --no-dependencies tensorflow-metal
    
  2. Compile and install the tokenizers module from Hugging Face:
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    cd /Users/cbadenes/Projects
    git clone https://github.com/huggingface/tokenizers
    cd tokenizers/bindings/python
    pip install setuptools_rust
    python setup.py install
    
  3. Compile and install the transformers module from Hugging Face:
    pip install git+https://github.com/huggingface/transformers
    
  4. Finally, install the remaining dependencies:
    pip install -r min-requirements.txt
    

Service start-up

  1. Once the environment is ready, start the service with the following command (use runserver for development mode or runprodserver for production mode):

    python manage.py runserver
    
  2. Loading the external resources may take a few minutes. The service is ready once the following log output appears:

    Loading RDF2nlg model: /Users/cbadenes/Projects/muheqa/application/summary/kg/nlg/model ..
    model ready
    Linked to DBpedia(en): http://dbpedia.org/sparql
    Linked to Wikidata (en): http://query.wikidata.org/sparql
    Ready to answer question from the English edition of CORD-19 collection
    Loading bert-large-uncased-whole-word-masking-finetuned-squad model..
    model ready
    Loading deepset/roberta-base-squad2-covid model..
    model ready
    Loading deepset/roberta-base-squad2 model..
    model ready
    English answerer is ready
     * Serving Flask app "application.app" (lazy loading)
     * Environment: production
       WARNING: This is a development server. Do not use it in a production deployment.
       Use a production WSGI server instead.
     * Debug mode: off
     * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
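
Because the models can take several minutes to load, a script that depends on the service may want to wait until the port accepts connections. The sketch below polls the address shown in the log above (127.0.0.1:5000). The function name `wait_for_server` and its timeout values are illustrative assumptions, and an open port only indicates that the Flask server is listening.

```python
import socket
import time


def wait_for_server(host="127.0.0.1", port=5000, timeout=600.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable): wait and retry.
            time.sleep(interval)
    return False
```

Calling `wait_for_server()` before sending the first question avoids connection errors while the models are still loading.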
    

Server routes

Send the natural language question in the question field of the message body. The evidence query parameter controls whether the generated summary is returned along with the answer.

The available URIs are:

  • /muheqa/dbpedia: answers questions using the English edition of DBpedia.
  • /muheqa/wikidata: answers questions using the English edition of Wikidata.
  • /muheqa/cord19: answers questions using the Covid-19 Open Research Dataset.
  • /muheqa/all: answers questions using all sources of information.

Example

To answer the question Where was Fernando Alonso born? using DBpedia:

    curl --location --request GET 'http://127.0.0.1:5000/muheqa/dbpedia/en?evidence=false' --form 'question="Where was Fernando Alonso born?"'
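
The same request can be sketched in Python with only the standard library. This mirrors the curl call above: a GET request whose body is a multipart form with a single question field. The helper names (`build_url`, `encode_form`, `ask`) are hypothetical, and the `/en` path suffix is copied from the example URL rather than from the route list, so treat it as an assumption.

```python
import json
import urllib.parse
import urllib.request
import uuid

BASE_URL = "http://127.0.0.1:5000"  # address printed in the start-up log


def build_url(source, evidence=False, lang="en"):
    """Build the endpoint URL; the /<lang> suffix follows the curl example (assumption)."""
    query = urllib.parse.urlencode({"evidence": str(evidence).lower()})
    return f"{BASE_URL}/muheqa/{source}/{lang}?{query}"


def encode_form(fields, boundary):
    """Encode plain text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            "",
            value,
        ]
    lines += [f"--{boundary}--", ""]
    return "\r\n".join(lines).encode("utf-8")


def ask(question, source="dbpedia", evidence=False):
    """Send the question to the running service and return the parsed JSON response."""
    boundary = uuid.uuid4().hex
    request = urllib.request.Request(
        build_url(source, evidence),
        data=encode_form({"question": question}, boundary),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="GET",  # the curl example sends the form with GET
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the service running, `ask("Where was Fernando Alonso born?")` would issue the request shown in the curl example, and passing `source="wikidata"` targets a different knowledge source.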

The service responds with:

{
  "answer": "Oviedo, Asturias, Spain",
  "confidence": 0.801,
  "evidence": {
    "end": 149,
    "summary": "  The car number of Fernando Alonso is 14.   The Last win of Fernando Alonso is 2013.   The birth place of Fernando Alonso is Oviedo, Asturias, Spain.   The name of Fernando Alonso is Fernando Alonso.   The First win of Fernando Alonso is 2003.   The last season of Fernando Alonso is 2018.   The birth name of Fernando Alonso is Fernando Alonso D\u00edaz.   The caption of Fernando Alonso is Alonso in 2016.   The First race of Fernando Alonso is 2001.   The image size of Fernando Alonso is 240.   The last win of Fernando Alonso is 2013 Spanish Grand Prix.   The nationality of Fernando Alonso is Spanish.   The title of Fernando Alonso is Fernando Alonso achievements, Fernando Alonso teams and series.   The first race of Fernando Alonso is 2001 Australian Grand Prix.   The 2021 Team of Fernando Alonso is Alpine F1, Renault in Formula One.   The source  of Fernando Alonso is Alonso's race engineer at Ferrari, Andrea Stella, on Alonso's ability and similarities to Michael Schumacher.   The first win of Fernando Alonso is 2003 Hungarian Grand Prix.  .  ",
    "start": 126
  },
  "question": "where was Fernando Alonso born?"
}
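
In this sample response, start and end appear to be character offsets into the evidence summary: slicing the summary from 126 to 149 yields the answer text. A minimal sketch of that lookup follows. The function name `answer_span` is hypothetical, and the sample below is a shortened copy of the response above.

```python
def answer_span(response):
    """Return the slice of the evidence summary marked by start/end, or None without evidence."""
    evidence = response.get("evidence")
    if not evidence:
        return None
    return evidence["summary"][evidence["start"]:evidence["end"]]


# Shortened copy of the sample response (first three sentences of the summary).
sample = {
    "answer": "Oviedo, Asturias, Spain",
    "evidence": {
        "start": 126,
        "end": 149,
        "summary": (
            "  The car number of Fernando Alonso is 14."
            "   The Last win of Fernando Alonso is 2013."
            "   The birth place of Fernando Alonso is Oviedo, Asturias, Spain."
        ),
    },
}

print(answer_span(sample))
```

This makes it possible to highlight the supporting sentence in the summary. Whether the offsets behave the same way for every source is not stated here, so check them against real responses.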