LiveQA submission for TREC-2016

Introduction

This project is based on the LiveQA track of TREC 2016. At its core, it uses Latent Dirichlet Allocation (LDA) to infer semantic topics, and uses the resulting model to construct a probability distribution for each document retrieved from the knowledge base. Finally, the Jensen-Shannon Distance (JSD) is computed as a similarity measure, and the most similar candidate is returned as the answer. The knowledge base currently used is the Yahoo Answers database.
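The selection step described above can be illustrated with a minimal sketch. It is not the project's actual code: it assumes that each text has already been mapped to a topic distribution by a trained LDA model, and that the question's distribution is compared against each candidate's. The function names (`kl_divergence`, `jensen_shannon_distance`, `select_answer`) and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms where p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base 2), in [0, 1]."""
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def select_answer(question_topics, candidates):
    """Return the (answer, distance) pair closest to the question.

    `candidates` is a list of (answer_text, topic_distribution) pairs.
    """
    scored = [
        (answer, jensen_shannon_distance(question_topics, topics))
        for answer, topics in candidates
    ]
    return min(scored, key=lambda pair: pair[1])


# Hypothetical topic distributions (assumed output of an LDA model, 3 topics).
question = [0.7, 0.2, 0.1]
candidates = [
    ("Answer A", [0.1, 0.1, 0.8]),
    ("Answer B", [0.6, 0.3, 0.1]),
    ("Answer C", [0.3, 0.4, 0.3]),
]

best_answer, best_distance = select_answer(question, candidates)
print(best_answer)
```

A smaller distance means the two topic distributions are more alike, so the candidate with the minimum distance is treated as the most similar answer. Using base-2 logarithms keeps the distance between 0 (identical distributions) and 1 (no shared topics).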

Leverages on:

Future Work

  • Add more resources other than YahooAnswers.
  • Improve query construction when searching for candidate question/answer tuples.
  • Add more similarity metrics (aggregation, semantic).
  • Improve NLP processing.
  • Add multi-document summarization when possible.

References