WikiQA Corpus for Question Answering #31
Comments
Thanks @aneesh-joshi, useful dataset 👍 (don't forget to raise issues for the other datasets that you used for evaluation)
How can we use this dataset in a question answering system?
@anagha1198 If you group by the first column (i.e., group all the rows with the same Question ID), you will get all the corresponding document IDs.
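A minimal sketch of that grouping, in case it helps. The rows below are made up for illustration and are not taken from the corpus, and the column positions (question ID in column 0, document ID in column 1) are assumptions; check them against the actual .tsv file before relying on this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows for illustration only (not real corpus data).
# Assumed layout: column 0 = question ID, column 1 = document ID.
SAMPLE_TSV = "Q1\tD1\nQ1\tD2\nQ2\tD3\n"


def group_by_first_column(lines, value_col=1):
    """Map each value in column 0 to the list of values found in value_col."""
    groups = defaultdict(list)
    # QUOTE_NONE keeps quote characters inside the text from confusing the parser.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:  # skip blank lines
            continue
        groups[row[0]].append(row[value_col])
    return dict(groups)


grouped = group_by_first_column(io.StringIO(SAMPLE_TSV))
print(grouped)
```

For a real file you would pass an open file handle instead of the `io.StringIO` object, and skip the header row if the file has one.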
Link: https://download.microsoft.com/download/E/5/F/E5FCFCEE-7005-4814-853D-DAA7C66507E0/WikiQACorpus.zip
Paper: https://aclweb.org/anthology/D15-1237
Description:
WikiQA is a well-studied dataset for QA systems. It has a predefined train/dev/test split and comes in .tsv and .txt formats.
Basically, there is a question (q), and for every question there are several candidate documents (d1, d2, ...). Each question-document pair has a relevance value: 1 for relevant, 0 for not relevant.
Here is an example from the dataset:
I also provide a data reader which will make the dataset easily available for use.
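To give a rough idea of what reading the data can look like, here is a small sketch. It is not the data reader mentioned above, only an illustration. The header names `Question`, `Sentence`, and `Label` are assumptions about the .tsv layout and should be verified against the file, and the sample rows are invented.

```python
import csv
import io


def read_qa_pairs(lines, q_col="Question", d_col="Sentence", label_col="Label"):
    """Yield (question, candidate, label) triples, with label as an int (0 or 1).

    The default column names are assumptions; pass the real header names
    if the file uses different ones.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row[q_col], row[d_col], int(row[label_col])


# Invented rows for illustration only (not real corpus data).
SAMPLE = (
    "Question\tSentence\tLabel\n"
    "what is a tsv file\tA TSV file stores tab-separated values.\t1\n"
    "what is a tsv file\tCats are mammals.\t0\n"
)

pairs = list(read_qa_pairs(io.StringIO(SAMPLE)))
# Keep only the candidates marked relevant (label 1).
relevant = [candidate for _, candidate, label in pairs if label == 1]
print(relevant)
```

A ranking model would typically score every candidate for a question and be evaluated on whether the candidates labelled 1 come out on top.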