deba-iitbh/TREC-CT_2022_QSA

TREC-CT_2022_QSA

Query Summarized Attention for TREC-CT 2022 Pipeline

BM25

Install the Elasticsearch package from your distribution's repository, then start the service and run the BM25 script:

sudo systemctl start elasticsearch
python BM25.py

This saves the BM25.txt file in the input folder.
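For readers unfamiliar with the first-stage ranker, here is a minimal pure-Python sketch of BM25 scoring. It is not the code in BM25.py, which relies on Elasticsearch; the function name `bm25_scores`, the toy documents, and the parameter values `k1=1.2` and `b=0.75` (Elasticsearch's defaults) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Lucene-style IDF, always non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, whitespace-tokenized.
docs = [
    "phase 2 trial of metformin in type 2 diabetes".split(),
    "study of asthma inhalers in children".split(),
    "diabetes diet and exercise intervention".split(),
]
query = "type 2 diabetes".split()
scores = bm25_scores(query, docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Here `ranked` lists document indices from most to least relevant; a document sharing no terms with the query scores zero.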

MonoBERT

Two BERT architectures are used for reranking.

BERT_CAT Architecture

The query and document are concatenated and passed together through the transformer, which scores their semantic similarity.

To train the monoBERT model, run:

python monobert.py
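The BERT_CAT input described above can be sketched without any model code. The snippet below only shows how a query and a document might be joined into one sequence with segment ids; the function name `build_cat_input`, the document-only truncation policy, and the toy tokens are assumptions, and monobert.py may prepare its inputs differently.

```python
def build_cat_input(query_tokens, doc_tokens, max_len=512):
    """Build the [CLS] query [SEP] document [SEP] sequence and its segment ids."""
    # Reserve three positions for [CLS] and the two [SEP] tokens.
    budget = max_len - len(query_tokens) - 3
    # Assumption: only the document is truncated, never the query.
    doc_tokens = doc_tokens[:max(budget, 0)]
    tokens = ["[CLS]"] + query_tokens + ["[SEP]"] + doc_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the query, and the first [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(query_tokens) + 2) + [1] * (len(doc_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_cat_input(
    ["diabetes", "patient"], ["metformin", "trial", "for", "adults"], max_len=8
)
```

With `max_len=8` the last document token is dropped so the sequence fits. In a cross-encoder of this kind, the relevance score is typically read from a classification head on the [CLS] position.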

BERT_DOT Architecture

The document and query embeddings are produced separately, and their dot product gives the relevance score.

To train the ColBERT model, run:

python colbert.py
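As a rough illustration of the BERT_DOT scoring step, the sketch below ranks documents by the dot product of precomputed embeddings. It follows the single-vector description given above and makes no claim about what colbert.py does internally; the helper names, document ids, and 3-dimensional vectors are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def rank_by_dot(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by descending dot-product score."""
    scored = [(doc_id, dot(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings; BERT-base encoders produce 768-dimensional vectors.
query_vec = [1.0, 0.0, 2.0]
doc_vecs = {
    "doc_a": [0.5, 1.0, 0.0],
    "doc_b": [1.0, 0.0, 1.0],
    "doc_c": [0.0, 2.0, 0.5],
}
ranking = rank_by_dot(query_vec, doc_vecs)
```

Because the document vectors do not depend on the query, they can be computed once and reused across queries, which is the main practical advantage over BERT_CAT.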

BERT Models

  • SciBERT - Pretrained on scientific text/literature
  • BlueBERT - Pretrained on the PubMed dataset
  • BioClinicalBERT - Pretrained on the MIMIC-III dataset

Change the model name in the config file.

Final Ranking

After BERT reranking, consolidate all the scores by running:

python output-consolidation.py
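The README does not say how the scores are combined, so the following is only one plausible sketch: it sums the scores each run assigns to a (topic, document) pair and emits lines in the standard TREC run format (`topic Q0 doc_id rank score tag`). The function name `consolidate`, the summation rule, the run tag, and the toy scores are all assumptions, not the logic of output-consolidation.py.

```python
from collections import defaultdict


def consolidate(runs, run_tag="consolidated"):
    """Sum per-document scores across runs and return TREC run-format lines."""
    totals = defaultdict(lambda: defaultdict(float))
    for run in runs:
        for (topic_id, doc_id), score in run.items():
            totals[topic_id][doc_id] += score
    lines = []
    # Topic ids are strings here, so they sort lexicographically.
    for topic_id in sorted(totals):
        ranked = sorted(totals[topic_id].items(), key=lambda kv: kv[1], reverse=True)
        for rank, (doc_id, score) in enumerate(ranked, start=1):
            lines.append(f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}")
    return lines


# Hypothetical scores from two rerankers, keyed by (topic_id, doc_id).
run_a = {("1", "doc_a"): 0.9, ("1", "doc_b"): 0.2}
run_b = {("1", "doc_a"): 0.1, ("1", "doc_b"): 0.7, ("2", "doc_c"): 0.5}
lines = consolidate([run_a, run_b])
```

Summing raw scores only makes sense when the runs are on comparable scales; otherwise a normalization step or a rank-based fusion would be needed first.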

Metric

The final re-ranked output is stored in output/final.txt. Check the metrics by running:

trec_eval -m "trec_official" input/qrels2022.txt output/final.txt
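To make the evaluation step concrete, here is a small sketch of one common ranking measure, precision at k, computed from qrels lines in the usual `topic_id iteration doc_id relevance` layout. It is not a replacement for trec_eval; the helper names, the relevance threshold `min_rel=1`, and the toy judgments are assumptions.

```python
def load_qrels(lines):
    """Parse qrels lines of the form 'topic_id iteration doc_id relevance'."""
    qrels = {}
    for line in lines:
        topic_id, _, doc_id, rel = line.split()
        qrels.setdefault(topic_id, {})[doc_id] = int(rel)
    return qrels


def precision_at_k(ranked_doc_ids, judged, k=10, min_rel=1):
    """Fraction of the top-k documents judged relevant; unjudged counts as non-relevant."""
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if judged.get(doc_id, 0) >= min_rel)
    return hits / k


# Hypothetical judgments and ranking for a single topic.
qrels = load_qrels(["1 0 doc_a 2", "1 0 doc_b 0", "1 0 doc_c 1"])
ranking = ["doc_a", "doc_b", "doc_d", "doc_c"]
p_at_4 = precision_at_k(ranking, qrels["1"], k=4)
```

In this toy case two of the top four documents are judged relevant, so `p_at_4` is 0.5.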
