In this repo, we conduct a preliminary analysis of different methods for the Recognizing Textual Entailment (RTE) task in Portuguese. We use the ASSIN-2 dataset as a benchmark to evaluate our models. Our work combines several text representation approaches, including bag of words and word embeddings, with machine learning models. We also present a rule-based approach. Our best result was achieved by the BERTimbau-large model fine-tuned on ASSIN-2, which attained an F1 score of 0.89, about one percentage point below the current state of the art. Our ongoing experiments aim to combine these approaches to leverage their full potential.
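To make the bag-of-words and rule-based ideas concrete, here is a minimal sketch of a token-overlap baseline. It is not the implementation used in this repo: the function names, the `threshold` value of 0.8, and the example sentences are all hypothetical and chosen only for illustration (the sentences are not taken from ASSIN-2).

```python
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Lowercase the sentence and count its word tokens."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def hypothesis_coverage(premise: str, hypothesis: str) -> float:
    """Return the fraction of hypothesis tokens that also occur in the premise."""
    premise_bow = bag_of_words(premise)
    hypothesis_bow = bag_of_words(hypothesis)
    total = sum(hypothesis_bow.values())
    if total == 0:
        return 0.0
    # Count each hypothesis token at most as often as it appears in the premise.
    overlap = sum(
        min(count, premise_bow[token]) for token, count in hypothesis_bow.items()
    )
    return overlap / total


def predict_entailment(premise: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """Hypothetical rule: predict entailment when coverage reaches the threshold."""
    return hypothesis_coverage(premise, hypothesis) >= threshold


# Illustrative sentence pairs (invented, not from ASSIN-2).
premise = "Um homem está tocando violão na rua"
entailed = "Um homem está tocando violão"
not_entailed = "Uma mulher está dormindo"

print(predict_entailment(premise, entailed))
print(predict_entailment(premise, not_entailed))
```

A rule like this ignores word order, negation, and synonymy, so it is best read as a simple point of comparison rather than a competitive model.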
This repo was created as a course activity for the Natural Language Processing class at ICMC-USP.