POeTiSA is a long-term project that develops syntax-based resources and tools for Brazilian Portuguese, with the goal of achieving state-of-the-art results.
- Corpus Production: Creating a large, comprehensive multi-genre corpus annotated with part-of-speech tags and syntactic structure following Universal Dependencies.
  - Genres: news texts and user-generated content (tweets and online comments).
- Model Training: Investigating recent neural and distributional-based methods for training robust parsing models for Portuguese.
- Opinion Mining: Using syntactic knowledge for various applications:
  - Opinion summarization
  - Helpfulness prediction
  - Aspect identification
  - Deception detection
  - Emotion classification
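
Universal Dependencies treebanks are conventionally exchanged in the CoNLL-U format: one token per line with ten tab-separated columns, and a blank line between sentences. The sketch below shows how such annotations could be read with plain Python. It assumes the POeTiSA corpora are distributed as CoNLL-U, which this README does not state. The sample sentence, its annotations, and the names `Token`, `parse_conllu`, `ROWS`, and `SAMPLE` are illustrative and not taken from the project.

```python
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    """A subset of the ten CoNLL-U columns (hypothetical helper type)."""
    id: str
    form: str
    lemma: str
    upos: str
    head: str
    deprel: str


def parse_conllu(text: str) -> Iterator[List[Token]]:
    """Yield each sentence in a CoNLL-U string as a list of Token."""
    sentence: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# sent_id = ..." are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        sentence.append(
            Token(cols[0], cols[1], cols[2], cols[3], cols[6], cols[7])
        )
    if sentence:
        # Handle a final sentence that is not followed by a blank line.
        yield sentence


# Illustrative sentence, not drawn from the POeTiSA corpus.
ROWS = [
    ["1", "O", "o", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "modelo", "modelo", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "analisa", "analisar", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", "frases", "frase", "NOUN", "_", "_", "3", "obj", "_", "_"],
    ["5", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
]
SAMPLE = "# sent_id = example-1\n" + "\n".join("\t".join(r) for r in ROWS) + "\n\n"

if __name__ == "__main__":
    for sent in parse_conllu(SAMPLE):
        for tok in sent:
            print(tok.id, tok.form, tok.upos, tok.head, tok.deprel)
```

The sketch keeps only six of the ten columns and does not treat multiword-token ranges (IDs such as `1-2`) or empty nodes specially; an established CoNLL-U reader may be a better fit for real corpus files.
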
This project is part of the Natural Language Processing initiative (NLP²) of the Center for Artificial Intelligence (C4AI) at the University of São Paulo. Sponsored by IBM and FAPESP (grant #2019/07665-4), the center is a part of the FAPESP Engineering Research Centers Program and is committed to state-of-the-art research in Artificial Intelligence, exploring both foundational issues and applied research.
See the web portal of NLP² at this link.
The POeTiSA initiative is also supported by the Ministry of Science, Technology and Innovation, with resources from Law n. 8,248, of October 23, 1991, under the PPI-SOFTEX, coordinated by Softex and published as Residency in TIC 13, DOU 01245.010222/2022-44. The project also benefits from an additional research grant for a related project coordinated by Prof. Ivandré Paraboni (FAPESP #2021/08213-0).
The DANTE project is a specific initiative within POeTiSA, focusing on the creation of corpora on various topics. This involves compiling and annotating texts from multiple genres and sources to build a rich dataset that can be used for training and evaluating NLP models.
We welcome contributions from the community! Please see our contribution guidelines for more information on how you can help.
This project is licensed under the MIT License - see the LICENSE file for details.
For more information, please contact us at email@example.com.