Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG), described in our ArabicNLP 2023 paper: OCTOPUS: A Multitask Model and Toolkit for Arabic Natural Language Generation.
Octopus is designed for eight machine generation tasks, including diacritization, grammatical error correction, news headline generation, paraphrasing, question answering, question generation, and transliteration. The package includes a Python library along with associated command-line scripts.
- To install octopus directly from the GitHub repo using pip:
pip install -U git+https://github.com/UBC-NLP/octopus.git
- To install octopus and develop locally:
git clone https://github.com/UBC-NLP/octopus.git
cd octopus
pip install .
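After either install route, it can be useful to confirm that pip actually registered the package. The snippet below is a minimal sketch using only the Python standard library. The helper name `installed_version` is ours, not part of OCTOPUS, and the distribution name `"octopus"` is an assumption based on the repository name; the name declared in the repo's setup files may differ.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # ASSUMPTION: the distribution is registered under the name "octopus".
    # If this reports "not installed" right after a successful pip run,
    # check the name declared in the repository's setup files.
    version = installed_version("octopus")
    print(f"octopus: {version or 'not installed'}")
```

The check reads installed package metadata rather than importing the library, so it does not load any models and runs quickly.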
The full documentation contains instructions for getting started, generating text using different methods, and integrating OCTOPUS with your code, and provides more examples.
Command | Content | Colab link |
octopus | | |
octopus_interactive | | |

Functions | Content | Colab link |
generate, generate_from_file | | |
octopus(-py) is Apache-2.0 licensed. The license applies to the pre-trained models as well.
If you use the OCTOPUS toolkit or the pre-trained models in a scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated):
@misc{elmadany2023octopus,
title={Octopus: A Multitask Model and Toolkit for Arabic Natural Language Generation},
author={AbdelRahim Elmadany and El Moatez Billah Nagoudi and Muhammad Abdul-Mageed},
year={2023},
eprint={2310.16127},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC; RGPIN-2018-04267), the Social Sciences and Humanities Research Council of Canada (SSHRC; 435-2018-0576; 895-2020-1004; 895-2021-1008), Canadian Foundation for Innovation (CFI; 37771), ComputeCanada (CC), UBC ARC-Sockeye and Advanced Micro Devices, Inc. (AMD). Any opinions, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of NSERC, SSHRC, CFI, CC, AMD, or UBC ARC-Sockeye.