Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
Updated Aug 26, 2023 - Python
Hrrformer: A Neuro-symbolic Self-attention Model (ICML23)
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
HGConv: Holographic Global Convolutional Networks
Streamlined variant of Long-Range Arena with pinned dependencies, automated data downloads, and deterministic shuffling.
The PyTorch implementation of Paramixer.
The PyTorch implementation of Linformer.