An LLM continually pre-trained specifically for Danish.

nlpnorth/snakmodel

SnakModel is a 7B-parameter autoregressive language model designed specifically for Danish. It is available as an instruction-tuned variant and as a base version for further fine-tuning. Our models build upon Llama 2, which we continually pre-train on a diverse collection of Danish corpora comprising 350M documents and 13.6B words, before instruction-tuning it on 3.7M Danish instruction-answer pairs.
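
Since the models build on Llama 2, a checkpoint can presumably be loaded like any other causal language model. The snippet below is a minimal sketch, not an official usage guide: it assumes the weights are published in a format readable by the Hugging Face `transformers` library, and the helper name `generate_danish` and its `model_id` argument are hypothetical placeholders, since this README does not state where the checkpoints are hosted.

```python
def generate_danish(model_id, prompt, max_new_tokens=50):
    """Generate a continuation of `prompt` with a causal LM checkpoint.

    `model_id` is a placeholder: pass the identifier or local path of a
    SnakModel checkpoint (assumed to be loadable by `transformers`).
    """
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies (transformers, torch) to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate new tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt plus continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `generate_danish("<model-id>", "København er")`, with `<model-id>` replaced by the actual checkpoint location. The base version is the natural choice for plain continuation as shown here; the instruction-tuned variant may expect a specific prompt template, which this README does not document.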

Developers

🧭 NLPnorth research unit at the IT University of Copenhagen, Denmark.

Resources

  • 💬 SnakModeller:
  • ⚙️ Model Training Dynamics:
    • Research Paper: coming in Q1 2025.
    • Codebase: coming soon to this repository.
  • 🇩🇰 Cultural Awareness Evaluation:
    • Research Paper: coming in Q1 2025 (pre-print coming soon).
    • Codebase: coming soon to this repository.
    • Web-based LLM Evaluation Interface: coming soon.
