LLaMA3 (Large Language Model by Meta AI) is Meta's leading-edge open large language model. This repository is intended to provide the information needed to kick-start projects using LLaMA3.

- Official Website
- Access Request
- Meta Llama Model Card
- Kaggle Meta
- Meta GitHub

Name | Description | Link |
---|---|---|
Groq | High-performance AI chip (LPU) enabling LLaMA3 inference and API calls | Groq |
AWS | Bedrock support for Llama on AWS; currently only Llama2 is available | AWS |
Azure | Support for the 8B/70B models on Microsoft Azure, searchable via the Azure Marketplace | Azure |
GCP | Google Cloud Vertex AI support for LLaMA3 | GCP |
together.ai | Support for Llama2, CodeLlama, and Llama3 8B/70B instances | together.ai |
replicate | Llama3 API support (Node.js, Python, HTTP) | replicate |
llama AI | Support for Llama3 8B/70B, plus other open LLMs | llama AI |
aimlapi | Supports various open LLMs as APIs | AI/ML API |
Nvidia API | Multiple open LLM models available via the Nvidia developer program | Nvidia |
Meta AI (GitHub) | Connect to the Meta AI API | MetaAI |
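
Most of these services can be called with a few lines of Python. As one concrete illustration, here is a minimal sketch using the replicate Python client (this assumes `pip install replicate`, a `REPLICATE_API_TOKEN` environment variable, and that the `meta/meta-llama-3-8b-instruct` model slug on Replicate is still current):

```python
import replicate

# Stream a completion from the hosted Llama 3 8B Instruct model.
output = replicate.run(
    "meta/meta-llama-3-8b-instruct",  # public Replicate model slug (may change)
    input={"prompt": "Explain LLaMA3 in one sentence.", "max_tokens": 64},
)
print("".join(output))  # replicate.run yields the generated text in chunks
```

The other providers follow a similar pattern through their own SDKs or an OpenAI-compatible endpoint; check each link for the exact interface.
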
Platform Name | Description | Link |
---|---|---|
HuggingFace | Llama 8B model | Link |
HuggingFace | Llama 70B model | Link |
HuggingFace | Llama 8B Instruct model | Link |
HuggingFace | Llama 70B Instruct model | Link |
HuggingFace | Llama Guard 2 8B (safety policy model) | Link |
HuggingFace | Llama 3 70B FP8 (FriendliAI) | Link |
HuggingFace | Llama 3 70B Instruct FP8 (FriendliAI) | Link |
HuggingFace | Llama 3 8B FP8 (FriendliAI) | Link |
HuggingFace | Llama 3 8B Instruct FP8 (FriendliAI) | Link |
HuggingFace | Llama 8B KO (by beomi) | Link |
Ollama | Support for various lightweight (quantized) Llama3 models | Link |
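
The official Hugging Face checkpoints are gated, so the usual workflow is to request access on the model card, authenticate with `huggingface-cli login`, and then load the weights with transformers. A minimal sketch for the 8B Instruct model (assumes access has been granted and a GPU with enough memory for bf16 weights):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build the chat prompt with the model's own template, then generate.
messages = [{"role": "user", "content": "Say hello in Korean."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
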
Name | Description | Link |
---|---|---|
gradientai/Llama-3-8B-Instruct-Gradient-1048k | 1M-token long context | Link |
Trelis/Meta-Llama-3-70B-Instruct-function-calling | Function calling | Link |
Trelis/Meta-Llama-3-8B-Instruct-function-calling | Function calling | Link |
cognitivecomputations/dolphin-2.9-llama3-8b | Uncensored fine-tune | Link |
McGill-NLP/Llama-3-8B-Web | Zero-shot internet link selection capability | Link |
teddylee777/Llama-3-Open-Ko-8B-Instruct-preview-gguf | Korean quantized GGUF model for Ollama use | Link |
beomi/Llama-3-Open-Ko-8B-Instruct-preview | Korean model trained with the chat vector method | Link |
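
Quantized GGUF checkpoints such as teddylee777's Korean model above can also be run directly with llama-cpp-python, without a GPU server. A minimal sketch (assumes `pip install llama-cpp-python`; the file name is a hypothetical placeholder for whichever GGUF file you downloaded):

```python
from llama_cpp import Llama

# Load a local quantized checkpoint; path is a placeholder.
llm = Llama(
    model_path="./Llama-3-Open-Ko-8B-Instruct-preview-Q4_K_M.gguf",
    n_ctx=4096,  # context window size
)
out = llm("What is the capital of Korea?", max_tokens=64)
print(out["choices"][0]["text"])
```
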
Name | Link |
---|---|
HuggingChat | Link |
Groq | Link |
together.ai | Link |
replicate Llama chat (local) | Link |
perplexity.ai (lightweight model) | Link |
openrouter.ai | Link |
MetaAI (not available in Korea) | Link |
Morphic (multimodal offerings) | Link |
Nvidia AI | Link |

Name | Type | Link |
---|---|---|
LangChain | RAG | Link |
LlamaIndex | RAG | Link |
llama.cpp | Convert/quantize (GGUF) | Link |
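
As a taste of what the RAG frameworks above abstract away, here is a minimal LlamaIndex sketch that indexes a local folder and queries it (assumes `pip install llama-index`; by default LlamaIndex uses OpenAI for embeddings and generation, so either set an `OPENAI_API_KEY` or swap in a local Llama3 via the llama-index-llms-ollama integration):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load every file under ./docs, embed it, and build an in-memory index.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Retrieval + generation in one call.
response = index.as_query_engine().query("What is Llama 3?")
print(response)
```
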
Name | Link |
---|---|
Meta | Link |
torchtune | Link |
LLaMAFactory | Link |
axolotl | Link |
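
Whichever tool you pick, parameter-efficient fine-tuning usually boils down to attaching LoRA adapters to the attention projections and training only those. A framework-agnostic sketch with Hugging Face PEFT (a generic illustration, not the config format of any tool listed above; the target module names follow Llama's attention projection naming):

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of weights are trainable
```
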
Guide | Link |
---|---|
Prompt Engineering Guide | Link |
Using Llama3 with a web UI | Link |
API with Ollama, LangChain, and ChromaDB, with a Flask API and PDF upload | Link |
Guide to tuning and inference with Llama on a MacBook | Link |
Fine-tune Llama 3 with ORPO | Link |
QLoRA Alpaca Llama3 fine-tune | Link |
Fully local RAG agents with Llama3 | Link |
RAG chatbot with Llama3 (HF) | Link |
LlamaIndex RAG with Llama3 | Link |
Ollama RAG + UI (Gradio) | Link |
LangGraph + Llama3 | Link |
RAG (re-ranking) | Link |
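
Several of the guides above (Ollama + LangChain + ChromaDB, fully local RAG agents) follow the same basic loop: embed documents, retrieve the nearest ones, and place them in the prompt. A minimal, framework-free sketch of that loop using the chromadb and ollama Python packages (assumes both are installed, an Ollama server with the `llama3` model is running locally, and the documents are placeholders):

```python
import chromadb
import ollama

# Index a few toy documents with Chroma's default embedding function.
client = chromadb.Client()
docs = client.create_collection("docs")
docs.add(
    ids=["1", "2"],
    documents=[
        "LLaMA3 was released by Meta in 8B and 70B sizes.",
        "GGUF is a quantized file format used by llama.cpp and Ollama.",
    ],
)

# Retrieve the most relevant document and hand it to Llama 3 as context.
question = "What sizes does LLaMA3 come in?"
hits = docs.query(query_texts=[question], n_results=1)
context = hits["documents"][0][0]
reply = ollama.chat(
    model="llama3",
    messages=[{"role": "user",
               "content": f"Context: {context}\n\nQuestion: {question}"}],
)
print(reply["message"]["content"])
```
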
Dataset | Link |
---|---|
HuggingFaceFW/fineweb | Link |
mlabonne/orpo-dpo-mix-40k | Link |
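
Both datasets live on the Hugging Face Hub and can be pulled with the datasets library. A minimal sketch (fineweb is multi-terabyte, so it is streamed here; the `sample-10BT` subset name is taken from the dataset card and may change):

```python
from datasets import load_dataset

# Stream the small fineweb sample rather than downloading the full corpus.
fineweb = load_dataset(
    "HuggingFaceFW/fineweb", name="sample-10BT", split="train", streaming=True
)
print(next(iter(fineweb))["text"][:200])

# The ORPO preference mix is small enough to load eagerly.
orpo = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")
print(orpo[0].keys())
```
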
Information | Link |
---|---|
FSDP + QLoRA fine-tuning | Link |
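
QLoRA's memory savings come from loading the frozen base weights in 4-bit NF4 while training LoRA adapters on top; FSDP then shards what remains across GPUs. A minimal sketch of just the 4-bit loading step with transformers + bitsandbytes (a sketch of the general technique, not the linked guide's exact code):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 quantization, as used in QLoRA
    bnb_4bit_use_double_quant=True,      # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
```
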
Category | M3 Max | M1 Pro | RTX 4090 |
---|---|---|---|
CPU Cores | 16 cores | 10 cores | 16-core AMD |
Memory | 128GB | 16GB/32GB | 32GB |
GPU & Memory Bandwidth | 40-core GPU, 400GB/s unified memory bandwidth | 16-core GPU (CPU: 8 performance + 2 efficiency cores), 200GB/s unified memory bandwidth | 24GB VRAM |
Model 7B | Performs well on all three machines | Performs well on all three machines | Performs well; similar performance to the M3 Max |
Model 13B | Good performance | Third-best performance | Best performance |
Model 70B | Runs quickly, making use of the 128GB of memory | Runs out of memory at 16GB; prone to crashes and reboots | Cannot fit on the GPU; very slow on CPU |
Lightweighting (quantization) | Not necessary with sufficient memory | Should be considered | Quantization compromises are necessary |
Power Consumption | 65W | | 250-300W |
Value for Money | Excellent ($4,600) | | Relatively low ($6,000 for an A6000 GPU) |
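
The 70B rows follow directly from a back-of-the-envelope rule: the weights alone need roughly (parameter count x bits per weight / 8) bytes, before activations and KV cache. A quick sketch of that arithmetic:

```python
def approx_weight_gb(params_billion: float, bits: int) -> float:
    """Rough lower bound: weight memory only, no activations or KV cache."""
    return params_billion * bits / 8

for bits in (16, 8, 4):
    print(f"70B @ {bits}-bit ~ {approx_weight_gb(70, bits):.0f} GB")
# 16-bit: ~140 GB, 8-bit: ~70 GB, 4-bit: ~35 GB.
# Even the 4-bit footprint exceeds a 24GB RTX 4090, which is why the 70B
# model falls back to slow CPU inference there, while 128GB of unified
# memory on the M3 Max accommodates an 8-bit 70B comfortably.
```
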
Would you like to contribute to this repository? Feel free to open an Issue or send a Pull Request. All kinds of contributions are welcome!
Need more information, or want to collaborate? Click here to send me a message. Let's share knowledge together!