
Here's how to use Llama 3 as a beginner, and which services support it.


jh941213/LLaMA3_cookbook


LLaMA3_Cookbook πŸ¦™βœ¨


LLaMA 3 (Large Language Model by Meta AI) is a leading-edge large language model. 🌟 This repository 📁 provides the information you need to kick-start projects 🚀 built on LLaMA 3.

Official Website and Information 🌐

⚑️ Cloud API

API Calls Available πŸ”Œ

| Name | Description | Link |
|------|-------------|------|
| Groq | High-performance AI chip enabling Llama 3 inference and API calls | Groq 🌐 |
| AWS | Bedrock support for Llama on AWS; currently only Llama 2 is available | AWS 🌐 |
| Azure | 8B/70B models supported on Microsoft Azure, searchable via the Azure Marketplace | Azure 🌐 |
| GCP | Google Cloud Vertex AI support for Llama 3 | GCP 🌐 |
| together.ai | Supports Llama 2, CodeLlama, and Llama 3 8B/70B instances | together.ai 🌐 |
| replicate | Llama 3 API support (Node.js, Python, HTTP) | replicate 🌐 |
| llama AI | Supports Llama 3 8B/70B and other open LLMs | llama AI 🌐 |
| aimlapi | Exposes various open LLMs as APIs | AI/ML API 🌐 |
| Nvidia API | Multiple open LLM models available | Nvidia developer 🌐 |
| Meta AI (GitHub) | Connect to the Meta AI API | MetaAI 🌐 |
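Most of the hosted providers above expose an OpenAI-compatible chat-completions endpoint, so the request shape is the same across them. Below is a minimal sketch of building such a request; the endpoint URL and model identifier are illustrative placeholders, so check your chosen provider's docs for the real values.

```python
import json

# Illustrative endpoint/model names; verify against your provider's docs.
API_URL = "https://api.example.com/openai/v1/chat/completions"

def build_chat_request(prompt, model="llama3-70b-8192", temperature=0.7):
    """Build an OpenAI-style chat-completions payload for a Llama 3 model."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }

payload = build_chat_request("Explain Llama 3 in one sentence.")
body = json.dumps(payload)
# This body is POSTed to API_URL with an "Authorization: Bearer <key>" header,
# e.g. via requests.post(API_URL, headers=..., data=body).
```

The same payload works against any of the OpenAI-compatible services listed above; only the base URL, API key, and model name change.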

πŸ€– Inference 🧠

Inference Platforms πŸ–₯️

| Platform | Description | Link |
|----------|-------------|------|
| HuggingFace | Llama 3 8B model | Link 🌐 |
| HuggingFace | Llama 3 70B model | Link 🌐 |
| HuggingFace | Llama 3 8B Instruct model | Link 🌐 |
| HuggingFace | Llama 3 70B Instruct model | Link 🌐 |
| HuggingFace | Llama Guard 2 8B (policy model) | Link 🌐 |
| HuggingFace | Llama 3 70B - FP8 (FriendliAI) | Link 🌐 |
| HuggingFace | Llama 3 70B Instruct - FP8 (FriendliAI) | Link 🌐 |
| HuggingFace | Llama 3 8B - FP8 (FriendliAI) | Link 🌐 |
| HuggingFace | Llama 3 8B Instruct - FP8 (FriendliAI) | Link 🌐 |
| HuggingFace | Llama 3 8B KO (by beomi) | Link 🌐 |
| Ollama | Various lightweight Llama 3 models | Link 🌐 |
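For local inference, Ollama serves models over a small REST API on port 11434 once you have run `ollama pull llama3` and started the server. A sketch of building a request for its `/api/generate` endpoint (the server itself is not contacted here):

```python
import json

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

def build_ollama_request(prompt, model="llama3", stream=False):
    """Payload for Ollama's /api/generate endpoint.

    Requires a running local server: `ollama pull llama3`, then `ollama serve`.
    With stream=False the server returns one JSON object with a "response" field.
    """
    return {"model": model, "prompt": prompt, "stream": stream}

req = build_ollama_request("Why is the sky blue?")
# To actually send it (needs the Ollama server running):
#   import urllib.request
#   r = urllib.request.urlopen(urllib.request.Request(
#       OLLAMA_URL, data=json.dumps(req).encode(),
#       headers={"Content-Type": "application/json"}))
```

Ollama also exposes a chat-style `/api/chat` endpoint that takes a `messages` list instead of a single `prompt` string.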

HuggingFace Models πŸ₯

| Name | Description | Link |
|------|-------------|------|
| gradientai/Llama-3-8B-Instruct-Gradient-1048k | 1M-token long context | Link 🌐 |
| Trelis/Meta-Llama-3-70B-Instruct-function-calling | Function calling | Link 🌐 |
| Trelis/Meta-Llama-3-8B-Instruct-function-calling | Function calling | Link 🌐 |
| cognitivecomputations/dolphin-2.9-llama3-8b | Uncensored fine-tune | Link 🌐 |
| McGill-NLP/Llama-3-8B-Web | Zero-shot internet link selection capability | Link 🌐 |
| teddylee777/Llama-3-Open-Ko-8B-Instruct-preview-gguf | Korean quantized GGUF model for use with Ollama | Link 🌐 |
| beomi/Llama-3-Open-Ko-8B-Instruct-preview | Korean model trained with the chat-vector method | Link 🌐 |

πŸ’¬ Chat Interface (Related Information) πŸ’»

| Name | Link |
|------|------|
| HuggingChat | Link 🌐 |
| Groq | Link 🌐 |
| together.ai | Link 🌐 |
| replicate Llama chat (local) | Link 🌐 |
| perplexity.ai (lightweight model) | Link 🌐 |
| openrouter.ai | Link 🌐 |
| MetaAI (not available in Korea) | Link 🌐 |
| Morphic (multimodal offerings) | Link 🌐 |
| Nvidia AI | Link 🌐 |

LLaMA Framework πŸ“˜

| Name | Type | Link |
|------|------|------|
| LangChain | RAG | Link 🌐 |
| LlamaIndex | RAG | Link 🌐 |
| llama.cpp | Conversion | Link 🌐 |
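The RAG pattern that LangChain and LlamaIndex automate is: embed the documents, retrieve the ones most similar to the query, and paste them into the prompt as context. A dependency-free sketch of that loop, using a toy bag-of-words "embedding" where a real pipeline would use a neural embedding model:

```python
from collections import Counter
import math

DOCS = [
    "Llama 3 comes in 8B and 70B parameter sizes.",
    "Ollama runs quantized Llama models locally.",
    "RAG retrieves documents and feeds them to the LLM as context.",
]

def embed(text):
    """Toy bag-of-words 'embedding'; real RAG uses a neural embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

context = retrieve("what sizes does llama 3 come in?")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: What sizes are available?"
```

The frameworks above add the missing production pieces: document loaders, chunking, a vector store instead of a linear scan, and the final LLM call.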

πŸ› οΈ Fine-tuning πŸ”§

| Name | Link |
|------|------|
| Meta | Link 🌐 |
| torchtune | Link 🌐 |
| LLaMA-Factory | Link 🌐 |
| axolotl | Link 🌐 |
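All of these tools lean heavily on LoRA: the pretrained weight matrix W stays frozen, and only a low-rank update B·A (rank r, scaled by alpha/r) is trained, which cuts trainable parameters dramatically. A numerical sketch of the idea with toy dimensions, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8                       # hidden size and LoRA rank (toy numbers)

W = rng.normal(size=(d, d))         # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection; starts at zero,
alpha = 16                          # so training begins exactly at W

def lora_forward(x):
    """Adapted layer: frozen W plus the scaled low-rank update B @ A."""
    return x @ (W + (alpha / r) * (B @ A)).T

full_params = d * d        # 262,144 weights in the full matrix
lora_params = 2 * d * r    # 8,192 trainable weights, ~3% of the full matrix
```

For a real 8B model the same ratio holds per adapted matrix, which is why QLoRA (LoRA on top of a 4-bit-quantized base model) fits fine-tuning into consumer GPU memory.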

LLAMA3_Cookbook πŸ‘©β€πŸ³

| Information | Link |
|-------------|------|
| Prompt Engineering Guide | Link 🌐 |
| Using Llama 3 with a web UI | Link 🌐 |
| API with Ollama, LangChain, and ChromaDB, with a Flask API and PDF upload | Link 🌐 |
| Guide to tuning and inference with Llama on a MacBook | Link 🌐 |
| Fine-tune Llama 3 with ORPO | Link 🌐 |
| QLoRA Alpaca Llama 3 fine-tune | Link 🌐 |
| Fully local RAG agents with Llama 3 | Link 🌐 |
| RAG chatbot with Llama 3 (HF) | Link 🌐 |
| LlamaIndex RAG with Llama 3 | Link 🌐 |
| Ollama RAG + UI (Gradio) | Link 🌐 |
| LangGraph + Llama 3 | Link 🌐 |
| RAG (re-ranking) | Link 🌐 |
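Several of the guides above build prompts by hand, and the Llama 3 Instruct models expect a specific chat format with header and end-of-turn tokens. A sketch of rendering that format in plain Python (with Hugging Face models, `tokenizer.apply_chat_template` produces this for you, so treat this as illustrative):

```python
def llama3_prompt(messages):
    """Render chat messages in the Llama 3 Instruct prompt format.

    Each turn is wrapped in <|start_header_id|>role<|end_header_id|> headers
    and terminated with <|eot_id|>; a trailing assistant header cues the
    model to generate its reply.
    """
    out = "<|begin_of_text|>"
    for m in messages:
        out += (f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
                f"{m['content']}<|eot_id|>")
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

p = llama3_prompt([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Hi!"},
])
```

Getting this template wrong (e.g. reusing the Llama 2 `[INST]` format) is a common cause of degraded output from Llama 3 Instruct checkpoints.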

LLM Dataset πŸ—‚οΈ

| Information | Link |
|-------------|------|
| HuggingFaceFW/fineweb | Link 🌐 |
| mlabonne/orpo-dpo-mix-40k | Link 🌐 |

LLM skills πŸ“Œ

| Information | Link |
|-------------|------|
| FSDP + QLoRA fine-tuning | Link 🌐 |

Mac vs 4090 Comparison πŸ–₯οΈπŸ†šπŸ–₯️

| Category | M3 Max | M1 Pro | RTX 4090 |
|----------|--------|--------|----------|
| CPU | 16 cores | 10 cores (8 performance + 2 efficiency) | 16-core AMD |
| Memory | 128GB | 16GB / 32GB | 32GB system RAM |
| GPU | 40-core GPU, 400GB/s memory bandwidth | 16-core GPU, 200GB/s memory bandwidth | 24GB VRAM |
| 7B model | Performs well | Performs well | Performs well; similar to the M3 Max |
| 13B model | Good performance | Third-best performance | Best performance |
| 70B model | Runs quickly, using the 128GB of memory | Runs out of memory at 16GB; prone to crashes and reboots | Cannot fit on the GPU; very slow on CPU |
| Quantization | Unnecessary with sufficient memory | Should be considered | Necessary compromise |
| Power consumption | 65W | | 250-300W |
| Value for money | Excellent ($4,600) | | Relatively low ($6,000 with an A6000 GPU) |
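The memory-bandwidth numbers in the table explain the 70B results: single-stream decoding is bandwidth-bound, since every generated token must stream the entire model's weights through memory once. A back-of-the-envelope upper bound on decode speed (a rough rule of thumb, ignoring KV-cache traffic and compute):

```python
def est_tokens_per_sec(params_billions, bits_per_weight, bandwidth_gb_s):
    """Rough decode-speed ceiling: tok/s <= memory bandwidth / model size,
    because each token streams all weights through memory once."""
    model_gb = params_billions * bits_per_weight / 8  # 70B at 4-bit -> 35 GB
    return bandwidth_gb_s / model_gb

m3_max = est_tokens_per_sec(70, 4, 400)  # ~11 tok/s ceiling at 400GB/s
m1_pro = est_tokens_per_sec(70, 4, 200)  # ~5.7 tok/s ceiling, if it fit in RAM
```

The same arithmetic shows why the RTX 4090 falls over on 70B: even at 4-bit, ~35GB of weights cannot fit in 24GB of VRAM, so layers spill to the CPU and the fast GPU bandwidth no longer applies.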

πŸ™Œ Contributing πŸ’–

Would you like to contribute to this repository? Feel free to open an issue or send a pull request. All types of contributions are welcome!

πŸ“© Contact Us πŸ’Œ

Need more information or wish to collaborate? Click here to send me a message. Let's share knowledge together!
