End-to-End Deployment of LLaMA-3 Model on SageMaker

This notebook walks through deploying the LLaMA-3 language model as an inference service on Amazon SageMaker, end to end. The main steps are:

1. Install the required libraries and download the pre-trained LLaMA-3 model from Hugging Face to the local environment.
2. Upload the downloaded model to Amazon S3 so that it can be used for deployment.
3. Prepare the deployment configuration: the inference entry point script, the container image, and the service parameters.
4. Use SageMaker managed inference to create a model and an endpoint that serves online inference requests.
5. Test the deployed endpoint with a text generation task, which also shows how to invoke it.
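
Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the model ID, local directory, bucket, and prefix are placeholder assumptions, and the helper names (`download_model`, `plan_uploads`, `upload_model`) are hypothetical. It assumes `huggingface_hub` and `boto3` are installed; both are imported lazily so the path-mapping helper can be tried without them.

```python
from __future__ import annotations

from pathlib import Path

# Assumptions for illustration only; replace with your own values.
MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
LOCAL_DIR = Path("llama3-model")


def download_model(model_id: str, local_dir: Path, token: str) -> Path:
    """Download every file of a Hugging Face repo into local_dir."""
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    snapshot_download(repo_id=model_id, local_dir=str(local_dir), token=token)
    return local_dir


def plan_uploads(local_dir: Path, s3_prefix: str) -> list[tuple[str, str]]:
    """Map each file under local_dir to the S3 key it would be uploaded to.

    Hidden files and directories (such as a download cache) are skipped.
    """
    prefix = s3_prefix.strip("/")
    uploads = []
    for path in sorted(local_dir.rglob("*")):
        relative = path.relative_to(local_dir)
        hidden = any(part.startswith(".") for part in relative.parts)
        if path.is_file() and not hidden:
            key = f"{prefix}/{relative.as_posix()}" if prefix else relative.as_posix()
            uploads.append((str(path), key))
    return uploads


def upload_model(local_dir: Path, bucket: str, s3_prefix: str) -> str:
    """Upload the model files to S3 and return the resulting S3 URI prefix."""
    import boto3  # available by default on SageMaker notebook instances

    s3 = boto3.client("s3")
    for filename, key in plan_uploads(local_dir, s3_prefix):
        s3.upload_file(filename, bucket, key)
    return f"s3://{bucket}/{s3_prefix.strip('/')}/"
```

A typical call sequence would be `download_model(MODEL_ID, LOCAL_DIR, token)` followed by `upload_model(LOCAL_DIR, "your-bucket", "models/llama3")`, where the bucket name and prefix are your own.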

Environment Requirements

- Run this notebook on a SageMaker notebook instance.
- The notebook instance must have access to S3, where the model is stored.
- A Hugging Face API token is required to download the model.
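
Before running the notebook, the requirements above can be checked with a small preflight script. The sketch below is an assumption about how one might do this, not part of the notebook: it reads the token from the `HF_TOKEN` environment variable (the variable `huggingface_hub` itself looks for) and probes the bucket with the S3 `head_bucket` call. The function names are hypothetical.

```python
import os


def read_hf_token(env=os.environ) -> str:
    """Return the Hugging Face token from the environment, or fail loudly."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set HF_TOKEN to a Hugging Face API token first.")
    return token


def can_access_bucket(bucket: str) -> bool:
    """Return True if the current IAM role can reach the given S3 bucket."""
    import boto3
    from botocore.exceptions import ClientError

    try:
        boto3.client("s3").head_bucket(Bucket=bucket)
        return True
    except ClientError:
        return False
```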

Usage Instructions

- Open this notebook on a SageMaker notebook instance.
- Run the cells in order, replacing the S3 bucket paths with your own.
- When deploying the model, adjust the instance type and instance count to suit your needs.
- The model testing section demonstrates two invocation methods, streaming and non-streaming. Use whichever fits your application.
- The last code cell contains sample code for resource cleanup. Run it with caution, because it deletes the deployed resources.
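
The deployment, invocation, and cleanup steps described above can be sketched with plain `boto3` calls. This is an illustrative outline under stated assumptions, not the notebook's code: the helper names are hypothetical, the default instance type is only an example, the container image URI must be supplied by you, and passing the model location through the `HF_MODEL_ID` environment variable is an assumption about how the LMI container is configured. The request payload shape (`inputs` plus `parameters`) is likewise assumed.

```python
import json


def build_deployment_requests(name, image_uri, model_s3_uri, role_arn,
                              instance_type="ml.g5.12xlarge", instance_count=1):
    """Build kwargs for create_model, create_endpoint_config, create_endpoint."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            # Assumption: the container reads the model location from HF_MODEL_ID.
            "Environment": {"HF_MODEL_ID": model_s3_uri},
        },
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,            # adjust to your needs
            "InitialInstanceCount": instance_count,   # adjust to your needs
        }],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def deploy(requests):
    """Create the model, endpoint config, and endpoint, then wait until ready."""
    import boto3

    model, endpoint_config, endpoint = requests
    sm = boto3.client("sagemaker")
    sm.create_model(**model)
    sm.create_endpoint_config(**endpoint_config)
    sm.create_endpoint(**endpoint)
    sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint["EndpointName"])


def build_payload(prompt, max_new_tokens=128):
    """Serialize a text generation request (assumed payload shape)."""
    return json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}})


def invoke(endpoint_name, prompt):
    """Non-streaming call: returns the full response body as text."""
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    return response["Body"].read().decode("utf-8")


def invoke_streaming(endpoint_name, prompt):
    """Streaming call: yields response chunks as they arrive."""
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_with_response_stream(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    for event in response["Body"]:
        chunk = event.get("PayloadPart", {}).get("Bytes")
        if chunk:
            # A chunk may end mid-character; undecodable bytes are replaced.
            yield chunk.decode("utf-8", errors="replace")


def cleanup(name):
    """Delete the endpoint, its config, and the model. This is irreversible."""
    import boto3

    sm = boto3.client("sagemaker")
    sm.delete_endpoint(EndpointName=name)
    sm.delete_endpoint_config(EndpointConfigName=name)
    sm.delete_model(ModelName=name)
```

Keeping the request construction separate from the API calls makes it easy to inspect or change the instance type and count before anything is created. Whether the streaming call returns incremental output depends on how the container is configured.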

By following this notebook, you will learn how to deploy large language models with SageMaker managed inference and serve them as online endpoints. In production, you can tune the deployment further, for example by changing the instance type and count, to meet your latency and throughput requirements.
