- `data_parallel_pytorch_code_changes` points out the basic code changes needed to adapt a PyTorch script to the SageMaker distributed training library's data parallelism (sketch below).
- `data_parallel_tensorflow_code_changes` points out the basic code changes needed to adapt a TensorFlow script to the SageMaker distributed training library's data parallelism (sketch below).
- `huggingface_data_parallelism_checkpointing` shows how to leverage SageMaker checkpointing to save transformer checkpoints during SageMaker distributed training (data parallelism) and resume training from those checkpoints (sketch below).
- `huggingface_data_parallelism_incremental_training` shows how to run SageMaker distributed training (data parallelism) with the Hugging Face framework and how to do incremental training from a saved model artifact (sketch below).
- `byos_pytorch` takes the PyTorch framework as an example of how to bring your own script to train and deploy a model on SageMaker (sketch below).
- `byoc_pytorch` shows how to extend an AWS pre-built deep learning container (PyTorch as the example) to build your own container and bring it to SageMaker for model training (sketch below).
- `xgboost_builtin_distributed` shows distributed training with the SageMaker built-in XGBoost algorithm, using SageMaker automatic model tuning to tune model hyperparameters (sketch below).
- `xgboost_script_mode_distributed` shows how to leverage the pre-built XGBoost framework container to train an XGBoost model in a distributed fashion (sketch below).
- `xgboost_pyspark` shows how to use the SageMaker pre-built Spark container to train an XGBoost model (sketch below). Note: this notebook was tested on a SageMaker classic notebook instance.
- `feature_store` shows an example use case of the Offline Feature Store SDK and how to create a dataset from the offline store (sketch below).
- `spark_distributed_data_processing` shows an example use case of distributed data processing using Apache Spark and SageMaker Processing (sketch below).
- `sagemaker_pipelines` shows an example use case of SageMaker Pipelines, including Processing, Training, Evaluation, Condition, and Model Registry steps (sketch below).
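
The sketches below mirror the list order. They are minimal, illustrative snippets rather than the exact notebook code; all role ARNs, bucket names, framework versions, and hyperparameters are placeholders.

For `data_parallel_pytorch_code_changes`: a minimal sketch of the PyTorch-side changes, assuming the script runs in a SageMaker training job launched with the SMDataParallel (`smddp`) distribution enabled.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Importing this module registers "smddp" as a torch.distributed backend.
import smdistributed.dataparallel.torch.torch_smddp  # noqa: F401

dist.init_process_group(backend="smddp")

# The SageMaker launcher exports LOCAL_RANK for each worker process.
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Placeholder model; wrap it in standard PyTorch DDP as usual.
model = nn.Linear(10, 2).to(local_rank)
model = DDP(model, device_ids=[local_rank])
# The rest of the training loop is unchanged from single-GPU PyTorch.
```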
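
For `data_parallel_tensorflow_code_changes`: a sketch of the TensorFlow-side changes, using the Horovod-style `smdistributed.dataparallel.tensorflow` API.

```python
import tensorflow as tf
import smdistributed.dataparallel.tensorflow as sdp

sdp.init()

# Pin each worker process to a single GPU by local rank.
gpus = tf.config.experimental.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_visible_devices(gpus[sdp.local_rank()], "GPU")

# Scale the learning rate by the number of workers.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001 * sdp.size())
loss_fn = tf.keras.losses.BinaryCrossentropy()


@tf.function
def training_step(model, x, y, first_batch):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    # Wrap the tape so gradients are all-reduced across workers.
    tape = sdp.DistributedGradientTape(tape)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    if first_batch:
        # Sync initial model and optimizer state from rank 0.
        sdp.broadcast_variables(model.variables, root_rank=0)
        sdp.broadcast_variables(optimizer.variables(), root_rank=0)
    return loss
```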
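
For `huggingface_data_parallelism_checkpointing`: a sketch of the estimator configuration, assuming a `train.py` that writes checkpoints to `/opt/ml/checkpoints`; SageMaker keeps that path synced with the given S3 prefix so training can resume from it.

```python
from sagemaker.huggingface import HuggingFace

huggingface_estimator = HuggingFace(
    entry_point="train.py",
    source_dir="./scripts",
    role="<your-sagemaker-execution-role>",
    instance_type="ml.p3.16xlarge",
    instance_count=2,
    transformers_version="4.26",   # versions are illustrative
    pytorch_version="1.13",
    py_version="py39",
    # Enable the SageMaker data parallelism library.
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
    # SageMaker continuously syncs this local path to the S3 prefix, so
    # checkpoints survive interruptions and can be reloaded on restart.
    checkpoint_s3_uri="s3://<your-bucket>/checkpoints",
    checkpoint_local_path="/opt/ml/checkpoints",
)
huggingface_estimator.fit({"train": "s3://<your-bucket>/data/train"})
```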
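
For `huggingface_data_parallelism_incremental_training`: one common pattern (an assumption here, not necessarily the notebook's exact mechanism) is to pass a prior job's model artifact as an extra input channel; the training script then loads weights from the channel's mount point instead of starting fresh.

```python
# Reusing the estimator configured above. The "pretrained" channel is
# mounted at /opt/ml/input/data/pretrained inside the container, where the
# training script can unpack model.tar.gz and continue training from it.
huggingface_estimator.fit({
    "train": "s3://<your-bucket>/data/train",
    "pretrained": "s3://<your-bucket>/prior-job/output/model.tar.gz",
})
```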
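
For `byos_pytorch`: a sketch of script mode, where `train.py` is your own training script and SageMaker supplies the framework container.

```python
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",          # your own training script
    source_dir="./code",
    role="<your-sagemaker-execution-role>",
    framework_version="1.13",        # illustrative version
    py_version="py39",
    instance_type="ml.g4dn.xlarge",
    instance_count=1,
)
estimator.fit({"train": "s3://<your-bucket>/data/train"})

# Deploy the trained model behind a real-time endpoint.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)
```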
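
For `byoc_pytorch`: once you have extended an AWS deep learning container and pushed it to Amazon ECR, the generic `Estimator` can train with it; the image URI below is a placeholder.

```python
from sagemaker.estimator import Estimator

custom_image = "<account-id>.dkr.ecr.<region>.amazonaws.com/my-pytorch:latest"

estimator = Estimator(
    image_uri=custom_image,
    role="<your-sagemaker-execution-role>",
    instance_type="ml.g4dn.xlarge",
    instance_count=1,
)
estimator.fit({"train": "s3://<your-bucket>/data/train"})
```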
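
For `xgboost_builtin_distributed`: a sketch combining the built-in XGBoost algorithm (two instances for distributed training) with automatic model tuning; the metric and hyperparameter ranges are illustrative.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

session = sagemaker.Session()
container = sagemaker.image_uris.retrieve(
    "xgboost", session.boto_region_name, version="1.5-1"
)

xgb = Estimator(
    image_uri=container,
    role="<your-sagemaker-execution-role>",
    instance_count=2,                      # two instances -> distributed training
    instance_type="ml.m5.2xlarge",
    output_path="s3://<your-bucket>/xgb/output",
)
xgb.set_hyperparameters(objective="binary:logistic", eval_metric="auc", num_round=100)

tuner = HyperparameterTuner(
    xgb,
    objective_metric_name="validation:auc",
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
    },
    max_jobs=10,
    max_parallel_jobs=2,
)
tuner.fit({
    "train": TrainingInput("s3://<your-bucket>/xgb/train", content_type="text/csv"),
    "validation": TrainingInput("s3://<your-bucket>/xgb/validation", content_type="text/csv"),
})
```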
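
For `xgboost_script_mode_distributed`: a sketch using the XGBoost framework estimator, where `train.py` is your own script built on the open-source `xgboost` library and `instance_count > 1` gives distributed training.

```python
from sagemaker.xgboost import XGBoost

xgb_script = XGBoost(
    entry_point="train.py",
    framework_version="1.5-1",         # illustrative version
    role="<your-sagemaker-execution-role>",
    instance_count=3,                  # multiple instances -> distributed training
    instance_type="ml.m5.2xlarge",
    hyperparameters={"num_round": 100, "max_depth": 5},
)
xgb_script.fit({"train": "s3://<your-bucket>/xgb/train"})
```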
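
For `xgboost_pyspark`: a rough sketch of the `sagemaker_pyspark` flow; the parameter names follow that library's documented estimators, but treat the details as assumptions and defer to the notebook.

```python
from pyspark.sql import SparkSession

import sagemaker_pyspark
from sagemaker_pyspark import IAMRole
from sagemaker_pyspark.algorithms import XGBoostSageMakerEstimator

# Put the SageMaker Spark JARs on the driver classpath.
jars = ":".join(sagemaker_pyspark.classpath_jars())
spark = (
    SparkSession.builder
    .config("spark.driver.extraClassPath", jars)
    .getOrCreate()
)

estimator = XGBoostSageMakerEstimator(
    sagemakerRole=IAMRole("<your-sagemaker-execution-role>"),
    trainingInstanceType="ml.m5.2xlarge",
    trainingInstanceCount=1,
    endpointInstanceType="ml.m5.xlarge",
    endpointInitialInstanceCount=1,
)
estimator.setObjective("binary:logistic")
estimator.setNumRound(50)

# `train_df` is assumed to be a Spark DataFrame with "label" and "features"
# columns; fit() launches a SageMaker training job behind the scenes.
# model = estimator.fit(train_df)
```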
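
For `feature_store`: a sketch of building a dataset from the offline store via the SDK's Athena integration, assuming the feature group already exists and has ingested records.

```python
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
feature_group = FeatureGroup(
    name="<your-feature-group>", sagemaker_session=session
)

# The offline store is queryable through Athena; run a query and pull the
# result into a pandas DataFrame to use as a training dataset.
query = feature_group.athena_query()
query.run(
    query_string=f'SELECT * FROM "{query.table_name}"',
    output_location="s3://<your-bucket>/athena-results/",
)
query.wait()
dataset = query.as_dataframe()
```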
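
For `spark_distributed_data_processing`: a sketch of `PySparkProcessor`, which runs a Spark application (`preprocess.py` here, a placeholder) across a managed cluster of Processing instances.

```python
from sagemaker.spark.processing import PySparkProcessor

processor = PySparkProcessor(
    base_job_name="spark-preprocess",
    framework_version="3.1",           # Spark version; illustrative
    role="<your-sagemaker-execution-role>",
    instance_count=2,                  # size of the Spark cluster
    instance_type="ml.m5.xlarge",
)
processor.run(
    submit_app="./code/preprocess.py",
    arguments=[
        "--input", "s3://<your-bucket>/raw",
        "--output", "s3://<your-bucket>/processed",
    ],
)
```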
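
For `sagemaker_pipelines`: a pared-down sketch wiring a Processing step into a Training step; the Evaluation, Condition, and Model Registry steps from the notebook are omitted for brevity, and all scripts and URIs are placeholders.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.processing import ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

role = "<your-sagemaker-execution-role>"
session = sagemaker.Session()

# Step 1: preprocess data with a managed scikit-learn container.
processor = SKLearnProcessor(
    framework_version="1.0-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
)
step_process = ProcessingStep(
    name="Preprocess",
    processor=processor,
    code="./code/preprocess.py",
    outputs=[ProcessingOutput(output_name="train", source="/opt/ml/processing/train")],
)

# Step 2: train on the processing step's output.
image = sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.5-1")
estimator = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<your-bucket>/pipeline/output",
)
step_train = TrainingStep(
    name="Train",
    estimator=estimator,
    inputs={
        "train": TrainingInput(
            step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
            content_type="text/csv",
        )
    },
)

pipeline = Pipeline(name="example-pipeline", steps=[step_process, step_train])
pipeline.upsert(role_arn=role)
execution = pipeline.start()
```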