
SageMaker Distributed Training Step-by-Step

This repository provides hands-on labs on PyTorch-based Distributed Training and SageMaker Distributed Training. It is written to make it easy for beginners to get started, guiding you through step-by-step code modifications based on basic BERT use cases.
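The labs build on the standard PyTorch DistributedDataParallel (DDP) workflow. For orientation, below is a minimal, self-contained sketch of DDP fine-tuning of a Hugging Face BERT classifier on synthetic data, launched with torchrun. The model name, hyperparameters, and synthetic dataset are illustrative assumptions, not the repository's actual lab code.

```python
# Minimal DDP sketch (illustrative, not the lab code). Launch with, e.g.:
#   torchrun --nproc_per_node=4 train_ddp.py        # train_ddp.py is a hypothetical file name
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset
from transformers import AutoModelForSequenceClassification

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    ).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # Synthetic data stands in for a real tokenized corpus.
    input_ids = torch.randint(0, 30522, (512, 128))
    attention_mask = torch.ones_like(input_ids)
    labels = torch.randint(0, 2, (512,))
    dataset = TensorDataset(input_ids, attention_mask, labels)

    # DistributedSampler shards the data so each rank sees a unique slice.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    for epoch in range(3):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for ids, mask, y in loader:
            out = model(
                input_ids=ids.cuda(local_rank),
                attention_mask=mask.cuda(local_rank),
                labels=y.cuda(local_rank),
            )
            out.loss.backward()   # DDP averages gradients across ranks here
            optimizer.step()
            optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```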

Contents

Requirements

Before starting, make sure you have met the following requirements:

  • ml.g4dn.12xlarge for the notebook instance (not required if you are already familiar with distributed training)
  • ml.g4dn.12xlarge and ml.p3.16xlarge for the SageMaker training instances (see the launch sketch below)
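
On the SageMaker side, training jobs on the instance types above are typically launched through the SageMaker Python SDK. The sketch below is an illustration under stated assumptions: the entry point, source directory, framework version, and S3 path are placeholders, not the repository's actual launcher. It enables the SageMaker data parallel library, which supports ml.p3.16xlarge.

```python
# Illustrative SageMaker launch sketch (placeholders throughout, not the lab code).
import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()

estimator = PyTorch(
    entry_point="train.py",          # hypothetical training script
    source_dir="scripts",            # hypothetical source directory
    role=role,
    framework_version="1.13",        # example version; pick one your account supports
    py_version="py39",
    instance_type="ml.p3.16xlarge",  # one of the instance types listed above
    instance_count=1,
    # Enable the SageMaker data parallel library for multi-GPU training.
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)

# Hypothetical S3 input; replace with your own dataset location.
estimator.fit({"training": "s3://your-bucket/your-dataset"})
```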

License Summary

This sample code is provided under the MIT-0 license. See the LICENSE file.
