zjykzj/pytorch-distributed
Language: 🇺🇸 🇨🇳

«pytorch-distributed» uses PyTorch's DistributedDataParallel to implement distributed training, and AMP (automatic mixed precision) to implement mixed-precision training

At present, only the single-machine, multi-GPU scenario is considered

Table of Contents

  • Background
  • Install
  • Usage
  • Maintainers
  • Thanks
  • Contributing
  • License

Background

Distributed training makes full use of the computing power of multiple GPUs to train better model parameters faster. Mixed-precision training, in turn, speeds up training while also reducing memory usage during training, which allows larger batch sizes.
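Conceptually, mixed-precision training in PyTorch wraps the forward pass in an autocast context and scales the loss before the backward pass. The following is a minimal illustrative sketch using torch.cuda.amp; the model, data, and hyperparameters are placeholders and are not taken from this repository.

```python
# Minimal mixed-precision (AMP) training sketch.
# Illustrative only: the model, data, and hyperparameters below are placeholders.
# Requires a CUDA-capable GPU.
import torch
import torch.nn as nn

device = torch.device("cuda")
model = nn.Linear(128, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()     # scales the loss to avoid FP16 gradient underflow

for _ in range(10):
    inputs = torch.randn(32, 128, device=device)
    targets = torch.randint(0, 10, (32,), device=device)

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # run the forward pass in mixed precision
        outputs = model(inputs)
        loss = criterion(outputs, targets)

    scaler.scale(loss).backward()        # backward on the scaled loss
    scaler.step(optimizer)               # unscale gradients, then optimizer step
    scaler.update()                      # adjust the loss scale for the next iteration
```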

Install

$ pip install -r requirements.txt

Usage

At present, four training scenarios are implemented (an illustrative multi-GPU launch sketch follows the list):

  • Single-GPU training
  • Multi-GPU training
  • Single-GPU mixed-precision training
  • Multi-GPU mixed-precision training
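For the multi-GPU mixed-precision case, combining DistributedDataParallel with AMP typically looks like the sketch below. The script name train_ddp_amp.py, the model, the data, and the torchrun launch command are illustrative assumptions, not the actual entry points of this repository.

```python
# Minimal single-machine, multi-GPU DDP + AMP sketch (illustrative only).
# Hypothetical launch command, assuming 2 GPUs:
#   torchrun --nproc_per_node=2 train_ddp_amp.py
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets LOCAL_RANK (and the rendezvous env vars) for each process
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    device = torch.device("cuda", local_rank)
    model = nn.Linear(128, 10).to(device)
    model = DDP(model, device_ids=[local_rank])   # wrap the model for gradient sync

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()
    scaler = torch.cuda.amp.GradScaler()

    for _ in range(10):
        inputs = torch.randn(32, 128, device=device)
        targets = torch.randint(0, 10, (32,), device=device)

        optimizer.zero_grad()
        with torch.cuda.amp.autocast():           # mixed-precision forward pass
            loss = criterion(model(inputs), targets)
        scaler.scale(loss).backward()             # DDP averages gradients across processes
        scaler.step(optimizer)
        scaler.update()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```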

Maintainers

  • zhujian - Initial work - zjykzj

Thanks

Contributing

Contributions of any kind are welcome! Feel free to open an issue or submit a PR.

Small note: If editing the README, please conform to the standard-readme specification.

License

Apache License 2.0 © 2020 zjykzj