Multi-GPU implementation #5

Open
Lyken17 opened this issue Mar 24, 2017 · 4 comments

Comments

@Lyken17

Lyken17 commented Mar 24, 2017

Hi author

Thanks for sharing your code. I noticed the README says "Multi-gpu help wanted". If you mean data parallelism, it can be implemented in a few lines in PyTorch using nn.DataParallel.

In your train.py, around line 82, simply change

    if args.cuda:
        net = net.cuda()

to

    if args.cuda:
        net = net.cuda()
        # the keyword is device_ids; each listed GPU receives a slice of the batch
        net = nn.DataParallel(net, device_ids=[0,1,2,3])

This makes the whole model run data-parallel across the listed GPUs.
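
For reference, here is a minimal self-contained sketch of the same wrapping on a toy model. Assumptions: `make_parallel` and `toy_net` are hypothetical names that do not come from train.py, and the sketch uses however many GPUs `torch.cuda.device_count()` reports instead of hard-coding four, leaving the model on the CPU if there are none.

```python
import torch
import torch.nn as nn


def make_parallel(net: nn.Module) -> nn.Module:
    """Move `net` to GPU and wrap it in nn.DataParallel if several GPUs are visible."""
    n_gpus = torch.cuda.device_count()
    if n_gpus == 0:
        return net  # CPU only: leave the model untouched
    net = net.cuda()
    if n_gpus > 1:
        # device_ids lists the GPUs that each receive a slice of the batch
        net = nn.DataParallel(net, device_ids=list(range(n_gpus)))
    return net


# Hypothetical stand-in for the real network built in train.py
toy_net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
toy_net = make_parallel(toy_net)

# Put the input on the same device as the model parameters
device = next(toy_net.parameters()).device
batch = torch.randn(32, 8, device=device)
out = toy_net(batch)  # with DataParallel, the batch is split along dim 0
print(out.shape)
```

Hard-coding `device_ids=[0,1,2,3]` should fail on a machine with fewer than four GPUs, which is why this sketch derives the list from the device count.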

@Lyken17
Author

Lyken17 commented Mar 24, 2017

Was there a particular concern that kept you from implementing it, such as gradient correctness, numerical stability, or convergence? I just migrated from Torch and have heard that PyTorch still has some bugs.

@varun-suresh

I had the same question. @Lyken17, did you train on multiple GPUs?

@bamos
Owner

bamos commented May 4, 2017

Hi, I didn't try training on multiple GPUs. The issues @Lyken17 mentions could in principle occur, but I wouldn't expect them to.
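
On the gradient-correctness concern, a toy check in plain Python may help with intuition. It is only a sketch of the arithmetic behind data parallelism, not PyTorch's actual implementation: the one-parameter model `w * x`, the squared-error loss, and the sample data are all made up for illustration. The point is that when each replica computes gradients on its own shard of the batch and the partial results are summed, the total matches the full-batch gradient up to floating-point rounding.

```python
import math


def grad_sum(w, xs, ys):
    """Sum of per-sample gradients of (w*x - y)**2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


# Made-up data for a one-parameter model y ~ w * x
w = 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
n = len(xs)

# Single device: gradient of the mean loss over the full batch
full_grad = grad_sum(w, xs, ys) / n

# Two simulated replicas: each handles half the batch, partial sums are added
half = n // 2
shard_grads = [
    grad_sum(w, xs[:half], ys[:half]),
    grad_sum(w, xs[half:], ys[half:]),
]
parallel_grad = sum(shard_grads) / n

print(math.isclose(full_grad, parallel_grad))
```

Any remaining difference comes from the order of floating-point additions, which is why the comparison uses `math.isclose` rather than `==`.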

@Lyken17
Author

Lyken17 commented Jul 19, 2017

Hello @varun-suresh @bamos, although this reply comes two months late, I want to let you know that multi-GPU training works as expected in PyTorch.

I used the default settings in the code. With a single GPU I get an error rate of 5.01%; with 2 GPUs I get an error rate of 4.67%. Experiments on 3 and 4 GPUs are under way, and I expect them to converge well. I will open a PR once the experiments finish.

  • One GPU
    [figure: 1 GPU]

  • Two GPUs
    python train.py --gpus 0,1 54374.10s user 3590.60s system 143% cpu 11:12:57.53 total
    [figure: loss-error]

  • Three GPUs
    [figure: loss-error]
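
The `--gpus 0,1` flag in the command above is not shown in code in this thread, so here is one way such a flag might be parsed. This is a sketch under assumptions: `parse_gpu_ids` is a hypothetical helper, and the default of `[0]` and the duplicate check are illustrative choices rather than anything taken from the actual train.py.

```python
import argparse
from typing import List


def parse_gpu_ids(text: str) -> List[int]:
    """Turn a comma-separated string such as "0,1" into [0, 1]."""
    ids = [int(part) for part in text.split(",") if part.strip()]
    if any(i < 0 for i in ids):
        raise argparse.ArgumentTypeError(f"negative GPU id in {text!r}")
    if len(set(ids)) != len(ids):
        raise argparse.ArgumentTypeError(f"duplicate GPU id in {text!r}")
    return ids


parser = argparse.ArgumentParser()
# Hypothetical flag definition; the default of a single GPU is an assumption
parser.add_argument("--gpus", type=parse_gpu_ids, default=[0],
                    help="comma-separated GPU ids, e.g. 0,1")

# Simulate the command line from the two-GPU run
args = parser.parse_args(["--gpus", "0,1"])
print(args.gpus)  # [0, 1]
```

The resulting list could then be passed as `device_ids` to `nn.DataParallel`.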

PS: I love the 1080 Ti, the most cost-efficient card! I can buy more cards for the same money.
