
Enable selection of GPU on inference page #1511

Open · wants to merge 2 commits into master

Conversation

rodrigoberriel (Contributor)

This PR attempts to fix #1418, see #1418 (comment). I tried to minimize changes to avoid side-effects.

If the user has at most one GPU (or DIGITS can only see one GPU through CUDA_VISIBLE_DEVICES), nothing changes:

[screenshot: the inference page as it looks today, with no GPU selector]

But if the user has multiple GPUs (2 or more), it is going to look like this:

[screenshot: the inference page with the new GPU selection form]

where Next Available is the default choice and has the same behavior as the current implementation. If the user selects a GPU (only one can be selected), then this GPU will be used.

PS: I modified images/generic/show.html the same way I did images/classification/show.html, but I didn't test the former as thoroughly as the latter.

else get_device(index).totalGlobalMem)
),
) for index in config_value('gpu_list').split(',') if index],
default='next',
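The diff fragment above builds the choices for a GPU selection field. As a hedged, self-contained sketch of how that list might be assembled (the DIGITS helpers config_value and get_device are stubbed here as assumptions, since only their call sites appear in the fragment):

```python
# Hypothetical stand-ins for the DIGITS helpers used in the fragment
# above; the real ones read the configuration and query CUDA devices.
def config_value(key):
    return {'gpu_list': '0,1'}[key]  # assume two configured GPUs


class _Device(object):
    def __init__(self, name, total_mem):
        self.name = name
        self.totalGlobalMem = total_mem


def get_device(index):
    # Fake device table; real code would ask the CUDA runtime.
    return {
        '0': _Device('GeForce GTX 1080', 8 * 2**30),
        '1': _Device('GeForce GTX 1070', 8 * 2**30),
    }[index]


# 'next' ("Next Available") comes first and is the default, matching the
# current behavior; each visible GPU becomes one selectable choice.
choices = [('next', 'Next Available')] + [
    (index, '#%s - %s (%.1f GB)' % (
        index,
        get_device(index).name,
        get_device(index).totalGlobalMem / float(2**30)))
    for index in config_value('gpu_list').split(',') if index
]
```

With a WTForms SelectField, these tuples would be passed as choices=... together with default='next', as in the diff.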
lukeyeager (Member)

What happens here if there are no GPUs on the system?

rodrigoberriel (Contributor, Author)

@lukeyeager the same behavior you'd get if you started a training job without a GPU:

  1. If everything was set up CPU-only, you get the same page you currently get (without the multi-GPU form).
  2. If Caffe was compiled with CUDA but there is no GPU on the system (suppose you removed it afterwards -- I tested this by masking all my GPUs with an invalid id in CUDA_VISIBLE_DEVICES), you get the page without the multi-GPU form and the job fails exactly as a training job currently fails:
Check failed: error == cudaSuccess (38 vs. 0)  no CUDA-capable device is detected
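The masking trick described above can be reproduced by setting CUDA_VISIBLE_DEVICES before launching the process. A minimal sketch, assuming '-1' as one example of an id that matches no device:

```python
import os

# Hide every GPU from CUDA-enabled child processes by setting
# CUDA_VISIBLE_DEVICES to an id no device has ('-1' is one common choice).
env = dict(os.environ, CUDA_VISIBLE_DEVICES='-1')

# A CUDA-built Caffe job launched with this environment, e.g.
#   subprocess.Popen(train_cmd, env=env)
# would then detect no CUDA devices and fail with cudaError 38.
```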

lukeyeager (Member)

Sounds reasonable to me!

gheinrich (Contributor) left a comment

Sorry for the late response, and thanks for another great PR! Do you think you could add some unit tests to verify the new functionality?

return flask.render_template(
'models/images/generic/show.html',
form=generic_form,
gheinrich (Contributor)

I am a bit uneasy about passing the form here when all you need is the list of GPUs. Wouldn't it be more explicit and self-explanatory to pass the list instead?

rodrigoberriel (Contributor, Author)

I agree. Do you think this way is okay?
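One way to make that handoff explicit is to compute only the variables the template needs. A sketch under the assumption that the form's GPU choices are at hand (template_context and the variable names are hypothetical, not DIGITS code):

```python
# Hypothetical helper: derive the minimal template variables from the
# form's GPU choices instead of passing the whole WTForms form object.
def template_context(gpu_choices):
    # 'next' is the pseudo-choice for "Next Available", not a real GPU.
    gpus = [c for c in gpu_choices if c[0] != 'next']
    return {
        'gpus': gpus,
        # Only show the selector when there is actually a choice to make.
        'show_multi_gpu_form': len(gpus) > 1,
    }

# Usage (names assumed): flask.render_template(
#     'models/images/generic/show.html', **template_context(choices))
```

The template then reads plain values rather than reaching into the form object.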

{% if show_multi_gpu_form %}
<div class="col-sm-6">
<div class="form-group{{mark_errors([form.select_one_of_gpus])}}">
{{form.select_one_of_gpus.label}}
gheinrich (Contributor)

It's indeed better to use the WTForms fields here, but it's inconsistent with the rest of the file... so I'm not sure about that.

rodrigoberriel (Contributor, Author)

I've changed that too. What do you think?

rodrigoberriel (Contributor, Author)

@gheinrich I'm on a deadline (March 27), but I could work on the unit tests after that. I have to say I'm not familiar with unit testing in Flask, but I can give it a try.

RSly commented Jul 27, 2017

Hi, any update on the merge for this PR?
tnx

Development: successfully merging this pull request may close the issue "Define which GPU for inference?"
4 participants