This repository has been archived by the owner on May 31, 2024. It is now read-only.

slow on gpu / parallel mode #3

Open
ghost opened this issue Oct 31, 2020 · 2 comments

Comments

@ghost

ghost commented Oct 31, 2020

Hi,

Hope you are all well !

I forked your code and created a Flask server for generating questions from webpages I scrape. (And, of course, I convert the HTML into clean text ^^)

It takes a long time (120s on average) to generate questions (sentences only), even though CUDA is available.

Is there a way to optimise the processing time? I have 3 GPUs on my server; is it possible to enable parallel or distributed mode for question_generator?

Cheers,
X

@AMontgomerie
Owner

Hey!

That does sound like quite a long time! Currently question_generator doesn't support multiple GPUs, but I suppose it should be possible using torch.distributed.

To be honest I don't know much about it, and these tutorials seem to cover distributed training rather than inference, but they might still help. I don't currently have access to an environment with multiple GPUs to do any testing, though.
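Even without torch.distributed, one simple pattern for multi-GPU inference is to replicate the model on each device and round-robin batches across the replicas; since CUDA kernel launches are asynchronous, consecutive batches dispatched to different GPUs can run concurrently. A minimal sketch, assuming nothing about question_generator's internals — `parallel_generate`, the toy `nn.Linear` model, and the random batches are all hypothetical stand-ins for the real seq2seq model and tokenized inputs:

```python
import copy

import torch
from torch import nn


def parallel_generate(model: nn.Module, batches, devices=None):
    """Run inference batches round-robin across several devices.

    Hypothetical helper (not part of question_generator): one replica
    of `model` is placed on each device, and batch i goes to replica
    i % len(devices). Falls back to CPU when no GPU is present.
    """
    if devices is None:
        n = torch.cuda.device_count()
        devices = [f"cuda:{i}" for i in range(n)] if n > 0 else ["cpu"]
    # One independent copy of the model per device.
    replicas = [copy.deepcopy(model).to(d).eval() for d in devices]
    outputs = []
    with torch.no_grad():
        for i, batch in enumerate(batches):
            idx = i % len(devices)
            out = replicas[idx](batch.to(devices[idx]))
            outputs.append(out.cpu())  # gather results back on the host
    return outputs


# Toy usage: a linear layer standing in for the generator model.
toy_model = nn.Linear(16, 8)
toy_batches = [torch.randn(4, 16) for _ in range(5)]
results = parallel_generate(toy_model, toy_batches)
```

This keeps the replicas fully independent, so it avoids the synchronization overhead of torch.nn.DataParallel, at the cost of holding one full copy of the model per GPU.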

@AMontgomerie
Owner

Another possibility for speeding up inference would be exporting the model to ONNX.
