This repository has been archived by the owner on Jan 3, 2023. It is now read-only.
Dynamic shape for input like NLP #4957
Comments
I thought of dynamic shapes as being like C++ vectors: there's allocated space, and something separate that indicates how much of that space you are actually using. You might cache compiled versions for particular combinations of maximum input sizes. If you are running a server, dynamic batch size can help with latency, since you only need to compute for as many samples as you have ready. You could also imagine kernels that use the actual sample lengths to reduce the computation in the transformer GEMMs (though some hardware skips zero arithmetic, and some hardware computes a partial GEMM tile in the same time as a full tile, so it wouldn't matter for them). Whatever Intel did with nGraph along those lines would be in the OpenVINO repo.
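To make the vector analogy concrete, here is a minimal pure-Python sketch of the idea, not anything taken from nGraph or OpenVINO. The names `BUCKETS`, `pick_bucket`, and `ShapeCache` are hypothetical, and the "compile" step is a stand-in for a real static-shape compilation. It shows a cache keyed by maximum sizes (the capacity), with the actual lengths passed separately so the kernel can skip the padding.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical capacities to compile for; a real system would tune these.
BUCKETS: Tuple[int, ...] = (8, 16, 32, 64)


def pick_bucket(length: int, buckets: Tuple[int, ...] = BUCKETS) -> int:
    """Return the smallest capacity that can hold `length` items."""
    for cap in buckets:
        if length <= cap:
            return cap
    raise ValueError(f"length {length} exceeds largest bucket {buckets[-1]}")


class ShapeCache:
    """Keeps one 'compiled' kernel per (batch capacity, sequence capacity)."""

    def __init__(self) -> None:
        self.kernels: Dict[Tuple[int, int], Callable] = {}
        self.compile_count = 0

    def _compile(self, batch_cap: int, seq_cap: int) -> Callable:
        # Stand-in for an expensive compile specialized to fixed max sizes.
        self.compile_count += 1

        def kernel(padded: List[List[int]], lengths: List[int]) -> List[int]:
            # Only the first lengths[i] entries of each row are real data,
            # so the work stops there instead of covering the full capacity.
            return [sum(row[:n]) for row, n in zip(padded, lengths)]

        return kernel

    def run(self, batch: List[List[int]]) -> List[int]:
        if not batch:
            return []
        lengths = [len(row) for row in batch]
        batch_cap = pick_bucket(len(batch))
        seq_cap = pick_bucket(max(lengths))
        key = (batch_cap, seq_cap)
        if key not in self.kernels:
            self.kernels[key] = self._compile(batch_cap, seq_cap)
        # Pad each row to seq_cap, then add empty rows up to batch_cap.
        padded = [row + [0] * (seq_cap - len(row)) for row in batch]
        padded += [[0] * seq_cap for _ in range(batch_cap - len(batch))]
        lengths = lengths + [0] * (batch_cap - len(batch))
        # Drop the results that belong to padding rows.
        return self.kernels[key](padded, lengths)[: len(batch)]
```

With this layout, two requests whose sizes fall in the same buckets reuse one compiled kernel, and only a request that crosses a bucket boundary triggers a new compile. Whether passing the lengths actually saves time depends on the hardware, as noted above.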
Hi, I have a question:
I know that nGraph supports dynamic shapes for inputs, but is there any improvement for such models (especially NLP models like BERT)?