Question about output type of QuantizeLinear #3844
What does it look like in ONNX?
You implemented a LayerNorm plugin? TRT has native support for LayerNorm.
@zerollzeng thank you, and sorry for the late reply. I will try the native LayerNorm support later. But what I really want to ask now is how to serialize the "manually fused Q/DQ ONNX". As seen below, the left is the ONNX with a couple of Q/DQ nodes, and the right is the fused ONNX. First, when I serialize the model with only kINT8 supported in the TRT plugin, it produced the error:
That is why I asked why the output type of QuantizeLinear is float. Second, if I also serialize the model with kFLOAT supported in the TRT plugin, it produced the error:
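As background on the "output type of QuantizeLinear" question: in the ONNX spec, QuantizeLinear's output dtype follows its zero_point tensor (int8 or uint8), not float, so a float type in the TRT log usually reflects how TRT represents the tensor after Q/DQ fusion rather than the ONNX type. A minimal pure-Python sketch of the ONNX QuantizeLinear formula for an int8 zero_point (names here are illustrative, not from any library):

```python
def quantize_linear(x, scale, zero_point=0):
    # ONNX QuantizeLinear: y = saturate(round(x / scale) + zero_point),
    # saturated to the zero_point's dtype (int8 here: [-128, 127]).
    # Python's round() is half-to-even, matching ONNX rounding behavior.
    q = round(x / scale) + zero_point
    return max(-128, min(127, int(q)))

print(quantize_linear(0.5, 0.01))    # 50, well inside int8 range
print(quantize_linear(10.0, 0.01))   # saturates to 127
print(quantize_linear(-10.0, 0.01))  # saturates to -128
```

The output is an int8 value by construction; a "float" output type reported for a Q node therefore points at the engine's internal tensor typing, not the ONNX operator definition.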
With TRT-8531 and CUDA 11.3. Thank you very much.
Your Q/DQ placement looks wrong to me. Try Q -> MyOp -> DQ if your plugin runs in INT8.
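To illustrate the suggested placement: with Q -> MyOp -> DQ, the plugin consumes and produces int8 tensors, and the Q/DQ nodes handle the float boundary on either side. A pure-Python sketch of that dataflow (my_int8_op is a hypothetical stand-in for the custom plugin; here it is just the identity):

```python
def quantize(x, scale):
    # Q: float -> int8, saturating to [-128, 127]
    return max(-128, min(127, round(x / scale)))

def dequantize(q, scale):
    # DQ: int8 -> float
    return q * scale

def my_int8_op(q):
    # Hypothetical INT8 plugin body; identity for illustration.
    # In the Q -> MyOp -> DQ pattern it only ever sees int8 values.
    return q

scale = 0.01
x = 0.42
q = quantize(x, scale)          # int8 input to the plugin
y = dequantize(my_int8_op(q), scale)  # float output after DQ
print(q, y)
```

With this placement, the plugin never needs to advertise kFLOAT support: both its input and output are int8, which matches a plugin that only declares kINT8 in supportsFormatCombination.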
Sorry, I didn't understand what this means.
Hello, why does the log show the output type of QuantizeLinear as float?