
Question about output type of QuantizeLinear #3844

Open
cc-sketch opened this issue May 6, 2024 · 6 comments
Assignees
Labels
triaged Issue has been triaged by maintainers

Comments

@cc-sketch

cc-sketch commented May 6, 2024

Hello, why does the log show that the output type of QuantizeLinear is Float?
[screenshot: 20240506-211227]

@cc-sketch
Author

Also, when I serialize the manually fused QDQ ONNX model with only the int8 in/out type supported in the plugin, like
bool supportsFormatCombination(int32_t pos, const PluginTensorDesc* inOut, int32_t nbInputs, int32_t nbOutputs) NOEXCEPT override
{
    switch (pos)
    {
    case 0:
        return (inOut[0].type == DataType::kINT8)
            && (inOut[0].format == TensorFormat::kCHW4 || inOut[0].format == TensorFormat::kCHW32);
    case 1:
        return (inOut[pos].type == inOut[0].type) && (inOut[pos].format == inOut[0].format);
    default:
        return false;
    }
    return false;
}

it fails with a log like

[05/06/2024-21:04:56] [TRT] [E] 9: [pluginV2Builder.cpp::reportPluginError::23] Error Code 9: Internal Error (LayerNorm_4: could not find any supported formats consistent with input/output data types)
[screenshot: 20240506-211947]
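
A related detail worth checking: an int8-in/int8-out plugin also reports its output type through getOutputDataType. A minimal sketch of what that override could look like, assuming a single output whose type should match input 0 (this code is not from the issue):

// Sketch only: report the plugin output type so the builder does not expect a float output.
// Assumes one output with the same data type as input 0.
DataType getOutputDataType(int32_t index, const DataType* inputTypes, int32_t nbInputs) const NOEXCEPT override
{
    // For an int8-in/int8-out plugin this returns DataType::kINT8.
    return inputTypes[0];
}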

@zerollzeng
Collaborator

> Hello, why does the log show that the output type of QuantizeLinear is Float?

What does it look like in the ONNX model?

@zerollzeng
Collaborator

You implemented a LayerNorm plugin? TRT has native support for it.
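
For reference, a minimal sketch of using the native normalization layer through the C++ network API, assuming TensorRT 8.6 or newer where INetworkDefinition::addNormalization is available (the tensor names and the axes mask below are placeholders, not from this issue):

// Sketch only: native LayerNorm via INetworkDefinition::addNormalization (TRT >= 8.6).
// `input`, `scale`, and `bias` are assumed to be ITensor* already added to `network`.
uint32_t const axesMask = 1u << 2; // e.g. normalize over the last axis of a 3-D tensor
nvinfer1::INormalizationLayer* ln = network->addNormalization(*input, *scale, *bias, axesMask);
ln->setEpsilon(1e-5f); // match the epsilon used during training

On older releases such as 8.5 this layer is not available, so a plugin or an upgrade would still be needed there.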

@zerollzeng zerollzeng self-assigned this May 12, 2024
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label May 12, 2024
@cc-sketch
Author

cc-sketch commented May 14, 2024

@zerollzeng Thank you, and sorry for the late reply. I will try the LayerNorm plugin later. But what I really want to ask now is how to serialize the "manually fused QDQ ONNX". As seen below, the left is the ONNX model with a couple of Q/DQ nodes, and the right is the fused ONNX model.
[screenshots: qlayernorm, qdq]

First, when I serialize the model with only the kINT8 DataType supported in the TRT plugin, as
[screenshot: int8_only]

It produced the error:

[pluginV2Builder.cpp::reportPluginError::23] Error Code 9: Internal Error (MyOp_3: could not find any supported formats consistent with input/output data types)

That is why I asked "why is the output type of QuantizeLinear float".

And if I serialize the model with the kFLOAT DataType also supported in the TRT plugin, as
[screenshot: fp16]

It produced the error:

[optimizer.cpp::filterQDQFormats::4422] Error Code 2: Internal Error (Assertion !n->candidateRequirements.empty() failed. All of the candidates were removed, which points to the node being incorrectly marked as an int8 node.)

This is with TensorRT 8.5.3.1 and CUDA 11.3. Thank you very much.
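
For context, a sketch of what "kFLOAT DataType supported also" is assumed to mean in supportsFormatCombination, since the screenshot is not reproduced here (the linear format for float is an assumption, not taken from the issue):

// Sketch only: accepting both int8 (CHW4/CHW32) and float (linear) at the plugin boundary.
bool supportsFormatCombination(int32_t pos, const PluginTensorDesc* inOut, int32_t nbInputs, int32_t nbOutputs) NOEXCEPT override
{
    switch (pos)
    {
    case 0:
        return (inOut[0].type == DataType::kINT8
                   && (inOut[0].format == TensorFormat::kCHW4 || inOut[0].format == TensorFormat::kCHW32))
            || (inOut[0].type == DataType::kFLOAT && inOut[0].format == TensorFormat::kLINEAR);
    case 1:
        // Output must match the input type and format.
        return (inOut[pos].type == inOut[0].type) && (inOut[pos].format == inOut[0].format);
    default:
        return false;
    }
}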

@zerollzeng
Collaborator

zerollzeng commented May 19, 2024

Your Q/DQ placement looks wrong to me; try Q -> MyOp -> DQ if your plugin runs in int8.

@cc-sketch
Author

Sorry, I didn't understand what this means.
[screenshot]
I manually fused the model from Q -> DQ -> MyOp -> Q -> DQ into Q -> MyOp -> DQ, following https://forums.developer.nvidia.com/t/where-do-tensorrt-plugin-determine-whether-fuse-qdq-or-not/226705/4
[screenshot]
So, regarding my question, what is wrong? Is something wrong with the quantized model generation, or with the MyOp plugin? It has confused me for a long time.
