The quantization parameters of encodings are inconsistent with the quantization parameters in embeeding.onnx #2680
Comments
Are you passing a config file that aligns with the HTP hardware to the quantization sim API?
Yes. Although the quantization parameters in embed_onnx and in the encodings are inconsistent, the quantization parameters in the cpp file produced by qnn_convert_onnx are the same. Even so, the results on HTP are still inconsistent: the maximum difference between the integer (int8) outputs exceeds 10.
@hcqylymzc This can happen when the nodes in the resulting QNN model (defined in the cpp file) are not aligned with the encodings exported from AIMET. One of the main reasons is that the converter applies various optimizations, which can produce a different set of nodes from the one the encodings were generated for in AIMET. That mismatch leads to a drop in accuracy on target. To see what's going on in your case, can you please provide the following information?
Hi, I am trying to align accuracy between the x86 platform and the HTP platform. I found that the quantization parameters in the encodings written by sim.export are inconsistent with the quantization parameters in x_embeed.onnx, which contains QDQ nodes. Why is this? How are the quantization parameters calculated?
I also tried manually extracting the quantization parameters from x_embeed.onnx, but the accuracy on the HTP platform with the extracted encodings differed from the accuracy with the encodings exported by AIMET. Why is this? Is there any way to align the accuracy of x_embeed.onnx with that of onnx+encodings?
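For reference when comparing the two sets of parameters, here is a minimal sketch of textbook asymmetric (affine) quantization, showing how a scale and zero point can be derived from a min/max range. It is not AIMET's or the QNN converter's actual implementation, and the function names are hypothetical. It assumes an 8-bit grid and illustrates one possible source of apparent mismatch: the same range yields different integer zero points depending on whether the grid is stored as unsigned or signed, so comparing the dequantized ranges may be more telling than comparing the raw integers.

```python
import math


def range_to_scale_zero_point(enc_min, enc_max, bitwidth=8, signed=False):
    """Derive (scale, zero_point) from a float range (textbook affine scheme)."""
    num_steps = 2 ** bitwidth - 1
    scale = (enc_max - enc_min) / num_steps
    # Lowest integer on the grid: 0 for unsigned, -2^(b-1) for signed.
    qmin = -(2 ** (bitwidth - 1)) if signed else 0
    zero_point = qmin - round(enc_min / scale)
    return scale, zero_point


def scale_zero_point_to_range(scale, zero_point, bitwidth=8, signed=False):
    """Recover the float range represented by (scale, zero_point)."""
    qmin = -(2 ** (bitwidth - 1)) if signed else 0
    qmax = qmin + 2 ** bitwidth - 1
    return scale * (qmin - zero_point), scale * (qmax - zero_point)


def same_range(range_a, range_b, rel_tol=1e-6):
    """Compare two (min, max) pairs in float space rather than integer space."""
    return all(math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(range_a, range_b))


# Hypothetical range, chosen only for illustration.
scale_u, zp_u = range_to_scale_zero_point(-1.0, 3.0, signed=False)
scale_s, zp_s = range_to_scale_zero_point(-1.0, 3.0, signed=True)

range_u = scale_zero_point_to_range(scale_u, zp_u, signed=False)
range_s = scale_zero_point_to_range(scale_s, zp_s, signed=True)
```

In this sketch the unsigned and signed zero points differ by 128 while `same_range(range_u, range_s)` holds, so two parameter sets that look inconsistent as integers can still describe the same quantization grid. Whether that applies here depends on how each tool stores its offset, which would need to be checked against the actual files.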