Add squeeze / unsqueeze operations to quant invariant functions in torch_handler.py
#891
Comments
Hi @nickfraser, you wanted to add squeeze/unsqueeze operations to the quant_invariant_handler function, right?
When dealing with squeeze/unsqueeze, we also have to handle the shapes of the scale factors and zero points. Related to #728.
For per-channel quantization, the squeeze/unsqueeze op is more like the permute op, in that there is an easy way to modify the QuantTensor to keep the op affine-quantization invariant: all we need to do is squeeze/unsqueeze the scale and zero-point tensors accordingly. However, the ops mentioned in #728 (reshape, flatten) are non-trivial. There is no trivial way to modify the QuantTensor to keep those ops affine-quantization invariant, and recalculating the scale and zero point is inevitable. We may need to dequantize --> reshape/flatten --> requantize to get around this, at the price of some precision loss. It looks like PyTorch doesn't solve this problem either: it doesn't offer a quantized version of flatten(), and simply uses torch.flatten() instead (see the PyTorch Quantization API Reference).
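To make that concrete, here is a minimal sketch of the idea, not the actual Brevitas `torch_handler.py` implementation; the function names and the `(x, scale, zero_point)` argument layout are hypothetical and stand in for the fields of a QuantTensor:

```python
import torch

# Sketch only: for per-channel quantization, squeeze/unsqueeze stays
# affine-quantization invariant as long as the same shape change is
# applied to the scale and zero-point tensors, so that
# q = round(x / scale) + zero_point still broadcasts correctly.

def unsqueeze_quant_invariant(x, scale, zero_point, dim):
    # Apply the identical unsqueeze to value, scale, and zero point.
    return x.unsqueeze(dim), scale.unsqueeze(dim), zero_point.unsqueeze(dim)

def squeeze_quant_invariant(x, scale, zero_point, dim):
    # Same idea in the other direction.
    return x.squeeze(dim), scale.squeeze(dim), zero_point.squeeze(dim)

# By contrast, reshape/flatten can merge the per-channel axis with other
# axes, so no shape change of scale/zero_point preserves the mapping.
# A fallback is to dequantize, apply the shape op in float, and then
# requantize (recomputing scale/zero_point), accepting some precision loss:
def flatten_via_dequant(x_int, scale, zero_point, start_dim=1):
    x_fp = (x_int - zero_point) * scale   # dequantize
    return torch.flatten(x_fp, start_dim) # shape op in float; requantize after
```

The key design point is that squeeze/unsqueeze only inserts or removes size-1 dimensions, so the per-channel axis is never merged with another axis and the scale/zero-point tensors can track the shape change exactly.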
A PR has been submitted to solve this issue. Your comments are highly appreciated, many thanks.