QAT : TRT 8 compatible workflow #804

Draft
wants to merge 33 commits into base: master
Conversation

@SrivastavaKshitij (Contributor) commented Sep 13, 2022

Hi @jaybdub

I am introducing a new QAT workflow that is compatible with TensorRT 8.

TensorRT 8 introduced IQuantizeLayer and IDequantizeLayer, which have to be placed in the network manually, following NVIDIA's Q/DQ placement guidelines.
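For context, a minimal sketch of what emitting such a Q/DQ pair looks like with the TensorRT 8 Python API (the helper name `add_qdq` and the per-tensor scale shape are assumptions for illustration, not code from this PR):

```python
import numpy as np
import tensorrt as trt

def add_qdq(network, tensor, scale_value):
    # Q/DQ scales must be build-time constants, supplied as ITensors.
    scale = network.add_constant(
        (1,), np.ascontiguousarray([scale_value], dtype=np.float32)
    ).get_output(0)

    # IQuantizeLayer: float -> INT8 using the given scale.
    q = network.add_quantize(input=tensor, scale=scale)
    # IDequantizeLayer: INT8 -> float, so downstream layers see float values
    # and TensorRT can fuse the Q/DQ pair into INT8 kernels where profitable.
    dq = network.add_dequantize(input=q.get_output(0), scale=scale)
    return dq.get_output(0)
```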

I have added support to quantize nn.Conv2d, nn.MaxPool2d and nn.AdaptiveAvgPool2d - the layers needed to quantize ResNet(s). I have also added a QuantGenericTensor module that can be used to add a Q/DQ pair anywhere in the model, following NVIDIA's guidelines.
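As a rough illustration of the idea behind QuantGenericTensor, here is a sketch built on NVIDIA's pytorch-quantization toolkit (which ships in the NGC containers); the class name and wiring are assumptions, not the PR's actual implementation:

```python
import torch.nn as nn
from pytorch_quantization.nn import TensorQuantizer
from pytorch_quantization.tensor_quant import QuantDescriptor

class QuantGenericTensorSketch(nn.Module):
    """Fake-quantizes whatever tensor flows through it, so a Q/DQ pair can be
    dropped at an arbitrary point in the graph (e.g. before a residual add or
    a pooling layer, per NVIDIA's placement guidelines)."""

    def __init__(self, quant_desc=QuantDescriptor(num_bits=8)):
        super().__init__()
        self._input_quantizer = TensorQuantizer(quant_desc)

    def forward(self, x):
        return self._input_quantizer(x)
```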

This PR also introduces the option to choose between per-tensor and per-channel quantization. All quant layers are scriptable with torch.jit.script.
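For reference, selecting between the two modes with pytorch-quantization descriptors could look roughly like this (the exact option names used by this PR may differ):

```python
from pytorch_quantization.tensor_quant import QuantDescriptor

# One scale for the whole tensor.
per_tensor_desc = QuantDescriptor(num_bits=8, axis=None)

# One scale per output channel (axis 0 of a Conv2d weight tensor).
per_channel_desc = QuantDescriptor(num_bits=8, axis=(0,))
```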

Most of the files I have added or modified are under the contrib folder, so the changes don't affect the main torch2trt library.

I will continue to add support for more layers, but I believe this PR is already big enough to land; I can follow up with smaller PRs that add more functionality.

The entire workflow has been tested with the PyTorch NGC container 22.04-py3.

Thanks.

@SrivastavaKshitij changed the title from "[WIP] QAT : TRT 8 compatible workflow" to "QAT : TRT 8 compatible workflow" on Sep 16, 2022
Review comment on the converter diff (excerpt; the opening of the call is cut off on the page):

```python
    input = input_quantizer.get_output(0),
    scale = scale_trt.get_output(0))

if hasattr(module._input_quantizer, 'quant_axis'):
```

This seems like it can be simplified by re-using the result from the if block at line 25

@SrivastavaKshitij marked this pull request as draft on October 17, 2022 at 21:29