
Issues: NVIDIA/TensorRT-LLM

Is it "INT8 or FP8" with "--use_weight_only --weight_only_precision int8 --qformat fp8" bug Something isn't working quantization Issue about lower bit quantization, including int8, int4, fp8 question Further information is requested
#1810 opened Jun 19, 2024 by aiiAtelier
2 of 4 tasks
prompt_vocab_size is ignored in executor API (labels: bug)
#1809 opened Jun 19, 2024 by thefacetakt
cluster key option not working? (labels: question, triaged)
#1807 opened Jun 19, 2024 by tonylek
Medusa with Mixtral 8x7B (labels: question)
#1798 opened Jun 18, 2024 by v-dicicco
CogVLM only supports one image as input, at a fixed position (labels: feature request, Investigating, question)
#1790 opened Jun 17, 2024 by littletomatodonkey
Qwen2 1.5B checkpoint conversion broken (labels: bug, triaged, waiting for feedback)
#1785 opened Jun 14, 2024 by yaysummeriscoming
Unable to convert LLaVA model to TensorRT (labels: triaged, waiting for feedback, wontfix)
#1776 opened Jun 13, 2024 by tanveer-sayyed
ChatGLM3 6B multi-batch inference fails with an error (labels: bug, Investigating)
#1775 opened Jun 13, 2024 by RobinJYM
InferenceRequest::serialize does not handle the logits post processor; it should log an error (labels: bug, triaged)
#1771 opened Jun 12, 2024 by DreamGenX
Failed to build w4a8_awq on Llama 13B (labels: bug, triaged, waiting for feedback)
#1770 opened Jun 12, 2024 by Hongbosherlock
How to identify the rest-token latency? (labels: benchmark, question, triaged)
#1761 opened Jun 11, 2024 by RobinJYM
Internlm2 only runs normally on adjacent GPUs (labels: bug, triaged, waiting for feedback)
#1759 opened Jun 10, 2024 by yuanphoenix
AWQ performance issue at higher batch sizes (labels: bug, quantization, triaged)
#1757 opened Jun 8, 2024 by canamika27