The reason behind the slow adoption of bf16 on CPU: for training, most models are trained on GPUs; for inference, int8 and int4 quantization have better support (no specialized hardware needed).
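As a sketch of that int8 path: dynamic quantization in ONNX Runtime is a one-call API that runs on stock CPUs. The file names below are hypothetical placeholders:

```python
# Dynamic int8 quantization with ONNX Runtime; no special hardware required.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model_fp32.onnx",   # hypothetical input model path
    model_output="model_int8.onnx",  # hypothetical output path
    weight_type=QuantType.QInt8,     # quantize weights to signed int8
)
```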
Describe the issue
MatMul in ONNX opset 13 started to support bfloat16 (https://onnx.ai/onnx/operators/onnx__MatMul.html).
However, we don't see a bfloat16 implementation for MatMul(13) in the CPU EP: https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/cpu/math/matmul.cc#L61-L89
Is there any reason this is still not supported, given that the opset was released a long time ago?
If we want to implement it ourselves, is there any PR we can reference?
Pinging @snnn @pranavsharma for help.
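For concreteness, here is a minimal sketch that should hit the gap: it builds a bfloat16 MatMul model with the ONNX helper API and tries to load it on the default CPU EP. Shapes are arbitrary, and the exact error text may vary by version:

```python
import onnx
import onnxruntime as ort
from onnx import TensorProto, helper

# Minimal graph: C = MatMul(A, B) with bfloat16 tensors, opset 13.
a = helper.make_tensor_value_info("A", TensorProto.BFLOAT16, [2, 4])
b = helper.make_tensor_value_info("B", TensorProto.BFLOAT16, [4, 3])
c = helper.make_tensor_value_info("C", TensorProto.BFLOAT16, [2, 3])
node = helper.make_node("MatMul", ["A", "B"], ["C"])
graph = helper.make_graph([node], "bf16_matmul", [a, b], [c])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)

# Expected to fail at session creation because the CPU EP registers no
# bfloat16 MatMul kernel (a "Could not find an implementation" style error).
try:
    ort.InferenceSession(model.SerializeToString(),
                         providers=["CPUExecutionProvider"])
    print("Session created: this build supports bf16 MatMul on CPU.")
except Exception as e:
    print("Session creation failed:", e)
```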
To reproduce
NA
Urgency
No response
Platform
Windows
OS Version
11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
NA
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response