
bf16 kernel (OpSet13) for MatMul in CPU EP #20630

Open
ZchiPitt opened this issue May 9, 2024 · 1 comment
Labels
core runtime (issues related to core runtime), feature request (request for unsupported feature or enhancement)

Comments

ZchiPitt commented May 9, 2024

Describe the issue

MatMul in ONNX opset 13 added support for bfloat16 (https://onnx.ai/onnx/operators/onnx__MatMul.html).

However, we don't see an implementation for bfloat16 in the CPU EP for MatMul(13): https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/cpu/math/matmul.cc#L61-L89

Is there any reason this is still not supported, given that the opset was released a long time ago?

If we want to implement it on our own, is there a PR I can reference?
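For illustration, a minimal sketch of the kind of registration that appears to be missing from matmul.cc, assuming the existing ONNX_CPU_OPERATOR_TYPED_KERNEL macro and a hypothetical MatMul&lt;BFloat16&gt; kernel specialization (illustrative only, not actual onnxruntime code):

```cpp
// Illustrative sketch only: how a bf16 MatMul kernel might be registered in the CPU EP,
// mirroring the existing float registration in matmul.cc. MatMul<BFloat16> is a
// hypothetical specialization that does not exist in the codebase today.
ONNX_CPU_OPERATOR_TYPED_KERNEL(
    MatMul,
    13,
    BFloat16,
    KernelDefBuilder().TypeConstraint("T", DataTypeImpl::GetTensorType<BFloat16>()),
    MatMul<BFloat16>);
```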

Ping @snnn @pranavsharma for help

To reproduce

NA

Urgency

No response

Platform

Windows

OS Version

11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

NA

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the platform:windows (issues related to the Windows platform) label on May 9, 2024
snnn added the feature request (request for unsupported feature or enhancement) and core runtime (issues related to core runtime) labels and removed the platform:windows (issues related to the Windows platform) label on May 9, 2024
tianleiwu (Contributor) commented May 10, 2024

The reason behind the slow adoption of bf16 on CPU: for training, most models are trained on GPU; for inference, int8 and int4 quantization have better support (and do not require specialized hardware).

If you want to implement it, I think you can use a hardware-specific library. For Intel CPUs, you can start from the following:
https://github.com/oneapi-src/oneDNN/blob/df3022638aaab0d1fdf62bc6ab16d9031739a0fc/src/cpu/gemm/gemm.cpp#L286
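As a concrete starting point, a minimal sketch of a bf16 matmul through the oneDNN matmul primitive (assuming the oneDNN 3.x C++ API; the shapes, the f32 destination, and the uint16_t stand-in for bf16 storage are illustrative assumptions, and the primitive may fall back to emulation or fail to build on CPUs without bf16 support):

```cpp
// Sketch: bf16 x bf16 -> f32 matmul via the oneDNN matmul primitive (oneDNN 3.x API).
#include <dnnl.hpp>
#include <cstdint>
#include <vector>

int main() {
  using namespace dnnl;
  engine eng(engine::kind::cpu, 0);
  stream strm(eng);

  const memory::dim M = 64, K = 128, N = 32;
  auto a_md = memory::desc({M, K}, memory::data_type::bf16, memory::format_tag::ab);
  auto b_md = memory::desc({K, N}, memory::data_type::bf16, memory::format_tag::ab);
  auto c_md = memory::desc({M, N}, memory::data_type::f32, memory::format_tag::ab);

  // bf16 values are 16 bits wide; uint16_t buffers stand in for a BFloat16 type here.
  std::vector<uint16_t> a_data(M * K), b_data(K * N);
  std::vector<float> c_data(M * N);

  memory a_mem(a_md, eng, a_data.data());
  memory b_mem(b_md, eng, b_data.data());
  memory c_mem(c_md, eng, c_data.data());

  // Create and run the matmul primitive; with an f32 destination, accumulation stays in f32.
  auto pd = matmul::primitive_desc(eng, a_md, b_md, c_md);
  matmul(pd).execute(strm, {{DNNL_ARG_SRC, a_mem},
                            {DNNL_ARG_WEIGHTS, b_mem},
                            {DNNL_ARG_DST, c_mem}});
  strm.wait();
  return 0;
}
```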
