
Whisper does not convert using onnxruntime-directml #813

Open
DimQ1 opened this issue Dec 13, 2023 · 2 comments

DimQ1 commented Dec 13, 2023

I prepared a configuration file for converting Whisper using DirectML, but the process fails with an error.

To Reproduce

Expected behavior
It would be great to use Whisper with DirectML.

Olive config
Use the following configuration file to convert the model:
whisper_gpu_int8_dml.json
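
For reference, the config is kicked off the same way as the other whisper configs in the example (a minimal sketch, assuming the standard Olive whisper example layout; the filename is the attached config):

```python
# Sketch of running the attached config (assumes the standard Olive whisper example layout).
from olive.workflows import run as olive_run

# Runs the pass flow shown in the logs below (OnnxConversion -> OnnxDynamicQuantization ->
# InsertBeamSearch -> AppendPrePostProcessingOps) on the gpu-dml accelerator.
olive_run("whisper_gpu_int8_dml.json")
```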

Olive logs
[2023-12-13 09:32:13,697] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\0_OnnxConversion-386174a033bcd76f8941e56a22420503-0f2f01796d1fdfcd7c7058df3febec4e\output_model\decoder\model.onnx is inferred to be of type file.
[2023-12-13 09:32:13,816] [INFO] [quantization.py:354:_run_for_config] Preprocessing model for quantization
[2023-12-13 09:32:58,789] [INFO] [quantization.py:354:_run_for_config] Preprocessing model for quantization
[2023-12-13 09:33:24,424] [INFO] [engine.py:931:_run_pass] Running pass insert_beam_search:InsertBeamSearch
[2023-12-13 09:33:24,426] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\1_OnnxDynamicQuantization-0-81443df774677d62399dbb62abc7a493\output_model\encoder_decoder_init\model.onnx is inferred to be of type file.
[2023-12-13 09:33:24,428] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\1_OnnxDynamicQuantization-0-81443df774677d62399dbb62abc7a493\output_model\decoder\model.onnx is inferred to be of type file.
[2023-12-13 09:33:25,604] [WARNING] [insert_beam_search.py:171:chain_model] DecoderMaskedMultiHeadAttention could not be applied to whisper decoder subgraph
Removed 203 initializers with duplicated value
Removed 101 initializers with duplicated value
[2023-12-13 09:33:29,278] [DEBUG] [insert_beam_search.py:192:chain_model] Using IR version 8 for chained model
[2023-12-13 09:33:33,548] [INFO] [engine.py:931:_run_pass] Running pass prepost:AppendPrePostProcessingOps
[2023-12-13 09:33:33,550] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\2_InsertBeamSearch-1-51b19e895c1591ef53a44fb74c8eac16\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2023-12-13 09:33:33,551] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\2_InsertBeamSearch-1-51b19e895c1591ef53a44fb74c8eac16\output_model\model_with_beam_search.onnx is inferred to be of type file.
[W shape_type_inference.cpp:1978] Warning: The shape inference of ai.onnx.contrib::StftNorm type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
[2023-12-13 09:33:37,374] [DEBUG] [engine.py:1071:_evaluate_model] Evaluating model ...
[2023-12-13 09:33:37,374] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\3_AppendPrePostProcessingOps-2-4d9a9990e2391432dff23d272724f7c8\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2023-12-13 09:33:37,376] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\cache\models\3_AppendPrePostProcessingOps-2-4d9a9990e2391432dff23d272724f7c8\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2023-12-13 09:33:37,976] [DEBUG] [olive_evaluator.py:244:generate_metric_user_config_with_model_io] Model input shapes are not static. Cannot use inferred input shapes for creating dummy data. This will cause an error when creating dummy data for tuning.
[2023-12-13 09:33:37,979] [DEBUG] [resource_path.py:156:create_resource_path] Resource path D:\Learnig\AI\Olive\examples\whisper\data is inferred to be of type folder.
2023-12-13 09:33:50.4631945 [E:onnxruntime:, sequential_executor.cc:514 onnxruntime::ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: D:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2538)\onnxruntime_pybind11_state.pyd!00007FFB2E960DA9: (caller: 00007FFB2F07EDDF) Exception(3) tid(3580) 80070057 The parameter is incorrect.

2023-12-13 09:33:50.4741269 [E:onnxruntime:, sequential_executor.cc:514 onnxruntime::ExecuteKernel] Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: D:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2538)\onnxruntime_pybind11_state.pyd!00007FFB2E960DA9: (caller: 00007FFB2F07EDDF) Exception(3) tid(3580) 80070057 The parameter is incorrect.

[2023-12-13 09:33:50,485] [WARNING] [engine.py:438:run_accelerator] Failed to run Olive on gpu-dml: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: D:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2538)\onnxruntime_pybind11_state.pyd!00007FFB2E960DA9: (caller: 00007FFB2F07EDDF) Exception(3) tid(3580) 80070057 The parameter is incorrect.
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\olive\engine\engine.py", line 418, in run_accelerator
return self.run_no_search(
^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\engine\engine.py", line 489, in run_no_search
should_prune, signal, model_ids = self._run_passes(
^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\engine\engine.py", line 910, in _run_passes
signal = self._evaluate_model(model_config, model_id, data_root, evaluator_config, accelerator_spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\engine\engine.py", line 1097, in _evaluate_model
signal = self.target.evaluate_model(model_config, data_root, metrics, accelerator_spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\systems\local.py", line 49, in evaluate_model
return evaluator.evaluate(model, data_root, metrics, device=device, execution_providers=execution_providers)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\evaluator\olive_evaluator.py", line 215, in evaluate
metrics_res[metric.name] = self._evaluate_latency(
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\evaluator\olive_evaluator.py", line 132, in _evaluate_latency latencies = self._evaluate_raw_latency(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\evaluator\olive_evaluator.py", line 784, in _evaluate_raw_latency
return self._evaluate_onnx_latency(model, metric, dataloader, post_func, device, execution_providers)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\olive\evaluator\olive_evaluator.py", line 559, in _evaluate_onnx_latency
session.run(input_feed=input_dict, output_names=None)
File "C:\Program Files\Python311\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 220, in run
return self._sess.run(output_names, input_feed, run_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: D:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2538)\onnxruntime_pybind11_state.pyd!00007FFB2E960DA9: (caller: 00007FFB2F07EDDF) Exception(3) tid(3580) 80070057 The parameter is incorrect.

[2023-12-13 09:33:50,810] [INFO] [engine.py:359:run] Run history for gpu-dml:
[2023-12-13 09:33:50,823] [INFO] [engine.py:636:dump_run_history] run history:
+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------------------------+----------------+-----------+
| model_id | parent_model_id | from_pass | duration_sec | metrics |
+====================================================================================+====================================================================================+============================+================+===========+
| 386174a033bcd76f8941e56a22420503 | | | | |
+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------------------------+----------------+-----------+
| 0_OnnxConversion-386174a033bcd76f8941e56a22420503-0f2f01796d1fdfcd7c7058df3febec4e | 386174a033bcd76f8941e56a22420503 | OnnxConversion | 55.8453 | |
+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------------------------+----------------+-----------+
| 1_OnnxDynamicQuantization-0-81443df774677d62399dbb62abc7a493 | 0_OnnxConversion-386174a033bcd76f8941e56a22420503-0f2f01796d1fdfcd7c7058df3febec4e | OnnxDynamicQuantization | 70.7202 | |
+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------------------------+----------------+-----------+
| 2_InsertBeamSearch-1-51b19e895c1591ef53a44fb74c8eac16 | 1_OnnxDynamicQuantization-0-81443df774677d62399dbb62abc7a493 | InsertBeamSearch | 9.11838 | |
+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------------------------+----------------+-----------+
| 3_AppendPrePostProcessingOps-2-4d9a9990e2391432dff23d272724f7c8 | 2_InsertBeamSearch-1-51b19e895c1591ef53a44fb74c8eac16 | AppendPrePostProcessingOps | 3.81862 | |
+------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------------------------+----------------+-----------+
[2023-12-13 09:33:50,826] [INFO] [engine.py:374:run] No packaging config provided, skip packaging artifacts

Other information

  • OS: Windows
  • Olive version: 0.5.0 (latest from git source)
  • ONNXRuntime version: onnxruntime-directml 1.16.3
trajepl commented Dec 15, 2023

microsoft/onnxruntime#18805
It seems the beam search node for Whisper is not available in the DML EP.
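
One way to check that the failure is DML-specific (a minimal sketch, not a verified repro; the model path is illustrative, copied from the run history above) is to load the final chained model with the CPU EP and with the DML EP. The model contains custom ops such as ai.onnx.contrib::StftNorm, so onnxruntime-extensions has to be registered first:

```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

# Illustrative path, taken from the cache layout shown in the log above.
model_path = (
    r"cache\models\3_AppendPrePostProcessingOps-2-4d9a9990e2391432dff23d272724f7c8"
    r"\output_model\model_with_beam_search.onnx"
)

# Register the onnxruntime-extensions custom ops used by the pre/post processing graph.
so = ort.SessionOptions()
so.register_custom_ops_library(get_library_path())

# Session creation succeeds with either provider; the 80070057 error in the log only
# surfaces at run time inside the BeamSearch/Conv nodes when DmlExecutionProvider is used.
sess_dml = ort.InferenceSession(model_path, so, providers=["DmlExecutionProvider", "CPUExecutionProvider"])
sess_cpu = ort.InferenceSession(model_path, so, providers=["CPUExecutionProvider"])
```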

guotuofeng (Collaborator) commented:

@PatriceVignola, is there any plan to add beam search op support for DirectML?
