System information
Standalone code to reproduce the issue
Compile the model that fails:
```python
import tensorflow as tf
import keras

input0_shape = [2, 2, 1]
input1_shape = [1, 2, 2, 1]
output_shape = [1, 2, 2, 1]

tf_input0 = keras.Input(input0_shape[1:], batch_size=input0_shape[0])
tf_input1 = keras.Input(input1_shape[1:], batch_size=input1_shape[0])

class MyMatMul(keras.layers.Layer):
    def call(self, tf_input0, tf_input1):
        tf_output = tf_input0 * tf_input1
        return tf_output

tf_output = MyMatMul()(tf_input0, tf_input1)
model = keras.Model(inputs=[tf_input0, tf_input1], outputs=[tf_output])

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
```
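Despite the layer's name (`MyMatMul`), `call` performs an elementwise multiply with broadcasting. A minimal NumPy sketch of the shape semantics for the failing pair above (assuming NumPy broadcasting matches what the `*` operator lowers to in TFLite):

```python
import numpy as np

# Hypothetical stand-in for the MyMatMul layer: elementwise multiply
# with broadcasting, which is what `*` does in call().
a = np.ones((2, 2, 1), dtype=np.float32)     # input0_shape: [2, 2, 1]
b = np.ones((1, 2, 2, 1), dtype=np.float32)  # input1_shape: [1, 2, 2, 1]

# (2, 2, 1) is padded to (1, 2, 2, 1) and broadcast against b.
out = a * b
print(out.shape)  # → (1, 2, 2, 1), matching output_shape
```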
Execute on GPUv2 (for instance through benchmark_tool). It falls back to CPU and executes without a problem there.
Any other info / logs
Runtime log (executed through https://aihub.qualcomm.com/):
```
[30/Apr/2024:21:28:08 +05:30: profiler/info] Detected chipset 2807, made by 2000.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Malloc VM size before: 16488.0 kB, allocated: 14040.6 kB, slack: 2447.4 kB.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Current memory baseline range: 58204.6-60652.0 kB.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Loaded model. Minimum TF Lite version = .
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] No delegates specified; using compute unit=cpu_and_gpu.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Initialized TensorFlow Lite runtime.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714492688301/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714492678194/gpuv2.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Created TensorFlow Lite delegate for GPU.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Replacing 1 out of 1 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:21:28:08 +05:30: profiler/warning] [job_id: jqp4zo91g] [model.tflite] [tflite] File /data/user/0/ai.tetra.tungsten/cache/1714492688301/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714492678194/gpuv2/gpuv2_11394406828873984768.bin couldn't be opened for reading: No such file or directory
[30/Apr/2024:21:28:08 +05:30: profiler/warning] [job_id: jqp4zo91g] [model.tflite] [tflite] Failed to build program executable - Build program failure
BC-src-code:64:105: error: expected expression
{half4 second_value = read_imageh(src_tensor_1_image2d, smp_zero, (int2)((((X)) * shared_int4_0.w + (())), (0)));
                                                                                                     ^
1 diagnostic(s) generated.
```

Note the generated OpenCL source contains an empty parenthesized expression `(())` where an index term should be, which is what the compiler rejects.
There are many input shapes that will trigger this (NG), but some execute just fine (OK):
```
# OK
# (5, 352, 352, 3), (352, 352, 3) -> (5, 352, 352, 3)
# (2, 1, 1, 3), (1, 3) -> (2, 1, 1, 3)
# (2, 1, 1, 3), (2, 3) -> (2, 1, 2, 3)
# (2, 1, 1, 3), (3, 3) -> (2, 1, 3, 3)
# (2, 2, 2, 1), (2, 1) -> (2, 2, 2, 1)
# (2, 2, 1), (2, 1) -> (2, 2, 1)
# (3, 2), (2,) -> (3, 2)
# (3,), (1, 2, 2, 3) -> (1, 2, 2, 3)
# (2,), (1, 2) -> (1, 2)
# (2, 2), (1, 2) -> (2, 2)

# NG
# (1, 2, 2, 3), (3,) -> (1, 2, 2, 3)
# (1, 2, 2, 1), (2, 2, 1) -> (1, 2, 2, 1)
# (2, 2, 1), (1, 2, 2, 1) -> (1, 2, 2, 1)
# (2, 1), (1, 2, 1) -> (1, 2, 1)
# (1, 2, 2, 1), (2, 1) -> (1, 2, 2, 1)
# (1, 2, 1), (2, 1) -> (1, 2, 1)
# (1, 2), (2,) -> (1, 2)
```
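Every pair in the list above is valid under standard broadcasting rules, so the failure is not a shape-compatibility problem in the model itself. A quick sketch (assuming NumPy broadcasting matches TFLite's broadcast semantics) that checks each listed pair against its expected result shape:

```python
import numpy as np

# (lhs shape, rhs shape, expected broadcast result), taken from the list above.
cases = [
    # OK on GPUv2
    ((5, 352, 352, 3), (352, 352, 3), (5, 352, 352, 3)),
    ((2, 1, 1, 3), (1, 3), (2, 1, 1, 3)),
    ((2, 1, 1, 3), (2, 3), (2, 1, 2, 3)),
    ((2, 1, 1, 3), (3, 3), (2, 1, 3, 3)),
    ((2, 2, 2, 1), (2, 1), (2, 2, 2, 1)),
    ((2, 2, 1), (2, 1), (2, 2, 1)),
    ((3, 2), (2,), (3, 2)),
    ((3,), (1, 2, 2, 3), (1, 2, 2, 3)),
    ((2,), (1, 2), (1, 2)),
    ((2, 2), (1, 2), (2, 2)),
    # NG on GPUv2
    ((1, 2, 2, 3), (3,), (1, 2, 2, 3)),
    ((1, 2, 2, 1), (2, 2, 1), (1, 2, 2, 1)),
    ((2, 2, 1), (1, 2, 2, 1), (1, 2, 2, 1)),
    ((2, 1), (1, 2, 1), (1, 2, 1)),
    ((1, 2, 2, 1), (2, 1), (1, 2, 2, 1)),
    ((1, 2, 1), (2, 1), (1, 2, 1)),
    ((1, 2), (2,), (1, 2)),
]

for lhs, rhs, expected in cases:
    # np.broadcast_shapes raises if the pair were incompatible.
    assert np.broadcast_shapes(lhs, rhs) == expected, (lhs, rhs)
print("all pairs broadcast to the expected shapes")
```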
Hi @gustavla,
Have you tried replicating this particular issue on other Android devices, or does it happen only on the Galaxy S23?
@sawantkumar I tried it on a Google Pixel 7 as well, and it also falls back to CPU there.