GPUv2 syntax error in shader generation in broadcast with different batch sizes #66712

gustavla · 2024-04-30T16:15:55Z

System information

Samsung Galaxy S23 / Android 13 / Snapdragon® 8 Gen 2 | SM8550
TFLite 2.16.1 (also repros on many older versions)

Standalone code to reproduce the issue

Compile the model that fails:

import tensorflow as tf
import keras

input0_shape = [2, 2, 1]
input1_shape = [1, 2, 2, 1]
output_shape = [1, 2, 2, 1]

tf_input0 = keras.Input(input0_shape[1:], batch_size=input0_shape[0])
tf_input1 = keras.Input(input1_shape[1:], batch_size=input1_shape[0])


class MyMatMul(keras.layers.Layer):
    def call(self, tf_input0, tf_input1):
        # Element-wise multiply with broadcasting (not a matrix multiply,
        # despite the class name).
        tf_output = tf_input0 * tf_input1
        return tf_output

tf_output = MyMatMul()(tf_input0, tf_input1)

model = keras.Model(inputs=[tf_input0, tf_input1], outputs=[tf_output])

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

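Execute the model on GPUv2 (for instance through benchmark_tool). The GPU delegate fails to build the generated shader (see the log under "Any other info / logs"), so execution falls back to CPU, where the model runs without a problem.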

Any other info / logs

Runtime log (executed through https://aihub.qualcomm.com/):

[30/Apr/2024:21:28:08 +05:30: profiler/info] Detected chipset 2807, made by 2000.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Malloc VM size before: 16488.0 kB, allocated: 14040.6 kB, slack: 2447.4 kB.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Current memory baseline range: 58204.6-60652.0 kB.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Loaded model. Minimum TF Lite version = .
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] No delegates specified; using compute unit=cpu_and_gpu.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Initialized TensorFlow Lite runtime.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714492688301/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714492678194/gpuv2.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Created TensorFlow Lite delegate for GPU.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Replacing 1 out of 1 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:21:28:08 +05:30: profiler/warning] [job_id: jqp4zo91g] [model.tflite] [tflite] File /data/user/0/ai.tetra.tungsten/cache/1714492688301/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714492678194/gpuv2/gpuv2_11394406828873984768.bin couldn't be opened for reading: No such file or directory
[30/Apr/2024:21:28:08 +05:30: profiler/warning] [job_id: jqp4zo91g] [model.tflite] [tflite] Failed to build program executable - Build program failureBC-src-code:64:105: error: expected expression
  {half4 second_value = read_imageh(src_tensor_1_image2d, smp_zero, (int2)((((X)) * shared_int4_0.w + (())), (0)));
                                                                                                        ^
1 diagnostic(s) generated.
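
The diagnostic points at an empty pair of parentheses, `(())`, inside the `(int2)` coordinate, so it looks like the shader generator emitted an empty sub-expression where a coordinate term was expected. Below is a minimal sketch for spotting this kind of hole in a dumped kernel source. It assumes you already have the source as a string. The helper name `find_empty_parens` is made up for illustration and is not part of TFLite.

```python
import re


def find_empty_parens(source):
    """Return 1-based (line, column) pairs for every empty '()' in source.

    Parentheses directly preceded by an identifier character are skipped,
    so zero-argument calls such as foo() are not reported.
    """
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in re.finditer(r"(?<![A-Za-z0-9_])\(\s*\)", line):
            hits.append((line_no, match.start() + 1))
    return hits


# The offending line, copied from the build log above.
shader_line = (
    "  {half4 second_value = read_imageh(src_tensor_1_image2d, smp_zero, "
    "(int2)((((X)) * shared_int4_0.w + (())), (0)));"
)
hits = find_empty_parens(shader_line)
print(hits)
```

On this line the helper should report a single empty `()`, the one nested in `(())`, which is where the compiler expected an expression.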

There are many input shapes that will trigger this (NG), but some execute just fine (OK):

# OK                                                  
# (5, 352, 352, 3), (352, 352, 3) -> (5, 352, 352, 3) 
# (2, 1, 1, 3), (1, 3) -> (2, 1, 1, 3)                
# (2, 1, 1, 3), (2, 3) -> (2, 1, 2, 3)                
# (2, 1, 1, 3), (3, 3) -> (2, 1, 3, 3)                
# (2, 2, 2, 1), (2, 1) -> (2, 2, 2, 1)                
# (2, 2, 1), (2, 1) -> (2, 2, 1)                      
# (3, 2), (2,) -> (3, 2)                              
# (3,), (1, 2, 2, 3) -> (1, 2, 2, 3)
# (2,), (1, 2) -> (1, 2)                              
# (2, 2), (1, 2) -> (2, 2)                            
                                                      
# NG                                                  
# (1, 2, 2, 3), (3,) -> (1, 2, 2, 3)                  
# (1, 2, 2, 1), (2, 2, 1) -> (1, 2, 2, 1)             
# (2, 2, 1), (1, 2, 2, 1) -> (1, 2, 2, 1)             
# (2, 1), (1, 2, 1) -> (1, 2, 1)                      
# (1, 2, 2, 1), (2, 1) -> (1, 2, 2, 1)                
# (1, 2, 1), (2, 1) -> (1, 2, 1)                      
# (1, 2), (2,) -> (1, 2)
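
All of the shape pairs above are valid under standard NumPy-style broadcasting, which is consistent with the model running fine on CPU. A small pure-Python sketch to double-check the listed output shapes follows. `broadcast_shape` is an illustrative helper, not a TFLite API, and it does not predict which cases end up OK or NG on the GPU delegate.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """NumPy-style broadcast of two shapes, aligned from the trailing dimension."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        # Dimensions are compatible if they are equal or one of them is 1.
        if x != y and 1 not in (x, y):
            raise ValueError(f"incompatible shapes {a} and {b}")
        out.append(max(x, y))
    return tuple(reversed(out))


# A few rows copied from the lists above: (input0, input1, listed output).
cases = [
    ((2, 1, 1, 3), (2, 3), (2, 1, 2, 3)),     # OK
    ((2, 2), (1, 2), (2, 2)),                 # OK
    ((2, 2, 1), (1, 2, 2, 1), (1, 2, 2, 1)),  # NG (the repro model)
    ((1, 2), (2,), (1, 2)),                   # NG
]
for in0, in1, listed in cases:
    assert broadcast_shape(in0, in1) == listed
```

Since both the OK and the NG rows pass this check, the difference between them comes from how the GPU delegate handles these shapes rather than from the broadcast itself being invalid.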


sawantkumar · 2024-05-07T04:09:20Z

Hi @gustavla ,

Have you tried reproducing this issue on other Android devices, or does it happen only on the Galaxy S23?

gustavla · 2024-05-08T00:54:26Z

@sawantkumar I also tried it on a Google Pixel 7 and it also falls back to CPU.

gustavla added the comp:lite TF Lite related issues label Apr 30, 2024

google-ml-butler bot assigned tilakrayal Apr 30, 2024

tilakrayal added TF 2.16 type:bug Bug comp:gpu GPU related issues labels May 2, 2024

tilakrayal assigned sawantkumar and unassigned tilakrayal May 2, 2024
