GPUv2 syntax error in shader generation in broadcast with different batch sizes #66712

Open
gustavla opened this issue Apr 30, 2024 · 2 comments
Labels
comp:gpu GPU related issues comp:lite TF Lite related issues TF 2.16 type:bug Bug

@gustavla
Contributor

System information

  • Samsung Galaxy S23 / Android 13 / Snapdragon® 8 Gen 2 | SM8550
  • TFLite 2.16.1 (also repros on many older versions)

Standalone code to reproduce the issue

Build the failing model and convert it to TFLite:

import tensorflow as tf
import keras

input0_shape = [2, 2, 1]
input1_shape = [1, 2, 2, 1]
output_shape = [1, 2, 2, 1]

tf_input0 = keras.Input(input0_shape[1:], batch_size=input0_shape[0])
tf_input1 = keras.Input(input1_shape[1:], batch_size=input1_shape[0])


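class MyMatMul(keras.layers.Layer):
    # Despite the name, this is an elementwise multiply; the differing
    # input ranks force a broadcast.
    def call(self, tf_input0, tf_input1):
        tf_output = tf_input0 * tf_input1
        return tf_output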

tf_output = MyMatMul()(tf_input0, tf_input1)

model = keras.Model(inputs=[tf_input0, tf_input1], outputs=[tf_output])

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

Execute the model on GPUv2 (for instance through benchmark_tool). The GPU delegate fails to build the shader program, so execution falls back to CPU, where the model runs without a problem.

Any other info / logs

Runtime log (executed through https://aihub.qualcomm.com/):

[30/Apr/2024:21:28:08 +05:30: profiler/info] Detected chipset 2807, made by 2000.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Malloc VM size before: 16488.0 kB, allocated: 14040.6 kB, slack: 2447.4 kB.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Current memory baseline range: 58204.6-60652.0 kB.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] Loaded model. Minimum TF Lite version = .
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] No delegates specified; using compute unit=cpu_and_gpu.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Initialized TensorFlow Lite runtime.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:21:28:08 +05:30: profiler/debug] [job_id: jqp4zo91g] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714492688301/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714492678194/gpuv2.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Created TensorFlow Lite delegate for GPU.
[30/Apr/2024:21:28:08 +05:30: profiler/info] [job_id: jqp4zo91g] [model.tflite] [tflite] Replacing 1 out of 1 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:21:28:08 +05:30: profiler/warning] [job_id: jqp4zo91g] [model.tflite] [tflite] File /data/user/0/ai.tetra.tungsten/cache/1714492688301/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714492678194/gpuv2/gpuv2_11394406828873984768.bin couldn't be opened for reading: No such file or directory
[30/Apr/2024:21:28:08 +05:30: profiler/warning] [job_id: jqp4zo91g] [model.tflite] [tflite] Failed to build program executable - Build program failureBC-src-code:64:105: error: expected expression
  {half4 second_value = read_imageh(src_tensor_1_image2d, smp_zero, (int2)((((X)) * shared_int4_0.w + (())), (0)));
                                                                                                        ^
1 diagnostic(s) generated.
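
To make the diagnostic easier to read, here is a minimal sketch that maps the reported `64:105` position onto the shader line from the log. It assumes the column is 1-based and that the log preserves the line's leading whitespace; the helper name `locate` is hypothetical and not part of TFLite. Under those assumptions the column lands inside the empty `(())` group, which suggests an offset term was left blank during shader generation, although the log alone does not confirm the cause.

```python
import re

# Diagnostic header and offending shader line, copied from the log above.
DIAGNOSTIC = "BC-src-code:64:105: error: expected expression"
SHADER_LINE = (
    "  {half4 second_value = read_imageh(src_tensor_1_image2d, smp_zero, "
    "(int2)((((X)) * shared_int4_0.w + (())), (0)));"
)


def locate(diagnostic, source_line, context=4):
    """Return (line_no, column, snippet) for a 'name:line:col: error' message."""
    match = re.search(r":(\d+):(\d+): error", diagnostic)
    if match is None:
        raise ValueError("no line:column position found")
    line_no, column = int(match.group(1)), int(match.group(2))
    # Assumption: compiler columns are 1-based; Python indices are 0-based.
    index = column - 1
    snippet = source_line[max(0, index - context):index + context]
    return line_no, column, snippet


line_no, column, snippet = locate(DIAGNOSTIC, SHADER_LINE)
print(f"line {line_no}, column {column}: ...{snippet}...")
```

The snippet around the reported column shows the `+ (())` fragment, where the expression inside the inner parentheses is missing.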

There are many input shapes that will trigger this (NG), but some execute just fine (OK):

# OK                                                  
# (5, 352, 352, 3), (352, 352, 3) -> (5, 352, 352, 3) 
# (2, 1, 1, 3), (1, 3) -> (2, 1, 1, 3)                
# (2, 1, 1, 3), (2, 3) -> (2, 1, 2, 3)                
# (2, 1, 1, 3), (3, 3) -> (2, 1, 3, 3)                
# (2, 2, 2, 1), (2, 1) -> (2, 2, 2, 1)                
# (2, 2, 1), (2, 1) -> (2, 2, 1)                      
# (3, 2), (2,) -> (3, 2)                              
# (3,), (1, 2, 2, 3) -> (1, 2, 2, 3)
# (2,), (1, 2) -> (1, 2)                              
# (2, 2), (1, 2) -> (2, 2)                            
                                                      
# NG                                                  
# (1, 2, 2, 3), (3,) -> (1, 2, 2, 3)                  
# (1, 2, 2, 1), (2, 2, 1) -> (1, 2, 2, 1)             
# (2, 2, 1), (1, 2, 2, 1) -> (1, 2, 2, 1)             
# (2, 1), (1, 2, 1) -> (1, 2, 1)                      
# (1, 2, 2, 1), (2, 1) -> (1, 2, 2, 1)                
# (1, 2, 1), (2, 1) -> (1, 2, 1)                      
# (1, 2), (2,) -> (1, 2)     
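
As a sanity check on the lists above, the following sketch recomputes the output shape for a selection of the reported pairs using NumPy-style broadcasting rules, which is what the `*` operator in the repro follows. The helper `broadcast_shape` is a hypothetical name written for this illustration. It only confirms that both the OK and NG cases are valid broadcasts; it does not explain why the GPU shader generation fails for some of them.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the NumPy-style broadcast shape of a and b, or raise ValueError."""
    result = []
    # Align from the trailing dimension; missing leading dimensions act as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        result.append(y if x == 1 else x)
    return tuple(reversed(result))


# (input0, input1, reported output), copied from the lists above.
OK_CASES = [
    ((5, 352, 352, 3), (352, 352, 3), (5, 352, 352, 3)),
    ((2, 1, 1, 3), (3, 3), (2, 1, 3, 3)),
    ((2,), (1, 2), (1, 2)),
    ((2, 2), (1, 2), (2, 2)),
]
NG_CASES = [
    ((1, 2, 2, 3), (3,), (1, 2, 2, 3)),
    ((2, 2, 1), (1, 2, 2, 1), (1, 2, 2, 1)),
    ((2, 1), (1, 2, 1), (1, 2, 1)),
    ((1, 2), (2,), (1, 2)),
]

for label, cases in (("OK", OK_CASES), ("NG", NG_CASES)):
    for in0, in1, reported in cases:
        # Every listed output shape should match the broadcast rule.
        assert broadcast_shape(in0, in1) == reported
        print(f"{label}: rank {len(in0)} * rank {len(in1)} -> {reported}")
```

Printing the input ranks next to each case makes it easier to compare the failing and passing combinations side by side.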
@gustavla gustavla added the comp:lite TF Lite related issues label Apr 30, 2024
@tilakrayal tilakrayal added TF 2.16 type:bug Bug comp:gpu GPU related issues labels May 2, 2024
@tilakrayal tilakrayal assigned sawantkumar and unassigned tilakrayal May 2, 2024
@sawantkumar

Hi @gustavla,

Have you tried reproducing this issue on other Android devices, or does it happen only on the Galaxy S23?

@gustavla
Contributor Author

gustavla commented May 8, 2024

@sawantkumar I also tried it on a Google Pixel 7 and it also falls back to CPU.
