Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CUDA] [CODEGEN] Cuda back end implementation of TopK results in wrong output. #17275

Open
LLITCHEV opened this issue May 2, 2024 · 0 comments
Labels
bug 🐞 Something isn't working

Comments

@LLITCHEV
Copy link
Contributor

LLITCHEV commented May 2, 2024

What happened?

Running the following tests, results in duplicated elements in the output.

  %input_values = util.unfoldable_constant dense<[[
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 123.45, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 8.9, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.3, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 4.28, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0,
    1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 7.8, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]]> : tensor<1x256xf32>
  %out_values_empty = tensor.empty() : tensor<1x40xf32>
  %out_indices_empty = tensor.empty() : tensor<1x40xi32>
  %neg_inf = arith.constant 0xFF800000 : f32
  %c0 = arith.constant 0 : i32
  %out_values = linalg.fill ins(%neg_inf : f32) outs(%out_values_empty : tensor<1x40xf32>) -> tensor<1x40xf32>
  %out_indices = linalg.fill ins(%c0 : i32) outs(%out_indices_empty : tensor<1x40xi32>) -> tensor<1x40xi32>
  %0:2 = iree_linalg_ext.topk
        dimension(1)
        ins(%input_values: tensor<1x256xf32>)
        outs(%out_values, %out_indices : tensor<1x40xf32>, tensor<1x40xi32>) {
        ^bb0(%arg0 : f32, %arg1 : f32):
         %0 = arith.cmpf ogt, %arg0, %arg1 : f32
         iree_linalg_ext.yield %0 : i1
        } -> tensor<1x40xf32>, tensor<1x40xi32>
 check.expect_almost_eq_const(
     %0#0,
     dense<[[123.45, 8.9, 7.8, 4.28, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0]]> : tensor<1x40xf32>
 ) : tensor<1x40xf32>

  check.expect_eq_const(
      %0#1,
      dense<[[135, 169, 247, 233, 3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63, 67, 71, 75, 79, 83, 87, 91, 95, 99, 103, 107, 111, 115, 119, 123, 127, 131, 139, 143, 147]]> : tensor<1x40xi32>
  ) : tensor<1x40xi32>
 return
}```

The output is:
```[ RUN      ] module.vector_call_topk_1x256
/work/runtime/src/iree/modules/check/module.cc:372: Failure
Failed
Expected equality of these values. Contents does not match.
  lhs:
    1x40xi32=[135 135 169 169 247 247 233 233 3 3 7 7 11 11 15 15 19 19 23 23 27 27 31 31 35 35 39 39 43 43 47 47 51 51 55 55 59 59 63 63]
  rhs:
    1x40xi32=[135 169 247 233 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 91 95 99 103 107 111 115 119 123 127 131 139 143 147]

/work/runtime/src/iree/modules/check/module.cc:456: Failure
Failed
Expected near equality of these values. Contents does not match.
  lhs:
    1x40xf32=[123.45 123.45 8.9 8.9 7.8 7.8 4.28 4.28 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4]
  rhs:
    1x40xf32=[123.45 8.9 7.8 4.28 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4]

[  FAILED  ] module.vector_call_topk_1x256, where GetParam() = 2 (599 ms)```


### Steps to reproduce your issue

I get the error when running newly added tests in CI on (WIP) PR https://github.com/iree-org/iree/pull/17045

### What component(s) does this issue relate to?

_No response_

### Version information

Latest source.

### Additional context

_No response_
@LLITCHEV LLITCHEV added the bug 🐞 Something isn't working label May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant