[BUG] cast(9.95 as decimal(3,1)), actual: 9.9, expected: 10.0 #10809

Open
Tracked by #10771
gerashegalov opened this issue May 13, 2024 · 7 comments · May be fixed by #10917

Labels
bug Something isn't working

Comments

@gerashegalov
Collaborator

Repro

 ~/dist/spark-3.3.0-bin-hadoop3/bin/spark-shell  \
   --conf spark.plugins=com.nvidia.spark.SQLPlugin \
   --conf spark.rapids.sql.test.enabled=true \
   --conf spark.rapids.sql.explain=ALL \
   --jars dist/target/rapids-4-spark_2.12-24.06.0-SNAPSHOT-cuda11.jar
scala> val df = Seq(9.95).toDF.coalesce(1)
df: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [value: double]

scala> spark.conf.set("spark.rapids.sql.enabled", true)

scala> df.selectExpr("cast(value as decimal(3,1))").collect()
24/05/13 11:34:42 WARN GpuOverrides: 
*Exec <ProjectExec> will run on GPU
  *Expression <Alias> cast(value#1 as decimal(3,1)) AS value#5 will run on GPU
    *Expression <Cast> cast(value#1 as decimal(3,1)) will run on GPU
  *Exec <CoalesceExec> will run on GPU
    ! <LocalTableScanExec> cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.LocalTableScanExec
      @Expression <AttributeReference> value#1 could run on GPU

res2: Array[org.apache.spark.sql.Row] = Array([9.9])

scala> spark.conf.set("spark.rapids.sql.enabled", false)

scala> df.selectExpr("cast(value as decimal(3,1))").collect()
res4: Array[org.apache.spark.sql.Row] = Array([10.0])

Related to #9682

@gerashegalov gerashegalov added the bug (Something isn't working) and ? - Needs Triage (Need team to review and classify) labels May 13, 2024
@mattahrens
Collaborator

mattahrens commented May 21, 2024

Scope: turn the float-to-decimal cast feature flag off by default and update the documentation with this example. Ref: spark.rapids.sql.castFloatToDecimal.enabled. Check whether the supported operators in tools need to be updated.

@mattahrens mattahrens removed the ? - Needs Triage (Need team to review and classify) label May 21, 2024
@ttnghia
Collaborator

ttnghia commented May 24, 2024

Filed a cudf issue: rapidsai/cudf#15862

@gerashegalov
Collaborator Author

gerashegalov commented May 24, 2024

Also take a look at the discussion around #9682 (comment) and above. 9.95 is not exactly representable as a double:

$ jshell 
|  Welcome to JShell -- Version 21.0.2
|  For an introduction type: /help intro

jshell> new BigDecimal(9.95)
$8 ==> 9.949999999999999289457264239899814128875732421875

jshell> new BigDecimal(9.949999999999999289457264239899814128875732421875)
$9 ==> 9.949999999999999289457264239899814128875732421875

jshell> new BigDecimal(9.95).setScale(1, BigDecimal.ROUND_HALF_UP)
$11 ==> 9.9

jshell> new BigDecimal(String.valueOf(9.95)).setScale(1, BigDecimal.ROUND_HALF_UP)
$1 ==> 10.0

jshell> new BigDecimal("9.95").setScale(1, BigDecimal.ROUND_HALF_UP)
$12 ==> 10.0
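
The jshell session above can be condensed into a small standalone Java sketch. It is only an illustration of the two rounding paths, not code from Spark or the plugin. The class and method names are made up for this example, and `RoundingMode.HALF_UP` is used in place of the deprecated `BigDecimal.ROUND_HALF_UP` constant.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingPaths {
    // Rounds the exact binary value stored in the double
    // (for 9.95 that is 9.94999999999999928...).
    static BigDecimal roundExact(double v, int scale) {
        return new BigDecimal(v).setScale(scale, RoundingMode.HALF_UP);
    }

    // Rounds the shortest decimal string that round-trips to the same double
    // (for 9.95 that is "9.95").
    static BigDecimal roundViaString(double v, int scale) {
        return new BigDecimal(Double.toString(v)).setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        double v = 9.95;
        System.out.println(roundExact(v, 1));      // 9.9
        System.out.println(roundViaString(v, 1));  // 10.0
    }
}
```

Both results are defensible roundings. They differ only in whether the input is taken to be the exact binary value or its shortest decimal rendering.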

@ttnghia
Collaborator

ttnghia commented May 24, 2024

Okay, after reading through issue #9682, I realize that this issue is another instance of it.

@thirtiseven
Collaborator

thirtiseven commented May 25, 2024

I tried the float => string => decimal path (it is very easy to implement in the plugin). It passes the Spark UT, but there are still some differences caused by the known limits of Ryu float-to-string conversion and a few edge cases in string-to-decimal conversion (#10890). I will post a PR and share some results for review next week, but I'm not sure whether the diffs are acceptable or whether the original approach can match Spark more closely.
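
As a rough CPU-side illustration of the float => string => decimal idea, here is a minimal Java sketch. It is not the plugin's implementation: `CastViaString.cast` is a hypothetical helper, it relies on Java's `Double.toString` rather than a GPU Ryu kernel, and returning `null` on overflow is an assumption modeled on a non-ANSI cast producing NULL.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class CastViaString {
    // Hypothetical helper: casts a double to decimal(precision, scale) by going
    // through its shortest round-trip string. Returns null for NaN/Infinity or
    // when the rounded value needs more digits than `precision` allows
    // (assumed NULL-on-overflow behavior).
    static BigDecimal cast(double v, int precision, int scale) {
        if (Double.isNaN(v) || Double.isInfinite(v)) {
            return null;
        }
        BigDecimal d = new BigDecimal(Double.toString(v))
                .setScale(scale, RoundingMode.HALF_UP);
        return d.precision() > precision ? null : d;
    }

    public static void main(String[] args) {
        System.out.println(cast(9.95, 3, 1));   // 10.0
        System.out.println(cast(99.95, 3, 1));  // null (100.0 needs precision 4)
    }
}
```

Any divergence between the float-to-string step used here and the one used on the GPU would show up as differing results, which is the kind of diff mentioned above.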

@ttnghia
Collaborator

ttnghia commented May 25, 2024

That sounds good. I've also found a way to implement this in C++ that is very efficient, but I'm not sure if it will pass the integration tests. I'll verify that and post a PR too.

If you have a chance, please list the related tests that I can run to verify.

@ttnghia
Collaborator

ttnghia commented May 28, 2024

@thirtiseven Please test #10917 to see if any tests fail for you. In my local environment, all the unit tests and integration tests passed, but I'm not sure if I missed anything.
