Implement Spark-compatible cast between decimals with different precision and scale #375

Open · Tracked by #286
andygrove opened this issue May 3, 2024 · 4 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@andygrove (Member)

What is the problem the feature request solves?

Comet is not consistent with Spark when casting between decimals. Here is a test to demonstrate this.

  test("cast between decimals with different precision and scale") {
    val rowData = Seq(
      Row(BigDecimal("12345.6789")),
      Row(BigDecimal("9876.5432")),
      Row(BigDecimal("123.4567"))
    )
    val df = spark.createDataFrame(
      spark.sparkContext.parallelize(rowData),
      StructType(Seq(StructField("a", DataTypes.createDecimalType(10,4))))
    )
    castTest(df, DataTypes.createDecimalType(6,2))
  }

Spark Result

+----------+-------+
|         a|      a|
+----------+-------+
|  123.4567| 123.46|
| 9876.5432|9876.54|
|12345.6789|   null|
+----------+-------+

Comet Result

java.lang.ArithmeticException: Cannot convert 12345.68 (bytes: [B@4f834a43, integer: 1234568) to decimal with precision: 6 and scale: 2
	at org.apache.comet.vector.CometVector.getDecimal(CometVector.java:86)
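
For reference, the Spark result above follows its non-ANSI cast semantics: the value is rounded (half-up) to the target scale, and the cast produces null when the rounded value no longer fits in the target precision. A minimal standalone sketch of that rule in Scala, using only java.math.BigDecimal (castDecimal is a hypothetical helper for illustration, not Spark or Comet code):

  import java.math.{BigDecimal => JBigDecimal, RoundingMode}

  // Mimic Spark's non-ANSI decimal cast: round to the target scale,
  // then return None (null in Spark) if the rounded value exceeds the
  // target precision.
  def castDecimal(v: JBigDecimal, precision: Int, scale: Int): Option[JBigDecimal] = {
    val rounded = v.setScale(scale, RoundingMode.HALF_UP)
    if (rounded.precision() > precision) None else Some(rounded)
  }

  // castDecimal(new JBigDecimal("9876.5432"), 6, 2)  => Some(9876.54)
  // castDecimal(new JBigDecimal("12345.6789"), 6, 2) => None (null in Spark)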

Describe the potential solution

No response

Additional context

No response

@viirya (Member) commented May 3, 2024

This looks simple to fix. We currently throw an exception if we cannot convert the bytes to a decimal, but it looks like Spark returns null instead.
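
For illustration only, a minimal sketch of what "return null instead of throwing" could look like on the JVM side, using Spark's public Decimal API (getDecimalOrNull is a hypothetical helper, not the actual CometVector code):

  import java.math.{BigDecimal => JBigDecimal}
  import org.apache.spark.sql.types.Decimal

  // Hypothetical: convert the decoded value, but return null instead of
  // throwing when it cannot be represented with the requested precision/scale.
  def getDecimalOrNull(value: JBigDecimal, precision: Int, scale: Int): Decimal = {
    val d = Decimal(value)
    // Decimal.changePrecision returns false when the value overflows the
    // requested precision/scale; Spark maps that case to null.
    if (d.changePrecision(precision, scale)) d else null
  }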

@caicancai (Member)

@andygrove Do you mind letting me try it? In my spare time I have been fixing Calcite's compatibility issues with Spark SQL type conversion.

@viirya (Member) commented May 6, 2024

@caicancai Thanks. Please go ahead and create a PR for any open ticket that no one has claimed yet.

@caicancai (Member)

I am working on it.
