{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":17165658,"defaultBranch":"master","name":"spark","ownerLogin":"apache","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2014-02-25T08:00:08.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/47359?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1715902415.0","currentOid":""},"activityList":{"items":[{"before":"889820c1ff392983c52b55d80bd8d80be22785ab","after":"74a1a76e811a0b6953468dc59b0f258fbd4d7691","ref":"refs/heads/master","pushedAt":"2024-05-17T04:03:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[MINOR][PYTHON][TESTS] Call `test_apply_schema_to_dict_and_rows` in `test_apply_schema_to_row`\n\n### What changes were proposed in this pull request?\n\nThis PR fixes the test `test_apply_schema_to_row` to call `test_apply_schema_to_row` instead of `test_apply_schema_to_dict_and_rows`. It was a mistake.\n\n### Why are the changes needed?\n\nTo avoid a mistake when it's enabled in the future.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, test-only\n\n### How was this patch tested?\n\nCI in this PR.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46631 from HyukjinKwon/minor-test-rename.\n\nAuthored-by: Hyukjin Kwon \nSigned-off-by: Hyukjin Kwon ","shortMessageHtmlLink":"[MINOR][PYTHON][TESTS] Call test_apply_schema_to_dict_and_rows in `…"}},{"before":"714fc8cd872d6f583a6066e9ddb4a51caa51caf3","after":"889820c1ff392983c52b55d80bd8d80be22785ab","ref":"refs/heads/master","pushedAt":"2024-05-17T03:57:44.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-41625][PYTHON][CONNECT][TESTS][FOLLOW-UP] Enable `DataFrameObservationParityTests.test_observe_str`\n\n### What changes were proposed in this pull request?\n\nThis PR proposes to enable `DataFrameObservationParityTests.test_observe_str`.\n\n### Why are the changes needed?\n\nTo make sure on the test coverage\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, test-only.\n\n### How was this patch tested?\n\nCI in this PR.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46630 from HyukjinKwon/SPARK-41625-followup.\n\nAuthored-by: Hyukjin Kwon \nSigned-off-by: Ruifeng Zheng ","shortMessageHtmlLink":"[SPARK-41625][PYTHON][CONNECT][TESTS][FOLLOW-UP] Enable `DataFrameObs…"}},{"before":"403619a3974c595ba80d6c9dbd23b8c2f1e2233e","after":"714fc8cd872d6f583a6066e9ddb4a51caa51caf3","ref":"refs/heads/master","pushedAt":"2024-05-17T03:09:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48316][PS][CONNECT][TESTS] Fix comments for SparkFrameMethodsParityTests.test_coalesce and test_repartition\n\n### What changes were proposed in this pull request?\n\nThis PR proposes to enable `SparkFrameMethodsParityTests.test_coalesce` and `SparkFrameMethodsParityTests.test_repartition` in Spark Connect by avoiding RDD usage in the test.\n\n### Why are the changes needed?\n\nTo make sure on the test coverage\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, test-only.\n\n### How was this patch tested?\n\nCI in this PR.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46629 from HyukjinKwon/SPARK-48316.\n\nAuthored-by: Hyukjin Kwon \nSigned-off-by: Hyukjin Kwon ","shortMessageHtmlLink":"[SPARK-48316][PS][CONNECT][TESTS] Fix comments for SparkFrameMethodsP…"}},{"before":"b0e535217bf891f2320f2419d213e1c700e15b41","after":"403619a3974c595ba80d6c9dbd23b8c2f1e2233e","ref":"refs/heads/master","pushedAt":"2024-05-17T02:06:04.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48306][SQL] Improve UDT in error message\n\n### What changes were proposed in this pull request?\n\nThis PR improves UDT in error message when error occurs. Currently, an UDT is displayed as it's inner SQL type, which is ambiguous for debugging\n\n### Why are the changes needed?\n\nimprovement error message\n### Does this PR introduce _any_ user-facing change?\n\nNO\n\n### How was this patch tested?\n\nnew tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46616 from yaooqinn/SPARK-48306.\n\nAuthored-by: Kent Yao \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48306][SQL] Improve UDT in error message"}},{"before":"05e1706e5aa66a592e61b03263683a2dbbc64afe","after":"b0e535217bf891f2320f2419d213e1c700e15b41","ref":"refs/heads/master","pushedAt":"2024-05-17T01:56:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48301][SQL][FOLLOWUP] Update the error message\n\n### What changes were proposed in this pull request?\nUpdate the error message\n\n### Why are the changes needed?\nwe don't support `CREATE PROCEDURE` in spark, to address https://github.com/apache/spark/pull/46608#discussion_r1604205064\n\n### Does this PR introduce _any_ user-facing change?\nno\n\n### How was this patch tested?\nci\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46628 from zhengruifeng/nit_error.\n\nAuthored-by: Ruifeng Zheng \nSigned-off-by: Ruifeng Zheng ","shortMessageHtmlLink":"[SPARK-48301][SQL][FOLLOWUP] Update the error message"}},{"before":"153053fe6c3d62d8fa607cdcc5c4813a60a33aa1","after":"05e1706e5aa66a592e61b03263683a2dbbc64afe","ref":"refs/heads/master","pushedAt":"2024-05-17T01:28:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48310][PYTHON][CONNECT] Cached properties must return copies\n\n### What changes were proposed in this pull request?\nWhen a consumer modifies the result values of a cached property it will modify the value of the cached property.\n\nBefore:\n```python\ndf_columns = df.columns\nfor col in ['id', 'name']:\n df_columns.remove(col)\nassert len(df_columns) == df.columns\n```\n\nBut this is wrong and this patch fixes it to\n\n```python\ndf_columns = df.columns\nfor col in ['id', 'name']:\n df_columns.remove(col)\nassert len(df_columns) != df.columns\n```\n\n### Why are the changes needed?\nCorrectness of the API\n\n### Does this PR introduce _any_ user-facing change?\nNo, this makes the code consistent with Spark classic.\n\n### How was this patch tested?\nUT\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46621 from grundprinzip/grundprinzip/SPARK-48310.\n\nAuthored-by: Martin Grund \nSigned-off-by: Hyukjin Kwon ","shortMessageHtmlLink":"[SPARK-48310][PYTHON][CONNECT] Cached properties must return copies"}},{"before":"e9d4152a319af4ad138ad1a6eb87bdf0b051ec9e","after":"153053fe6c3d62d8fa607cdcc5c4813a60a33aa1","ref":"refs/heads/master","pushedAt":"2024-05-16T23:36:10.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48268][CORE] Add a configuration for SparkContext.setCheckpointDir\n\n### What changes were proposed in this pull request?\n\nThis PR adds `spark.checkpoint.dir` configuration so users can set the checkpoint dir when they submit their application.\n\n### Why are the changes needed?\n\nSeparate the configuration logic so the same app can run with a different checkpoint.\nIn addition, this would be useful for Spark Connect with https://github.com/apache/spark/pull/46570.\n\n### Does this PR introduce _any_ user-facing change?\n\nYes, it adds a new user-facing configuration.\n\n### How was this patch tested?\n\nunittest added\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46571 from HyukjinKwon/SPARK-48268.\n\nLead-authored-by: Hyukjin Kwon \nCo-authored-by: Hyukjin Kwon \nSigned-off-by: Hyukjin Kwon ","shortMessageHtmlLink":"[SPARK-48268][CORE] Add a configuration for SparkContext.setCheckpoin…"}},{"before":"59f88c3725222b84b2d0b51ba40a769d99866b56","after":"e9d4152a319af4ad138ad1a6eb87bdf0b051ec9e","ref":"refs/heads/master","pushedAt":"2024-05-16T23:35:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test\n\n### What changes were proposed in this pull request?\n\nThis PR is a followup of https://github.com/apache/spark/pull/46267 that uses ANSI-enabled cast in the tests. It intentionally uses ANSI-enabled cast in `castColToType` when you look up a view.\n\n### Why are the changes needed?\n\nIn order to fix the scheduled CI build without ANSI:\n\n- https://github.com/apache/spark/actions/runs/9072308206/job/24960016975\n- https://github.com/apache/spark/actions/runs/9072308206/job/24960019187\n\n```\n[info] - look up view relation *** FAILED *** (72 milliseconds)\n[info] == FAIL: Plans do not match ===\n[info] 'SubqueryAlias spark_catalog.db3.view1 'SubqueryAlias spark_catalog.db3.view1\n[info] +- View (`spark_catalog`.`db3`.`view1`, ['col1, 'col2, 'a, 'b]) +- View (`spark_catalog`.`db3`.`view1`, ['col1, 'col2, 'a, 'b])\n[info] +- 'Project [cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, col2, 0, 1) as string) AS col2#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, a, 0, 1) as int) AS a#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, b, 0, 1) as string) AS b#0] +- 'Project [cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, col2, 0, 1) as string) AS col2#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, a, 0, 1) as int) AS a#0, cast(getviewcolumnbynameandordinal(`spark_catalog`.`db3`.`view1`, b, 0, 1) as string) AS b#0]\n[info] +- 'Project [*] +- 'Project [*]\n[info] +- 'UnresolvedRelation [tbl1], [], false\n```\n\n```\n[info] - look up view created before Spark 3.0 *** FAILED *** (452 milliseconds)\n[info] == FAIL: Plans do not match ===\n[info] 'SubqueryAlias spark_catalog.db3.view2 'SubqueryAlias spark_catalog.db3.view2\n[info] +- View (`db3`.`view2`, ['col1, 'col2, 'a, 'b]) +- View (`db3`.`view2`, ['col1, 'col2, 'a, 'b])\n[info] +- 'Project [cast(getviewcolumnbynameandordinal(`db3`.`view2`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, col2, 0, 1) as string) AS col2#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, a, 0, 1) as int) AS a#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, b, 0, 1) as string) AS b#0] +- 'Project [cast(getviewcolumnbynameandordinal(`db3`.`view2`, col1, 0, 1) as int) AS col1#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, col2, 0, 1) as string) AS col2#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, a, 0, 1) as int) AS a#0, cast(getviewcolumnbynameandordinal(`db3`.`view2`, b, 0, 1) as string) AS b#0]\n[info] +- 'Project [*] +- 'Project [*]\n[info] +- 'UnresolvedRelation [tbl1], [], false +- 'UnresolvedRelation [tbl1], [], false (PlanTest.scala:179)\n\n```\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, the main change has not been released yet.\n\n### How was this patch tested?\n\nManually ran the tests after ANSI disabled.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46614 from HyukjinKwon/SPARK-48031-followup.\n\nAuthored-by: Hyukjin Kwon \nSigned-off-by: Hyukjin Kwon ","shortMessageHtmlLink":"[SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test"}},{"before":"96e70ab579c3ac844815522fc6898d3e4dcb1882","after":null,"ref":"refs/heads/dependabot/bundler/docs/rexml-3.2.8","pushedAt":"2024-05-16T23:33:35.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"dependabot[bot]","name":null,"path":"/apps/dependabot","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/29110?s=80&v=4"}},{"before":"283b2ff422218b025e7b0170e4b7ed31a1294a80","after":"59f88c3725222b84b2d0b51ba40a769d99866b56","ref":"refs/heads/master","pushedAt":"2024-05-16T21:58:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"gengliangwang","name":"Gengliang Wang","path":"/gengliangwang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1097932?s=80&v=4"},"commit":{"message":"[SPARK-48294][SQL] Handle lowercase in nestedTypeMissingElementTypeError\n\n### What changes were proposed in this pull request?\n\nHandle lowercase values inside of nestTypeMissingElementTypeError to prevent match errors.\n\n### Why are the changes needed?\n\nThe previous match error was not user-friendly. Now it gives an actionable `INCOMPLETE_TYPE_DEFINITION` error.\n\n### Does this PR introduce _any_ user-facing change?\n\nN/A\n\n### How was this patch tested?\n\nNewly added tests pass.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46623 from michaelzhan-db/SPARK-48294.\n\nAuthored-by: Michael Zhang \nSigned-off-by: Gengliang Wang ","shortMessageHtmlLink":"[SPARK-48294][SQL] Handle lowercase in nestedTypeMissingElementTypeError"}},{"before":null,"after":"96e70ab579c3ac844815522fc6898d3e4dcb1882","ref":"refs/heads/dependabot/bundler/docs/rexml-3.2.8","pushedAt":"2024-05-16T21:44:08.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"dependabot[bot]","name":null,"path":"/apps/dependabot","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/29110?s=80&v=4"},"commit":{"message":"Bump rexml from 3.2.6 to 3.2.8 in /docs\n\nBumps [rexml](https://github.com/ruby/rexml) from 3.2.6 to 3.2.8.\n- [Release notes](https://github.com/ruby/rexml/releases)\n- [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md)\n- [Commits](https://github.com/ruby/rexml/compare/v3.2.6...v3.2.8)\n\n---\nupdated-dependencies:\n- dependency-name: rexml\n dependency-type: indirect\n...\n\nSigned-off-by: dependabot[bot] ","shortMessageHtmlLink":"Bump rexml from 3.2.6 to 3.2.8 in /docs"}},{"before":"57948c865e064469a75c92f8b58c632b9b40fdd3","after":"283b2ff422218b025e7b0170e4b7ed31a1294a80","ref":"refs/heads/master","pushedAt":"2024-05-16T18:55:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"gengliangwang","name":"Gengliang Wang","path":"/gengliangwang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1097932?s=80&v=4"},"commit":{"message":"[SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLoggerSuite*\n\n### What changes were proposed in this pull request?\nThe pr is follow up https://github.com/apache/spark/pull/46600\n\n to . Similarly, to maintain consistency, should be renamed to\n\n### Why are the changes needed?\nAfter `org.apache.spark.internal.Logger` is renamed to `org.apache.spark.internal.SparkLogger` and `org.apache.spark.internal.LoggerFactory` is renamed to `org.apache.spark.internal.SparkLoggerFactory.`, the related UT's names should also be `renamed`, so that developers can easily locate the related UT.\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\nPass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46615 from panbingkun/SPARK-48291_follow_up.\n\nAuthored-by: panbingkun \nSigned-off-by: Gengliang Wang ","shortMessageHtmlLink":"[SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLogg…"}},{"before":"3d3d18f14ba29074ca3ff8b661449ad45d84369e","after":"57948c865e064469a75c92f8b58c632b9b40fdd3","ref":"refs/heads/master","pushedAt":"2024-05-16T14:38:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"cloud-fan","name":"Wenchen Fan","path":"/cloud-fan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3182036?s=80&v=4"},"commit":{"message":"[SPARK-48308][CORE] Unify getting data schema without partition columns in FileSourceStrategy\n\n### What changes were proposed in this pull request?\nCompute the schema of the data without partition columns only once in FileSourceStrategy.\n\n### Why are the changes needed?\nIn FileSourceStrategy, the schema of the data excluding partition columns is computed 2 times in a slightly different way, using an AttributeSet (`partitionSet`) and using the attributes directly (`partitionColumns`)\nThese don't have the exact same semantics, AttributeSet will only use expression ids for comparison while comparing with the actual attributes will use the name, type, nullability and metadata. We want to use the former here.\n\n### Does this PR introduce _any_ user-facing change?\nNo\n\n### How was this patch tested?\nExisting tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46619 from johanl-db/reuse-schema-without-partition-columns.\n\nAuthored-by: Johan Lasperas \nSigned-off-by: Wenchen Fan ","shortMessageHtmlLink":"[SPARK-48308][CORE] Unify getting data schema without partition colum…"}},{"before":"4be0828e6e6afa6d9ab67958f5ef5fbe6814252d","after":"3d3d18f14ba29074ca3ff8b661449ad45d84369e","ref":"refs/heads/master","pushedAt":"2024-05-16T12:58:29.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE`\n\n### What changes were proposed in this pull request?\nRename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE`\n\n### Why are the changes needed?\n`IF NOT EXISTS` + `REPLACE` is standard restriction, not just for functions.\nRename it to make it reusable.\n\n### Does this PR introduce _any_ user-facing change?\nno\n\n### How was this patch tested?\nupdated tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46608 from zhengruifeng/sql_rename_if_not_exists_replace.\n\nLead-authored-by: Ruifeng Zheng \nCo-authored-by: Ruifeng Zheng \nSigned-off-by: Ruifeng Zheng ","shortMessageHtmlLink":"[SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE…"}},{"before":"fa83d0f8fce792b8db0ad1ed53cf80acdf4ea5de","after":"4be0828e6e6afa6d9ab67958f5ef5fbe6814252d","ref":"refs/heads/master","pushedAt":"2024-05-16T11:51:57.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"cloud-fan","name":"Wenchen Fan","path":"/cloud-fan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3182036?s=80&v=4"},"commit":{"message":"[SPARK-48288] Add source data type for connector cast expression\n\nCurrently,\nV2ExpressionBuilder will build connector.Cast expression from catalyst.Cast expression.\nCatalyst cast have expression data type, but connector cast does not have it.\nSince some casts are not allowed on external engine, we need to know source and target data type, since we want finer granularity to block some unsupported casts.\n\n### What changes were proposed in this pull request?\nAdd source data type to connector `Cast` expression\n\n### Why are the changes needed?\nWe need finer granularity to allow implementors of `SQLBuilder` to disable some unsupported casts.\n\n### Does this PR introduce _any_ user-facing change?\nYes, visitCast function is changed, and it needs to be overriden again.\n\n### How was this patch tested?\nNo tests made. Simple code change.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46596 from urosstan-db/SPARK-48288-Add-source-data-type-to-connector-cast-expression.\n\nAuthored-by: Uros Stankovic \nSigned-off-by: Wenchen Fan ","shortMessageHtmlLink":"[SPARK-48288] Add source data type for connector cast expression"}},{"before":"3bd845ea930a4709b7a2f0447b5f8af64c697239","after":"fa83d0f8fce792b8db0ad1ed53cf80acdf4ea5de","ref":"refs/heads/master","pushedAt":"2024-05-16T10:13:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48296][SQL] Codegen Support for `to_xml`\n\n### What changes were proposed in this pull request?\nThe PR adds `Codegen Support` for `to_xml`.\n\n### Why are the changes needed?\nImprove codegen coverage.\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\n- Add new UT & existed UT.\n- Pass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46591 from panbingkun/minor_to_xml.\n\nLead-authored-by: panbingkun \nCo-authored-by: panbingkun \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48296][SQL] Codegen Support for to_xml"}},{"before":"210ed2521d3dc1202cd1ba855ed5e729a5d940d0","after":"c1dd4a5df69340884f3f0f0c28ce916bf9e30159","ref":"refs/heads/branch-3.5","pushedAt":"2024-05-16T09:30:15.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar\n\n### What changes were proposed in this pull request?\n\nTRANSFORM with char/varchar has been accidentally invalidated since 3.1 with a scala.MatchError, this PR fixes it\n\n### Why are the changes needed?\n\nbugfix\n### Does this PR introduce _any_ user-facing change?\n\nno\n\n### How was this patch tested?\n\nnew tests\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nno\n\nCloses #46603 from yaooqinn/SPARK-48297.\n\nAuthored-by: Kent Yao \nSigned-off-by: Kent Yao \n(cherry picked from commit 3bd845ea930a4709b7a2f0447b5f8af64c697239)\nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar"}},{"before":"b53d78e94f6e69c65d61d5e1b7d3e59a4815f620","after":"3bd845ea930a4709b7a2f0447b5f8af64c697239","ref":"refs/heads/master","pushedAt":"2024-05-16T09:29:53.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar\n\n### What changes were proposed in this pull request?\n\nTRANSFORM with char/varchar has been accidentally invalidated since 3.1 with a scala.MatchError, this PR fixes it\n\n### Why are the changes needed?\n\nbugfix\n### Does this PR introduce _any_ user-facing change?\n\nno\n\n### How was this patch tested?\n\nnew tests\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nno\n\nCloses #46603 from yaooqinn/SPARK-48297.\n\nAuthored-by: Kent Yao \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar"}},{"before":"0ba8ddc9ce5b960815234a3999a080ba3271775b","after":"b53d78e94f6e69c65d61d5e1b7d3e59a4815f620","ref":"refs/heads/master","pushedAt":"2024-05-16T07:31:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md\n\n### What changes were proposed in this pull request?\n\nFollowup of https://github.com/apache/spark/pull/46271, to fix some missing parts\n### Why are the changes needed?\n\ndoc fix\n### Does this PR introduce _any_ user-facing change?\n\nno\n\n### How was this patch tested?\n\ndoc build\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nno\n\nCloses #46610 from yaooqinn/SPARK-48036.\n\nAuthored-by: Kent Yao \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md"}},{"before":"97717363abae0526f4a6f8c577f539da2d4ea314","after":"0ba8ddc9ce5b960815234a3999a080ba3271775b","ref":"refs/heads/master","pushedAt":"2024-05-16T06:25:22.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HeartSaVioR","name":"Jungtaek Lim","path":"/HeartSaVioR","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1317309?s=80&v=4"},"commit":{"message":"[SPARK-48293][SS] Add test for when ForeachBatchUserFuncException wraps interrupted exception due to query stop\n\n### What changes were proposed in this pull request?\nAdd test for when ForeachBatchUserFuncException wraps interrupted exception due to query stop\n\n### Why are the changes needed?\ntest\n\n### Does this PR introduce _any_ user-facing change?\nNo\n\n### How was this patch tested?\ntest added\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46601 from micheal-o/ForEachBatchExWrapInterruptTest.\n\nAuthored-by: micheal-o \nSigned-off-by: Jungtaek Lim ","shortMessageHtmlLink":"[SPARK-48293][SS] Add test for when ForeachBatchUserFuncException wra…"}},{"before":"9130f78fb12eed94f48e1fd9ccedb6fe651a4440","after":"97717363abae0526f4a6f8c577f539da2d4ea314","ref":"refs/heads/master","pushedAt":"2024-05-16T06:14:42.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48264][BUILD] Upgrade `datasketches-java` to 6.0.0\n\n### What changes were proposed in this pull request?\nThe pr aims to upgrade `datasketches-java` from `5.0.1` to `6.0.0`\n\n### Why are the changes needed?\nThe full release notes:\n- https://github.com/apache/datasketches-java/releases/tag/6.0.0\n- https://github.com/apache/datasketches-java/releases/tag/5.0.2\n \"image\"\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\nPass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46563 from panbingkun/SPARK-48264.\n\nAuthored-by: panbingkun \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48264][BUILD] Upgrade datasketches-java to 6.0.0"}},{"before":"726f2c95d4dca93168a98a3782d21ea99147a47b","after":"9130f78fb12eed94f48e1fd9ccedb6fe651a4440","ref":"refs/heads/master","pushedAt":"2024-05-16T06:13:19.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-47607] Add documentation for Structured logging framework\n\n### What changes were proposed in this pull request?\n\nAdd documentation for Structured logging framework\n\n### Why are the changes needed?\n\nProvide document for Spark developers\n\n### Does this PR introduce _any_ user-facing change?\n\nNo\n\n### How was this patch tested?\n\nDoc preview:\n\"image\"\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo\n\nCloses #46605 from gengliangwang/updateGuideline.\n\nAuthored-by: Gengliang Wang \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-47607] Add documentation for Structured logging framework"}},{"before":"dec910ba3c36e27b9cff5b5e139be82af6c799ab","after":"726f2c95d4dca93168a98a3782d21ea99147a47b","ref":"refs/heads/master","pushedAt":"2024-05-16T05:41:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48299][BUILD] Upgrade `scala-maven-plugin` to 4.9.1\n\n### What changes were proposed in this pull request?\nThis pr aims to upgrade `scala-maven-plugin` from 4.8.1 to 4.9.1\n\n### Why are the changes needed?\nThe new version is built using Java 11 and has been upgraded to use zinc 1.10.0. For all changes, please refer to:\n- https://github.com/davidB/scala-maven-plugin/compare/4.8.1...4.9.1\n\n### Does this PR introduce _any_ user-facing change?\nNo\n\n### How was this patch tested?\nPass GitHub Actions\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46593 from LuciferYang/scala-maven-plugin-491.\n\nAuthored-by: yangjie01 \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48299][BUILD] Upgrade scala-maven-plugin to 4.9.1"}},{"before":"d0f4533f4e797a439eb78b8214e7bbfe06d0839a","after":"dec910ba3c36e27b9cff5b5e139be82af6c799ab","ref":"refs/heads/master","pushedAt":"2024-05-16T04:50:51.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"gengliangwang","name":"Gengliang Wang","path":"/gengliangwang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1097932?s=80&v=4"},"commit":{"message":"[SPARK-48214][INFRA] Ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory`\n\n### What changes were proposed in this pull request?\nThe pr aims to ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory`.\n\n### Why are the changes needed?\nAfter the migration of structured logs on the `java side` is completed, we need to ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory` in the code to avoid the log format that is not written as required in the future new java code.\n\n### Does this PR introduce _any_ user-facing change?\nYes, only for spark developers.\n\n### How was this patch tested?\n- Manually test.\n- Pass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46502 from panbingkun/ban_import_slf4j.\n\nAuthored-by: panbingkun \nSigned-off-by: Gengliang Wang ","shortMessageHtmlLink":"[SPARK-48214][INFRA] Ban import org.slf4j.Logger & `org.slf4j.Logge…"}},{"before":"5e8322150a050ad4d0c3962d62c9a2b3e9a937c1","after":"d0f4533f4e797a439eb78b8214e7bbfe06d0839a","ref":"refs/heads/master","pushedAt":"2024-05-16T04:15:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48287][PS][CONNECT] Apply the builtin `timestamp_diff` method\n\n### What changes were proposed in this pull request?\nApply the builtin `timestamp_diff` method\n\n### Why are the changes needed?\n`timestamp_diff` method was added as a builtin method, no need to maintain a PS-specific method\n\n### Does this PR introduce _any_ user-facing change?\nno\n\n### How was this patch tested?\nci\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46595 from zhengruifeng/ps_ts_diff.\n\nAuthored-by: Ruifeng Zheng \nSigned-off-by: Ruifeng Zheng ","shortMessageHtmlLink":"[SPARK-48287][PS][CONNECT] Apply the builtin timestamp_diff method"}},{"before":"9f64219ae325d0c0014928b574776b5fa55c61a5","after":"5e8322150a050ad4d0c3962d62c9a2b3e9a937c1","ref":"refs/heads/master","pushedAt":"2024-05-16T04:11:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48219][CORE] StreamReader Charset fix with UTF8\n\n### What changes were proposed in this pull request?\nFix some StreamReader not set with UTF8,if we actually default charset not support Chinese chars such as latin and conf contain Chinese chars,it would not resolve success,so we need set it as utf8 in StreamReader,we can find all StreamReader with utf8 charset in other compute framework,such as Calcite、Hive、Hudi and so on.\n\n### Why are the changes needed?\nMay cause string decode not as expected\n\n### Does this PR introduce _any_ user-facing change?\nYes\n\n### How was this patch tested?\nNot need\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46509 from xuzifu666/SPARK-48219.\n\nAuthored-by: xuyu <11161569@vivo.com>\nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48219][CORE] StreamReader Charset fix with UTF8"}},{"before":"4ffaa2e89a8a777a374b7f5b22166ef9bac8b99f","after":"9f64219ae325d0c0014928b574776b5fa55c61a5","ref":"refs/heads/master","pushedAt":"2024-05-16T03:10:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48295][PS] Turn on `compute.ops_on_diff_frames` by default\n\n### What changes were proposed in this pull request?\nTurn on `compute.ops_on_diff_frames` by default\n\n### Why are the changes needed?\n1, in most cases, this config need to be turned on to enable computation with different dataframes;\n2, enable `compute.ops_on_diff_frames` should not break any workloads, it should only enable more;\n\n### Does this PR introduce _any_ user-facing change?\nyes, this config is turned on by default\n\n### How was this patch tested?\nupdated tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46602 from zhengruifeng/turn_on_ops_on_diff_frames.\n\nAuthored-by: Ruifeng Zheng \nSigned-off-by: Hyukjin Kwon ","shortMessageHtmlLink":"[SPARK-48295][PS] Turn on compute.ops_on_diff_frames by default"}},{"before":"ca3593288d577435a193f356b5214cf6f4bd534a","after":"4ffaa2e89a8a777a374b7f5b22166ef9bac8b99f","ref":"refs/heads/master","pushedAt":"2024-05-16T02:09:29.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48289][DOCKER][TEST] Clean up Oracle JDBC tests by skipping redundant SYSTEM password reset\n\n### What changes were proposed in this pull request?\nThis pull request improves the Oracle JDBC tests by skipping the redundant SYSTEM password reset.\n\n### Why are the changes needed?\nThese changes are necessary to clean up the Oracle JDBC tests.\n\nThis pull request effectively reverts the modifications introduced in [SPARK-46592](https://issues.apache.org/jira/browse/SPARK-46592) and [PR #44594](https://github.com/apache/spark/pull/44594), which attempted to work around the sporadic occurrence of ORA-65048 and ORA-04021 errors by setting the Oracle parameter DDL_LOCK_TIMEOUT.\n\nAs discussed in [issue #35](https://github.com/gvenzl/oci-oracle-free/issues/35), setting DDL_LOCK_TIMEOUT did not resolve the issue. The root cause appears to be an Oracle bug or unwanted behavior related to the use of Pluggable Database (PDB) rather than the expected functionality of Oracle itself.\n\nAdditionally, with [SPARK-48141](https://issues.apache.org/jira/browse/SPARK-48141), we have upgraded the Oracle version used in the tests to Oracle Free 23ai, version 23.4. This upgrade should help address some of the issues observed with the previous Oracle version.\n\n### Does this PR introduce _any_ user-facing change?\nNo\n\n### How was this patch tested?\nThis patch was tested using the existing test suite, with a particular focus on Oracle JDBC tests. The following steps were executed:\n```\nexport ENABLE_DOCKER_INTEGRATION_TESTS=1\n./build/sbt -Pdocker-integration-tests \"docker-integration-tests/testOnly org.apache.spark.sql.jdbc.OracleIntegrationSuite\"\n```\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46598 from LucaCanali/fixOracleIntegrationTests.\n\nLead-authored-by: Kent Yao \nCo-authored-by: Luca Canali \nSigned-off-by: Kent Yao ","shortMessageHtmlLink":"[SPARK-48289][DOCKER][TEST] Clean up Oracle JDBC tests by skipping re…"}},{"before":"a252cbd5ca13fb7b758c839edc92b50336747d82","after":"ca3593288d577435a193f356b5214cf6f4bd534a","ref":"refs/heads/master","pushedAt":"2024-05-16T01:42:41.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"cloud-fan","name":"Wenchen Fan","path":"/cloud-fan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3182036?s=80&v=4"},"commit":{"message":"[SPARK-48252][SQL] Update CommonExpressionRef when necessary\n\n### What changes were proposed in this pull request?\n\nThe `With` expression assumes that it should be created after all input expressions are fully resolved. This is mostly true (function lookup happens after function input expressions are resolved), but there is a special case of column resolution in HAVING: we use `TempResolvedColumn` to try one column resolution option. If it doesn't work, re-resolve the column, which may be a different data type. `With` expression should update the refs when this happens.\n\n### Why are the changes needed?\n\nbug fix, otherwise the query will fail\n\n### Does this PR introduce _any_ user-facing change?\n\nThis feature is not released yet.\n\n### How was this patch tested?\n\nnew test\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nno\n\nCloses #46552 from cloud-fan/with.\n\nLead-authored-by: Wenchen Fan \nCo-authored-by: Wenchen Fan \nSigned-off-by: Wenchen Fan ","shortMessageHtmlLink":"[SPARK-48252][SQL] Update CommonExpressionRef when necessary"}},{"before":"a2d93d104a6c3307f50dfd91f5cc51a97d4c5837","after":"a252cbd5ca13fb7b758c839edc92b50336747d82","ref":"refs/heads/master","pushedAt":"2024-05-15T23:43:50.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"gengliangwang","name":"Gengliang Wang","path":"/gengliangwang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1097932?s=80&v=4"},"commit":{"message":"[SPARK-48291][CORE] Rename Java Logger as SparkLogger\n\n### What changes were proposed in this pull request?\n\nTwo new classes `org.apache.spark.internal.Logger` and `org.apache.spark.internal.LoggerFactory` were introduced from https://github.com/apache/spark/pull/46301.\nGiven that Logger is a widely recognized **interface** in Log4j, it may lead to confusion to have a class with the same name. To avoid this and clarify its purpose within the Spark framework, I propose renaming `org.apache.spark.internal.Logger` to `org.apache.spark.internal.SparkLogger`. Similarly, to maintain consistency, `org.apache.spark.internal.LoggerFactory` should be renamed to `org.apache.spark.internal.SparkLoggerFactory`.\n\n### Why are the changes needed?\n\nTo avoid naming confusion and clarify the java Spark logger purpose within the logging framework\n\n### Does this PR introduce _any_ user-facing change?\n\nNo\n\n### How was this patch tested?\n\nGA tests\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo\n\nCloses #46600 from gengliangwang/refactorLogger.\n\nAuthored-by: Gengliang Wang \nSigned-off-by: Gengliang Wang ","shortMessageHtmlLink":"[SPARK-48291][CORE] Rename Java Logger as SparkLogger"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAETIKwtwA","startCursor":null,"endCursor":null}},"title":"Activity · apache/spark"}