[SPARK-50370][SQL] Codegen Support for json_tuple #48908

Open
wants to merge 10 commits into
base: master

@panbingkun panbingkun commented Nov 20, 2024

What changes were proposed in this pull request?

This PR adds codegen support for JsonTuple (json_tuple).
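
For context, here is a minimal sketch of what json_tuple does at the DataFrame level. It assumes a local SparkSession; the object name JsonTupleExample, the helper extract, and the sample JSON are illustrative only and not part of this PR. Since the PR introduces no user-facing change, the values returned should be the same whether the expression is interpreted or code-generated.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, json_tuple}

// Hypothetical example object, not part of the PR.
object JsonTupleExample {
  // Extracts the top-level fields "a" and "b" from one JSON string.
  // json_tuple returns each requested field as a string column.
  def extract(spark: SparkSession, json: String): Row = {
    import spark.implicits._
    Seq(json).toDF("json")
      .select(json_tuple(col("json"), "a", "b"))
      .head()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("json_tuple-example")
      .getOrCreate()
    try {
      println(extract(spark, """{"a": 1, "b": "x"}"""))
    } finally {
      spark.stop()
    }
  }
}
```

The extraction is kept in a separate method so it can be called from a test with an existing session.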

Why are the changes needed?

  • Improve codegen coverage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Pass GA and existing UTs (e.g., JsonFunctionsSuite#*json_tuple*, JsonExpressionsSuite#*json_tuple*).
  • Updated existing UTs.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Nov 20, 2024
@dongjoon-hyun
Member

BTW, FYI, I added you to the Apache Spark Committers group in the ASF JIRA system a few minutes ago.

BEFORE: [Screenshot 2024-11-20 at 08 36 52]

AFTER: [Screenshot 2024-11-20 at 08 37 02]

@panbingkun
Contributor Author

panbingkun commented Nov 20, 2024

Oh, great, thank you! ❤️

@@ -273,8 +273,9 @@ class JsonExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper with
}

test("json_tuple escaping") {
Contributor Author

@panbingkun panbingkun Nov 22, 2024

test("stack") {
  GenerateUnsafeProjection.generate(
    Stack(Seq(2, 1, 2, 3).map(Literal(_))) :: Nil)
}
  • This is also not supported and throws an exception:
19:59:59.933 ERROR org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: Failed to compile the generated Java code.
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 78, Column 30: Assignment conversion not possible from type "scala.collection.mutable.ArraySeq" to type "org.apache.spark.sql.catalyst.util.ArrayData"
	at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:13014)
	at org.codehaus.janino.UnitCompiler.assignmentConversion(UnitCompiler.java:11263)
	at org.codehaus.janino.UnitCompiler.access$3900(UnitCompiler.java:236)
	at org.codehaus.janino.UnitCompiler$7.visitRvalue(UnitCompiler.java:2764)
	at org.codehaus.janino.UnitCompiler$7.visitRvalue(UnitCompiler.java:2754)
	at org.codehaus.janino.Java$Rvalue.accept(Java.java:4498)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2754)
19:59:59.940 ERROR org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: 
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */   return new SpecificUnsafeProjection(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
/* 006 */
/* 007 */   private Object[] references;
/* 008 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeArrayWriter[] mutableStateArray_2 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeArrayWriter[1];
/* 009 */   p...
  • Therefore, the testing approach here has been modified, with the same ultimate goal.

@panbingkun panbingkun marked this pull request as ready for review November 22, 2024 12:05
@panbingkun
Contributor Author

cc @cloud-fan @MaxGekk
