Describe the bug
As mentioned in this issue, apache/datafusion#16903, after upgrading DataFusion to version 49, the output of the MD5 function defaults to Utf8View due to the changes in apache/datafusion#16290.
Auron does not fully support this data type. If the result of the MD5 function is used as a hash input (for example, as a join key), an error occurs.
To Reproduce
The following test case can be added in AuronFunctionSuite:
test("md5 function") {
  withTable("t1") {
    sql("create table t1 using parquet as select 'spark' as c1, '3.x' as version")
    val functions =
      """
        |select b.md5
        |from (
        |  select c1, version from t1
        |) a join (
        |  select md5(concat(c1, version)) as md5 from t1
        |) b on md5(concat(a.c1, a.version)) = b.md5
        |""".stripMargin
    val df = sql(functions)
    checkAnswer(df, Seq(Row("9ff36a3857e29335d03cf6bef2147119")))
  }
}
This will result in the following error:
Caused by: java.lang.RuntimeException: task panics: Execution error: Execution error: output_with_sender[Project] error: Execution error: output_with_sender[BroadcastJoin] error: Execution error: Unsupported data type in hasher: Utf8View
Expected behavior
The MD5 function should work correctly. Full support for Utf8View in Auron may not be available soon. In that case, the MD5 function could revert to the previous logic, which does not convert the return value to a StringViewArray.