Why does Spark 3.2 behave differently when mixed-case column names are used versus same-case names?

Question

I am running a simple query in Spark 3.2:

val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", "col5")
val op_cols_same_case = List("id","col2","col3","col4", "col5", "id")
val df2 = df1.select(op_cols_same_case.head, op_cols_same_case.tail: _*)
df2.select("id").show() 

The query above returns a result, but when I mix the casing of the column names it throws an exception:

val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", "col5")
val op_cols_diff_case = List("id","col2","col3","col4", "col5", "ID")
val df2 = df1.select(op_cols_diff_case.head, op_cols_diff_case.tail: _*)
df2.select("id").show() 

In my test, spark.sql.caseSensitive was left at its default (false).
I expected both queries to return a result, or both to fail.
Why does one fail and not the other?

Answer 1

Score: 0

Whether this is an issue or a non-issue depends on what seems logical to you. There is a long thread on this pull request, where some believe the behavior is correct while others think it's wrong.

But the pull request changes do make the behavior consistent.
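
Until the behavior is consistent in the version you run, one possible workaround is to avoid selecting the same column twice under different casings. The following is a minimal sketch, not something taken from the pull request: the object name ColumnNames and the helper dedupeIgnoreCase are hypothetical, and the approach assumes that dropping the repeated name is acceptable for your output.

```scala
// Hypothetical helper (not part of Spark): keeps the first occurrence of each
// column name, treating names that differ only by case as duplicates.
object ColumnNames {
  def dedupeIgnoreCase(cols: Seq[String]): Seq[String] =
    cols.foldLeft(Vector.empty[String]) { (kept, name) =>
      // Skip the name if an equal name (ignoring case) was already kept
      if (kept.exists(_.equalsIgnoreCase(name))) kept else kept :+ name
    }
}

// Assumed usage in spark-shell, with df1 and op_cols_diff_case as defined in the question:
//   val cols = ColumnNames.dedupeIgnoreCase(op_cols_diff_case)
//   val df2  = df1.select(cols.head, cols.tail: _*)
//   df2.select("id").show()
```

With the list from the question, the helper keeps "id" and drops the trailing "ID", so df2 holds each source column only once and a later select("id") has a single candidate to resolve against.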

Posted on February 8, 2023, 17:31:24. Link: https://go.coder-hub.com/75383698.html