Why different behavior when mixed case is used vs. same case in Spark 3.2

Question
I am running a simple query in Spark 3.2:
val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", "col5")
val op_cols_same_case = List("id","col2","col3","col4", "col5", "id")
val df2 = df1.select(op_cols_same_case.head, op_cols_same_case.tail: _*)
df2.select("id").show()
The above query returns a result, but when I mix the casing it throws an exception:
val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", "col5")
val op_cols_diff_case = List("id","col2","col3","col4", "col5", "ID")
val df2 = df1.select(op_cols_diff_case.head, op_cols_diff_case.tail: _*)
df2.select("id").show()
In my test, caseSensitive (spark.sql.caseSensitive) was left at its default (false).
I expect either both queries to return a result, or both to fail.
Why does one fail and not the other?
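For reference, the resolver setting can be checked from the session itself. This is a minimal sketch, not part of the original report, assuming a standard spark-shell session where spark is the active SparkSession:

// Read the current analyzer case-sensitivity setting ("false" unless overridden).
spark.conf.get("spark.sql.caseSensitive")
// Switching it changes how column names are resolved for the queries above:
// spark.conf.set("spark.sql.caseSensitive", "true")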
Answer 1

Score: 0
We see this as an issue or a non-issue depending on what seems logical to each person. There is a long thread on this pull request, where some believe the behavior is correct while others think it is wrong.
But the pull request changes do make the behavior consistent.
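Whichever side of that discussion you land on, one way to sidestep the ambiguity is to deduplicate the column list case-insensitively before the select. The following is a sketch rather than anything from the linked pull request; it assumes the df1 and op_cols_diff_case defined in the question:

// Keep the first occurrence of each column name, comparing case-insensitively,
// so "id" and "ID" collapse to a single selected column.
val dedupedCols = op_cols_diff_case.foldLeft(Vector.empty[String]) { (acc, c) =>
  if (acc.exists(_.equalsIgnoreCase(c))) acc else acc :+ c
}
val df2 = df1.select(dedupedCols.head, dedupedCols.tail: _*)
df2.select("id").show()  // df2 now has a single "id" column, so this resolves unambiguously

With the duplicates removed up front, both the same-case and the mixed-case lists produce the same DataFrame regardless of the spark.sql.caseSensitive setting.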