2023-02-08 17:31:24

Why does Spark 3.2 behave differently when column names use mixed case versus the same case?

Question

I am running a simple query in Spark 3.2:

val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", "col5")
val op_cols_same_case = List("id","col2","col3","col4", "col5", "id")
val df2 = df1.select(op_cols_same_case.head, op_cols_same_case.tail: _*)
df2.select("id").show()

The query above returns a result, but when I mix the casing it throws an exception:

val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4", "col5")
val op_cols_diff_case = List("id","col2","col3","col4", "col5", "ID")
val df2 = df1.select(op_cols_diff_case.head, op_cols_diff_case.tail: _*)
df2.select("id").show()

In my test, spark.sql.caseSensitive was left at its default (false).
I expected both queries to return a result, or both to fail.
Why does one fail and not the other?

Answer 1

Score: 0

Whether this is an issue or a non-issue depends on what seems logical to you. There is a long thread on this pull request, where some believe the behavior is correct while others think it's wrong.

But the pull request changes do make the behavior consistent.
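
Until that behavior is available in your Spark version, and if the duplicate column in the select list is not intentional, one possible workaround is to drop names that differ only by case before calling select. The following is only a sketch: dedupeIgnoringCase is a hypothetical helper, not part of Spark, and it assumes that keeping the first spelling of each name is acceptable.

```scala
import java.util.Locale
import scala.collection.mutable

// Hypothetical helper (not a Spark API): keep the first occurrence of each
// column name, comparing names case-insensitively, which mirrors how names
// are matched when spark.sql.caseSensitive is false.
def dedupeIgnoringCase(cols: Seq[String]): Seq[String] = {
  val seen = mutable.HashSet.empty[String]
  // HashSet.add returns true only when the element was not already present.
  cols.filter(c => seen.add(c.toLowerCase(Locale.ROOT)))
}

val opColsDiffCase = List("id", "col2", "col3", "col4", "col5", "ID")
val uniqueCols = dedupeIgnoringCase(opColsDiffCase)

// With df1 from the question (requires a Spark session, so left commented out):
// val df2 = df1.select(uniqueCols.head, uniqueCols.tail: _*)
// df2.select("id").show()
```

Because df2 then contains each column only once, the later select("id") has a single candidate to resolve, whichever casing was used in the original list.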
