Question

I'm working in Spark SQL and using the to_date function to convert a timestamp string to a date. However, I'm unable to parse fractional seconds even though the documentation suggests it is possible: it says that 'S' can be repeated, up to 6 significant places, for fraction patterns.

For instance:

to_date("2017-09-09T21:01:12Z","yyyy-MM-dd'T'HH:mm:ss'Z'") -> works and returns a date
to_date("2017-09-09T21:01:12.234433Z","yyyy-MM-dd'T'HH:mm:ss.SSSSSS'Z'") -> returns null

Can anybody suggest a possible reason for this behavior?

Edit (19-Apr-2023): I believe this might be due to the version of Spark that I'm using. It's not an official Spark version and is probably a Cloudera distribution (I will confirm this), because the exact same function works properly in official Spark version 3.0.0.

Answer 1

Score: 1

Two things should be changed (docs):

the format string is case-sensitive, and
the character to use for a zone offset where offset zero is printed as Z is X.

So the two format strings should be yyyy-MM-dd'T'HH:mm:ssX and yyyy-MM-dd'T'HH:mm:ss.SSSSSSX, but using yyyy-MM-dd'T'HH:mm:ss.SSSSSS'Z' also works if the zone offset is always zero.
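
The patterns above can be tried outside Spark with a minimal sketch. It assumes that the pattern letters in question (yyyy, SSSSSS, X, and the quoted literal 'Z') behave as they do in java.time.format.DateTimeFormatter, which is an assumption about the Spark build in use and does not hold for older or vendor-patched versions. The class name FractionPatternSketch and the helper toDate are hypothetical; the helper returns null on a mismatch only to loosely imitate the null seen in the question.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

final class FractionPatternSketch {

    // Hypothetical helper (not a Spark API): parse `text` with `pattern`
    // and keep only the date part. Returns null when the text does not
    // match the pattern, loosely imitating the null from the question.
    static LocalDate toDate(String text, String pattern) {
        try {
            DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
            return LocalDate.parse(text, fmt);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String plain = "2017-09-09T21:01:12Z";
        String micros = "2017-09-09T21:01:12.234433Z";

        // X parses the zone offset, accepting Z for offset zero.
        System.out.println(toDate(plain, "yyyy-MM-dd'T'HH:mm:ssX"));
        System.out.println(toDate(micros, "yyyy-MM-dd'T'HH:mm:ss.SSSSSSX"));

        // A quoted 'Z' is matched as a plain character, not as an offset.
        System.out.println(toDate(micros, "yyyy-MM-dd'T'HH:mm:ss.SSSSSS'Z'"));

        // A pattern without a fraction part cannot consume ".234433".
        System.out.println(toDate(micros, "yyyy-MM-dd'T'HH:mm:ss'Z'"));
    }
}
```

Under java.time, the first three calls yield 2017-09-09 and the last yields null. If the same patterns still return null in Spark, that would point to the parser used by that particular Spark version rather than to the pattern itself, which is consistent with the asker's edit.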
