How to convert string type to timestamp in pyspark?
Question
I am struggling to convert a string-typed date column to a timestamp, as shown below.
+--------------------+
|              mydate|
+--------------------+
|26/Feb/2023:13:58:40|
|26/Feb/2023:13:30:33|
|26/Feb/2023:13:52:50|
|26/Feb/2023:13:47:09|
|26/Feb/2023:13:30:33|
|26/Feb/2023:13:14:28|
|26/Feb/2023:13:11:42|
|26/Feb/2023:13:34:03|
|26/Feb/2023:13:50:43|
|26/Feb/2023:13:10:47|
|26/Feb/2023:13:28:09|
|26/Feb/2023:13:30:16|
|26/Feb/2023:13:19:07|
|26/Feb/2023:13:30:24|
|26/Feb/2023:13:30:16|
|26/Feb/2023:13:05:37|
|26/Feb/2023:13:09:24|
|26/Feb/2023:13:24:18|
|26/Feb/2023:13:49:13|
|26/Feb/2023:13:56:40|
+--------------------+
The column holds strings like the ones above, and I found some code that is supposed to convert them to timestamps. My PySpark code is as follows.
wt.select('mydate').show()
wt.select(to_timestamp(lit('mydate'),"dd/MMM/yyyy:HH:mm:ss")).show()
But every row of the result is null, no matter how many times I try.
+----------------------------------------------+
|to_timestamp('mydate', 'dd/MMM/yyyy:HH:mm:ss')|
+----------------------------------------------+
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
| null|
+----------------------------------------------+
Any help will be appreciated.
Thanks.
Answer 1
Score: 1
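The code you have is almost correct. The problem is lit('mydate'): it creates a literal column containing the text 'mydate' instead of referencing the mydate column, so nothing matches the format and every row comes back null. Reference the column with col instead.
Suppose you have a DataFrame with timestamps stored as strings: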
+--------------------+
|             strDate|
+--------------------+
|26/Feb/2023:13:30:16|
|26/Feb/2023:13:05:37|
+--------------------+
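You can convert the 'strDate' column by passing it, together with the format, to to_timestamp:
from pyspark.sql import functions as F

# Reference the column with F.col (not F.lit) so its values are parsed.
# Do not chain .show() onto the assignment: show() returns None.
res = df.select(F.to_timestamp(F.col('strDate'), "dd/MMM/yyyy:HH:mm:ss"))
res.show()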
This yields:
+-------------------------------------------+
|to_timestamp(strDate, dd/MMM/yyyy:HH:mm:ss)|
+-------------------------------------------+
|                        2023-02-26 13:30:16|
|                        2023-02-26 13:05:37|
+-------------------------------------------+
We can verify the data type with:
res.dtypes
Out[28]: [('to_timestamp(strDate, dd/MMM/yyyy:HH:mm:ss)', 'timestamp')]
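As a rough illustration of why the version in the question returned null, here is a sketch that uses only Python's standard library rather than Spark. It is an analogy, not Spark's actual parser: PY_FORMAT is an assumed strptime equivalent of the "dd/MMM/yyyy:HH:mm:ss" pattern, and parse_or_none is a hypothetical helper that returns None on failure, roughly mirroring the null seen in the question's output. It also assumes the default English month abbreviations, so that "Feb" matches %b.

```python
from datetime import datetime
from typing import Optional

# Assumed strptime counterpart of the Spark pattern "dd/MMM/yyyy:HH:mm:ss".
# Relies on the default (English) month abbreviations for %b.
PY_FORMAT = "%d/%b/%Y:%H:%M:%S"


def parse_or_none(text: str) -> Optional[datetime]:
    """Hypothetical helper: parse text with PY_FORMAT, return None if it does not match."""
    try:
        return datetime.strptime(text, PY_FORMAT)
    except ValueError:
        return None


# A real value from the column matches the pattern.
parsed_value = parse_or_none("26/Feb/2023:13:30:16")

# The literal text "mydate" is what lit('mydate') hands to the parser;
# it does not match the pattern, so the result is None.
parsed_literal = parse_or_none("mydate")

print(parsed_value)    # 2023-02-26 13:30:16
print(parsed_literal)  # None
```

The same reasoning applies in Spark: with col('mydate') the parser sees each row's value, while with lit('mydate') it sees the same unparseable text on every row.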