How to format Timestamp for ingesting JSON data into AWS IoT Analytics datastore using Parquet file format?
Question
I'd like to ingest data into an AWS IoT Analytics datastore in Parquet format. Records in the channel look like this:
{
    "Total_in": 1825.5841,
    "Time": "2023-02-17T14:08:19"
}
The question is: how do I need to format the time (in a transformation that is part of a pipeline activity) so that it can be used as a "timestamp" in the Parquet file?
The schema of the Parquet files is as follows.
Column name    Data type
time           TIMESTAMP
total_in       FLOAT
I tried providing the timestamp in seconds, in milliseconds, and as a string in the format `%Y-%m-%dT%H:%M:%S` (Python notation). In each of these cases no record ever reaches the datastore ("Last message arrival time" is always none). If I change the format to `%Y-%m-%dT%H:%M:%S..%fZ`, records do arrive in the datastore ("Last message arrival time" is not null), but when I run a query (`Select * from datastore`), the result set is empty.
I already enabled logging, but neither the pipeline logs nor the datastore logs contain any information.
The datastore does not contain partitions/partitions are disabled.
Answer 1
Score: 0
The timestamp needs to be provided in the format `yyyy-MM-dd HH:mm:ss` (e.g. `2020-10-22 11:23:48`).
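One way to apply this format is to reshape each record in a Lambda activity of the pipeline. The following is only a minimal sketch and has not been tested against a real datastore. It assumes the Lambda function receives a batch of messages as a list of dictionaries and returns the transformed list. The helper name `to_parquet_timestamp` is hypothetical, and renaming the keys to the lowercase column names `time` and `total_in` is an assumption based on the schema in the question.

```python
from datetime import datetime

# Format of the "Time" attribute as it appears in the channel records.
SOURCE_FORMAT = "%Y-%m-%dT%H:%M:%S"
# Python equivalent of the yyyy-MM-dd HH:mm:ss pattern named in the answer.
TARGET_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_parquet_timestamp(value: str) -> str:
    """Convert '2023-02-17T14:08:19' into '2023-02-17 14:08:19'."""
    return datetime.strptime(value, SOURCE_FORMAT).strftime(TARGET_FORMAT)


def lambda_handler(event, context):
    """Reformat the timestamp of every message in the batch.

    Assumption: `event` is a list of message dictionaries and the
    function must return a list of (transformed) message dictionaries.
    """
    transformed = []
    for message in event:
        transformed.append(
            {
                # Assumption: keys are renamed to match the schema columns.
                "time": to_parquet_timestamp(message["Time"]),
                "total_in": message["Total_in"],
            }
        )
    return transformed
```

Called with the sample record from the question, `lambda_handler([{"Total_in": 1825.5841, "Time": "2023-02-17T14:08:19"}], None)` would yield a single message whose `time` value is `2023-02-17 14:08:19`. Parsing with `strptime` instead of simply replacing the `T` means a malformed timestamp raises an error rather than passing through silently.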