Unexpected ClickHouse datetime results
Question
I'm transferring data from one ClickHouse server to another and have run into a problem: the date filter does not behave as expected, and the time zone does not look correct. For simplicity, let's call the servers server A and server B.
Server A returns the following for the time functions: timezone() = Europe/Moscow, now() = 2023-04-13 10:39:25
Server B returns the same values for these functions.
Neither server appears to return Europe/Moscow time; both return UTC (2023-04-13 10:39:25 is UTC, while the correct Europe/Moscow time is 2023-04-13 13:39:25).
Server A's table has the following time-series column: _timestamp (DateTime64(3))
Server B's table has the following time-series column: _timestamp (DateTime64(3))
I'm transferring the data like this:
INSERT INTO TABLE B (LIST OF COLUMNS)
SELECT
(LIST OF COLUMNS)
from remote('server A', databasenameA.tablenameA, 'user', 'password')
where _timestamp >= '2022-09-01 00:00:00' and _timestamp < '2023-02-01 00:00:00'
In practice, this statement copies rows in the range >= '2022-08-31 21:00:00' and < '2023-01-31 21:00:00'.
I can confirm this by taking the min and max date in the target table after loading:
select min(_timestamp) from TABLE_A  -- returns 2022-08-31 21:00:00
select max(_timestamp) from TABLE_A  -- returns 2023-01-31 20:59:59
Why, even though timezone() correctly returns Europe/Moscow, do I get datetimes that are actually in UTC? And why, given the filter conditions >= '2022-09-01 00:00:00' and < '2023-02-01 00:00:00', does ClickHouse appear to convert these bounds to different values? I'm using DBeaver to run these statements and do not have direct access to the server machine.
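The three-hour shift described above matches the offset between Europe/Moscow and UTC. A minimal sketch of that arithmetic follows, written in Python rather than ClickHouse SQL purely for illustration. It assumes the filter literals are interpreted in the server time zone while the results are displayed in UTC, and it models Europe/Moscow as a fixed UTC+3 offset (a simplification that holds for the dates in the question). The helper name moscow_literal_as_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Europe/Moscow modeled as a fixed UTC+3 offset (no DST in 2022-2023).
MOSCOW = timezone(timedelta(hours=3), "Europe/Moscow")


def moscow_literal_as_utc(literal: str) -> str:
    """Interpret a 'YYYY-MM-DD HH:MM:SS' literal as Moscow time and render it in UTC."""
    local = datetime.strptime(literal, "%Y-%m-%d %H:%M:%S").replace(tzinfo=MOSCOW)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# The two filter bounds from the INSERT ... SELECT statement.
lower = moscow_literal_as_utc("2022-09-01 00:00:00")
upper = moscow_literal_as_utc("2023-02-01 00:00:00")

print(lower)  # 2022-08-31 21:00:00
print(upper)  # 2023-01-31 21:00:00
```

Under these assumptions the rendered bounds coincide with the observed min and max, which would suggest the filter itself selected the intended rows and only the displayed text is shifted.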
Answer 1
Score: 1
When you query the table like this:
select min(_timestamp) from TABLE_A
you get a value that DBeaver has rendered as text. It is not the original stored timestamp but the underlying integer value (UInt32 for DateTime, Int64 for DateTime64), converted to text by the JVM using the current time zone of your desktop.
I suggest using toUnixTimestamp to avoid confusion:
select toUnixTimestamp(min(_timestamp)) as min_timestamp from TABLE_A
Then use that min_timestamp as the filter predicate in the INSERT SELECT.
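To illustrate why a Unix timestamp sidesteps the rendering problem, here is a small sketch in Python (not ClickHouse SQL). It shows one instant producing two different text renderings but a single epoch value. The instant is the min(_timestamp) from the question, assumed here to be a UTC rendering, and Europe/Moscow is again modeled as a fixed UTC+3 offset.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Europe/Moscow modeled as a fixed UTC+3 offset.
MOSCOW = timezone(timedelta(hours=3), "Europe/Moscow")
FMT = "%Y-%m-%d %H:%M:%S"

# Assumption: the min(_timestamp) shown in the question is a UTC rendering.
instant = datetime(2022, 8, 31, 21, 0, 0, tzinfo=timezone.utc)

# The same instant rendered as text in two different time zones.
as_utc_text = instant.strftime(FMT)
as_moscow_text = instant.astimezone(MOSCOW).strftime(FMT)

# The epoch value does not depend on which time zone is used for display.
unix_from_utc = int(instant.timestamp())
unix_from_moscow = int(instant.astimezone(MOSCOW).timestamp())

print(as_utc_text, as_moscow_text)
print(unix_from_utc == unix_from_moscow)
```

Comparing epoch values rather than rendered strings removes the client's display time zone from the picture, which is the point of the toUnixTimestamp suggestion.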