Timescale多维分区

huangapple go评论61阅读模式
英文:

Timescale multidimension partitioning

问题

我创建了一个多维超表,将id作为第二维:

SELECT create_hypertable(
    '<table>',
    '<tstamp>',
    'id',
    1080,
    migrate_data => true,
    chunk_time_interval => INTERVAL '6 hours'
);

我的表中有1080个不同的id。
我的期望是找到1080个块,每个块中包含一个id的所有值。但出于某种原因,我只找到了686个块。当我探索这些块时,我发现一些块中只有1、2或3个id。

我是否有什么理解上的错误?

英文:

I created a multi dimensional hypertable with the id as a 2nd dimension :

SELECT create_hypertable(
    '<table>',
    '<tstamp>',
	'id',
	1080,
	migrate_data => true,
    chunk_time_interval => INTERVAL '6 hours'
);

I have 1080 distinct ids in my table.
My expectation was to find 1080 chunks, each with all the values for a single id in them. But for some reason, I find only 686 chunks. And When I explore the chunks, I see some with 1, 2 or 3 ids.

Is there something I dont understand ?

答案1

得分: 1

这不意味着你将拥有1080个块。你将拥有的块数量将取决于表中的时间戳值。number_partitions表示在特定的时间戳范围内,你最多可以拥有1080个不同的哈希分区范围。哈希分区不保证你将拥有与你指定的数量完全相同的分区。不同的值可能最终会放在同一个分区中。

要检查块的数量并显示示例:

postgres=# insert into testtimescale select x.i%2160 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
   932
(1 row)

postgres=# insert into testtimescale select x.i%1234 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
   932
(1 row)

postgres=# insert into testtimescale select x.i%5000 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
  1068
(1 row)

postgres=# insert into testtimescale select x.i%7000 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
  1077
(1 row)

postgres=# insert into testtimescale select x.i%10000 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,108000) x(i);
INSERT 0 108000
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
  1080
(1 row)

正如你在上面看到的,当新数据加载到id列时,新的块将被创建,但不能超过1080。

英文:

It does not mean you are going to have 1080 chunks. The number of chunks you will have will chance depending on the timestamp values in your table. The number_partitions means that in a specific timestamp range you can have upmost 1080 different hash partition ranges. Hash partition does not guarantee that you will have exactly the same number of partitions with the number you specified. Different values might end up with in the same partition.

In order to check number of chunks and to show example;

postgres=# insert into testtimescale  select x.i%2160 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
   932
(1 row)

postgres=# insert into testtimescale  select x.i%1234 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
   932
(1 row)

postgres=# insert into testtimescale  select x.i%5000 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
  1068
(1 row)

postgres=# insert into testtimescale  select x.i%7000 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,10800) x(i);
INSERT 0 10800
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
  1077
(1 row)

postgres=# insert into testtimescale  select x.i%10000 , '2023-06-18 13:00:00+00'::timestamp, x.i as name from generate_series(1,108000) x(i);
INSERT 0 108000
postgres=# select count(*) from timescaledb_information.chunks where range_start = '2023-06-18 12:00:00+00' and range_end = '2023-06-18 18:00:00+00';
 count
-------
  1080
(1 row)

As you can see above when new data loaded into to id column new chunks will be created, but it cannot exceed 1080.

huangapple
  • 本文由 发表于 2023年6月19日 18:17:05
  • 转载请务必保留本文链接:https://go.coder-hub.com/76505660.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定