英文:
Create a sorted table in Redshift
问题
我想创建一个表,该表将根据一个列自动排序,但将null
值放在最后。
我认为可以使用sortkey
来实现,但在查询表时,它仍然按照填充的顺序返回:
英文:
I would like to create a table that will be automatically sorted by a column, but with null
values last.
I thought i could do it with sortkey
like this, but when querying the table it still comes back just in the order it was filled:
create temp table temp (
id integer,
date date sortkey ,
month integer,
cost integer);
insert into temp values
(1, '2022-09-01', 1, 5),
(3, '2023-01-01', 2, 3),
(1, '2022-09-01', 2, 20),
(2, '2022-11-01', 3, 26),
(1, null, 3, 33),
(1, null, 4, 39),
(1, '2022-09-01', 6, 60),
(1, null, 7, 72),
(2, '2022-11-01', 1, 11),
(2, null, 2, 14),
(2, '2022-11-01', 4, 55),
(2, '2022-11-01', 5, 99),
(3, null, 1, 1),
(1, '2022-09-01', 5, 44),
(3, '2023-01-01', 3, 4);
.
答案1
得分: 1
First off Redshift is a cluster and portions of the table are stored on each “slice” of the cluster. So at some level these slices aren’t “sorted” with respect to each other. You don’t specify the size of the cluster you are using so it is not clear how many slices are in play. When the data comes back it will be in the order in which the slices return their data.
Next, the data stored on Redshift’s disks are only resorted on a VACUUM command. Vacuum rewrites all the blocks of disk storage with blocks that are in sort order. So if you didn’t Vacuum the table no change will happen.
Lastly Redshift will skip the sort phase of vacuum if less than the threshold number of rows are unsorted - 95% by default. So you will need to vacuum “TO 100 PERCENT” to be sure that the entire table is sorted by vacuum.
英文:
First off Redshift is a cluster and portions of the table are stored on each “slice” of the cluster. So at some level these slices aren’t “sorted” with respect to each other. You don’t specify the size of the cluster you are using so it is not clear how many slices are in play. When the data comes back it will be in the order in which the slices return their data.
Next, the data stored on Redshift’s disks are only resorted on a VACUUM command. Vacuum rewrites all the blocks of disk storage with blocks that are in sort order. So if you didn’t Vacuum the table no change will happen.
Lastly Redshift will skip the sort phase of vacuum if less than the threshold number of rows are unsorted - 95% by default. So you will need to vacuum “TO 100 PERCENT” to be sure that the entire table is sorted by vacuum.
答案2
得分: 0
我是一名Redshift专家。
如果您刚开始使用Redshift,这个PDF可能会对您有兴趣;
英文:
I am a Redshift specialist.
If you're just starting with Redshift, this PDF may be of interest;
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论