英文:
Slow indexing in Aurora PostgreSQL (Serverless v2)
问题
我正在尝试为一个拥有10亿行的表建立索引。已经过去了24小时,但查询仍在运行中:CREATE INDEX idx1_table1b ON table1b USING HASH(column1)
。
由于column1经常使用等号(=)进行过滤,我选择了哈希索引作为索引类型。我正在使用的DB实例类型是Serverless V2,ACU min-max: 16-128,PostgreSQL 14.6。
不确定我是否在配置或语句中遗漏了什么,感谢任何帮助!谢谢!
英文:
I'm trying to build an index for a table with 1B of rows. 24 hours has passed and the query is still running:
CREATE INDEX idx1_table1b on table1b using HASH(column1)
.
Since column1 is often filtered with equality operator(=), I've chosen hash indexing to be the index type. The DB instance class I'm using is Serverless V2, ACU min-max:16-128, PostgreSQL 14.6.
Not sure if I missed anything in the configuration or statement, any help is appreciated, Thanks!
答案1
得分: 0
发现该列有大量重复值,这可能是散列停止(或花费很长时间构建散列索引)的原因。
解决我的问题的方法是使用btree(适应重复值很好),并且索引在几分钟内构建完成。在查询中使用索引列执行连接操作的性能在毫秒级别。
英文:
Found out the column has tons of duplicate value, which might be the cause why the hashing halted(or took a long time to build hash-index).
The solution to my problem is to use btree(which accommodates well duplicate values) and the indexed was built in minutes. The performance of using indexed column to perform join in a query is at milli-second performance.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论