2023-07-27 22:23:26

Optimal PostgreSQL query with a compound index on 3 columns, when 2 columns have static values and the third uses the IN operator

Question

There is a table with a compound index on (col1, col2, col3), and the table holds a lot of data.
I want to build a query such as WHERE col1 = 2 AND col2 = 12 AND col3 IN (1, 2, 3, ..., 40).
Is there a way to use the index fully (all 3 columns)?

When I try

SELECT *
FROM table t
WHERE t.col1 = 2
    AND t.col2 = 12
    AND t.col3 IN (1, 2, 3, ..., 40)

the Postgres planner does an index scan on (col1, col2) and then uses a SeqScan to filter, one by one, the 400k matching rows against col3 IN (1, 2, 3, ..., 40).

If I try

SELECT *
FROM table t
WHERE (col1, col2, col3) IN (VALUES (2, 12, 1), (2, 12, 2), (2, 12, 3), ... ,(2, 12, 40))

it gives the error:

> temporary file size exceeds temp_file_limit

So it is slow. Is there a way to make Postgres use the compound index for all 3 columns?

Answer 1

Score: 0

You could try loading the possible col3 values into a real table and then rewriting the query as follows:

SELECT t1.*
FROM yourTable t1
WHERE t1.col1 = 2 AND
      t1.col2 = 12 AND
      EXISTS (
          SELECT 1
          FROM table2 t2
          WHERE t2.col3 = t1.col3
      );

This assumes that table2 has the following structure:

table2:
col3
1
2
3
...
40

yourTable might then be able to use its index on (col1, col2, col3). An index should also be placed on table2 (col3) to ensure rapid lookups.
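The lookup table described above could be set up roughly as follows. This is only a sketch: the integer column type is an assumption, and only the values spelled out in the question are inserted, so the elided ones would still need to be added.

```sql
-- Hypothetical setup for table2; the integer type is an assumption.
CREATE TABLE table2 (
    col3 integer PRIMARY KEY  -- the primary key creates a unique B-tree index on col3
);

-- Only the values written out in the question; add the remaining ones as needed.
INSERT INTO table2 (col3)
VALUES (1), (2), (3), (40);

-- Refresh planner statistics so the small table is estimated accurately.
ANALYZE table2;
```

Declaring col3 as the primary key covers the index recommended above without a separate CREATE INDEX statement.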

Answer 2

Score: 0

Based on your comment, it looks like we can force use of the index through an explicit join on (col1, col2, col3) = (arg1, arg2, arg3).

I don't know how you are calling this query, but if called from a host language that allows passing an int[] type through the database driver, my query would look like this:

with invars as (
  select 2 as c1val, 12 as c2val,
         array[1, 2, 3, 4, 5, 6, 40] as c3vals  
), search_tuples as (
  select i.c1val, i.c2val, u.c3val
    from invars i
         cross join lateral unnest(i.c3vals) as u(c3val)
)
select t.*
  from search_tuples s
       join table1 t 
    on (t.col1, t.col2, t.col3) = (s.c1val, s.c2val, s.c3val);
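
If the values arrive as driver parameters rather than literals, the same join can be sketched as a prepared statement. The statement name find_rows and the integer parameter types are assumptions made for illustration; table1 and the column names follow the query above.

```sql
-- Hypothetical prepared statement: $1 and $2 are the fixed col1/col2 values,
-- $3 is the array of candidate col3 values.
PREPARE find_rows(int, int, int[]) AS
SELECT t.*
  FROM unnest($3) AS u(c3val)
       JOIN table1 t
    ON (t.col1, t.col2, t.col3) = ($1, $2, u.c3val);

-- Example call with the same values as the query above.
EXECUTE find_rows(2, 12, ARRAY[1, 2, 3, 4, 5, 6, 40]);
```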

A working fiddle with random test records and explain
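
To check whether a given form of the query really uses all three index columns, the plans can be compared with EXPLAIN. The sketch below assumes the table is named table1 with integer columns, as in the query above. Note that the ANALYZE option actually executes the statement.

```sql
-- In the output, check whether col3 is listed under "Index Cond"
-- (used for the index lookup) or under "Filter" (applied to fetched rows).
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.*
  FROM table1 t
 WHERE t.col1 = 2
   AND t.col2 = 12
   AND t.col3 = ANY (ARRAY[1, 2, 3, 4, 5, 6, 40]);
```

Running the same EXPLAIN on the join version above gives a side-by-side comparison of timings and buffer usage.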
