优化 nvarchar 列上的多个 LIKE 操作的 SQL 查询

huangapple go评论86阅读模式
英文:

Optimizing SQL query with multiple LIKE operations on nvarchar columns

问题

我正面临一个涉及到nvarchar列上的多个LIKE操作的SQL查询性能问题。该查询旨在从extraOne到extraEight的几乎每一列中搜索子字符串,涉及三个表(除了表A,它只有三个可搜索列)。
如果在这三个表中找到匹配项,我需要从表A检索记录(C -> B -> A)。

表结构如下:

表A
- Id
- ExtraOne
...
- ExtraThree

表B
- Id
- ExtraOne
...
- ExtraEight
- A_Id(指向表A的外键)

表C
- Id
- ExtraOne
...
- ExtraEight
- B_Id(指向表B的外键)

目前,我正在使用以下查询:

SELECT [t1].[id]
FROM
(
    SELECT DISTINCT [t0].[id]
    FROM
    (
        SELECT [b].[id]
        FROM [Table A] AS [b]
        WHERE
        (
            (
                ([b].[id] LIKE '%searchText%')
                OR ([b].[extraone] LIKE '%searchText%')
            )
            OR ([b].[extratwo] LIKE '%searchText%')
        )
        UNION
        SELECT [b0].[id]
        FROM [Table A] AS [b0]
        INNER JOIN [Table B] AS [c] ON [b0].[id] = [c].[A_Id]
        WHERE
        (
            (
                (
                    (
                        (
                            (
                                (
                                    [c].[id] LIKE '%searchText%')
                                    OR ([c].[name] LIKE '%searchText%')
                                )
                                OR ([c].[extraone] LIKE '%searchText%')
                            )
                            OR ([c].[extratwo] LIKE '%searchText%')
                        )
                        OR ([c].[extrathree] LIKE '%searchText%')
                    )
                    OR ([c].[extrafour] LIKE '%searchText%')
                )
                OR ([c].[extrafive] LIKE '%searchText%')
            )
            OR ([c].[extrasix] LIKE '%searchText%')
        )
        UNION
        SELECT [b1].[id]
        FROM [Table A] AS [b1]
        INNER JOIN [Table B] AS [c0] ON [b1].[id] = [c0].[A_id]
        INNER JOIN [Table C] AS [d] ON [c0].[id] = [d].[B_id]
        WHERE
        (
            (
                (
                    (
                        (
                            (
                                (
                                    [d].[id] LIKE '%searchText%')
                                    OR ([d].[name] LIKE '%searchText%')
                                )
                                OR ([d].[extraone] LIKE '%searchText%')
                            )
                            OR ([d].[extratwo] LIKE '%searchText%')
                        )
                        OR ([d].[extrathree] LIKE '%searchText%')
                    )
                    OR ([d].[extrafour] LIKE '%searchText%')
                )
                OR ([d].[extrafive] LIKE '%searchText%')
            )
            OR ([d].[extrasix] LIKE '%searchText%')
        )
    ) AS [t0]
) AS [t1]
ORDER BY [t1].[id]
OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY

我想优化这个查询以提高性能。以下是我有的一些问题:

  • 是否有任何可供应用的替代技术或优化,以改善查询中的LIKE操作的性能?

  • 是否应该考虑使用全文搜索来替代LIKE,在nvarchar列中搜索子字符串?

  • 如何确保在与搜索条件相关的列上创建适当的索引以提高查询性能?

到目前为止,通过使用UNION而不是简单的左连接,我已经实现了性能提升,但仍然需要一些优化。

此外,我注意到当fetch next的行数为100时,速度要快得多(3倍),而当限制为10时则速度较慢。

英文:

I am facing performance issues with an SQL query that involves multiple LIKE operations on nvarchar columns. The query aims to search for a substring in almost every column, from extraOne to extraEight, in three tables (except table A that has only three searchable columns).
And if match is found in any of these tree tables I need to retrieve records from Table A (C -> B -> A).
Tables structure

Table A
- Id
- ExtraOne
...
- ExtraThree
Table B
- Id
- ExtraOne
...
- ExtraEight
- A_Id (FK to table A)
Table C
- Id
- ExtraOne
...
- ExtraEight
- B_Id (FK to table B)

Currently, this is the query I am using:

SELECT [t1].[id]
FROM
(
SELECT DISTINCT [t0].[id]
FROM
(
SELECT [b].[id]
FROM [Table A] AS [b]
WHERE
(
(
([b].[id] LIKE '%searchText%')
OR ([b].[extraone] LIKE '%searchText%')
)
OR ([b].[extratwo] LIKE '%searchText%')
)
UNION
SELECT [b0].[id]
FROM [Table A] AS [b0]
INNER JOIN [Table B] AS [c] ON [b0].[id] = [c].[A_Id]
WHERE
(
(
(
(
(
(
(
(
[c].[id] LIKE '%searchText%'
)
OR ([c].[name] LIKE '%searchText%')
)
OR ([c].[extraone] LIKE '%searchText%')
)
OR ([c].[extratwo] LIKE '%searchText%')
)
OR ([c].[extrathree] LIKE '%searchText%')
)
OR ([c].[extrafour] LIKE '%searchText%')
)
OR ([c].[extrafive] LIKE '%searchText%')
)
OR ([c].[extrasix] LIKE '%searchText%')
)
UNION
SELECT [b1].[id]
FROM [Table A] AS [b1]
INNER JOIN [Table B] AS [c0] ON [b1].[id] = [c0].[A_id]
INNER JOIN [Table C] AS [d] ON [c0].[id] = [d].[B_id]
WHERE
(
(
(
(
(
(
(
(
[d].[id] LIKE '%searchText%'
)
OR ([d].[name] LIKE '%searchText%')
)
OR ([d].[extraone] LIKE '%searchText%')
)
OR ([d].[extratwo] LIKE '%searchText%')
)
OR ([d].[extrathree] LIKE '%searchText%')
)
OR ([d].[extrafour] LIKE '%searchText%')
)
OR ([d].[extrafive] LIKE '%searchText%')
)
OR ([d].[extrasix] LIKE '%searchText%')
)
) AS [t0]
) AS [t1]
ORDER BY [t1].[id]
OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY

I would like to optimize this query for better performance. Here are a few questions I have:

  • Are there any alternative techniques or optimizations I can apply to improve the performance of the LIKE operations in the query?

  • Should I consider using Full-Text Search instead of LIKE for searching substrings in the nvarchar columns?

  • How can I ensure that appropriate indexes are created on the columns involved in the search conditions to improve query performance?

So far I achieved performance gain by using UNION instead of making simply left joins but still needs some optimizations.

Also I have noticed when fetch next rows has number 100 it is much more faster (3x) then when limit is 10.

答案1

得分: 2

请按以下可读的结构重写:

WITH 
t0 AS
(
SELECT b.id
FROM   TableA AS b
WHERE     b.id       LIKE '%searchText%'
       OR b.extraone LIKE '%searchText%'
       OR b.extratwo LIKE '%searchText%'
UNION
SELECT b0.id
FROM   TableA AS b0
       INNER JOIN TableB AS c 
          ON b0.id = c.A_Id
WHERE  c.id LIKE '%searchText%'
       OR (c.name LIKE '%searchText%')
       OR (c.extraone LIKE '%searchText%')
       OR (c.extratwo LIKE '%searchText%')
       OR (c.extrathree LIKE '%searchText%')
       OR (c.extrafour LIKE '%searchText%')
       OR (c.extrafive LIKE '%searchText%')
       OR (c.extrasix LIKE '%searchText%')
UNION
SELECT b1.id
FROM TableA AS b1
      INNER JOIN TableB AS c0 ON b1.id = c0.A_id
      INNER JOIN TableC AS d ON c0.id = d.B_id
WHERE d.id LIKE '%searchText%'
      OR (d.name LIKE '%searchText%')
      OR (d.extraone LIKE '%searchText%')
      OR (d.extratwo LIKE '%searchText%')
      OR (d.extrathree LIKE '%searchText%')
      OR (d.extrafour LIKE '%searchText%')
      OR (d.extrafive LIKE '%searchText%')
      OR (d.extrasix LIKE '%searchText%')
)
SELECT DISTINCT t0.id
FROM   t0
ORDER  BY t0.id OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;

如果您需要性能,不要使用经典索引。您需要使用自己的结构来实现性能优化。

1 - 对于要搜索的每个列,将值拆分为三字母组并存储在另一个表中,该表引用表名、键值和三字母组在字符串值中的第一个字符的位置。

2 - 在此表上创建索引。

3 - 重写您的查询以使用这个新表。

英文:

Please rewrite with a readbale structure like this :

WITH 
t0 AS
(
SELECT b.id
FROM   TableA AS b
WHERE     b.id       LIKE '%searchText%'
OR b.extraone LIKE '%searchText%'
OR b.extratwo LIKE '%searchText%'
UNION
SELECT b0.id
FROM   TableA AS b0
INNER JOIN TableB AS c 
ON b0.id = c.A_Id
WHERE  c.id LIKE '%searchText%'
OR (c.name LIKE '%searchText%')
OR (c.extraone LIKE '%searchText%')
OR (c.extratwo LIKE '%searchText%')
OR (c.extrathree LIKE '%searchText%')
OR (c.extrafour LIKE '%searchText%')
OR (c.extrafive LIKE '%searchText%')
OR (c.extrasix LIKE '%searchText%')
UNION
SELECT b1.id
FROM TableA AS b1
INNER JOIN TableB AS c0 ON b1.id = c0.A_id
INNER JOIN TableC AS d ON c0.id = d.B_id
WHERE d.id LIKE '%searchText%'
OR (d.name LIKE '%searchText%')
OR (d.extraone LIKE '%searchText%')
OR (d.extratwo LIKE '%searchText%')
OR (d.extrathree LIKE '%searchText%')
OR (d.extrafour LIKE '%searchText%')
OR (d.extrafive LIKE '%searchText%')
OR (d.extrasix LIKE '%searchText%')
)
SELECT DISTINCT t0.id
FROM   t0
ORDER  BY t0.id OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;

If you want performances, not classical indexes will be used. You need your own structure to do so...

1 - for each colums to be search split the value into trigrams in another table that references the table name, the key value, the position of first char of the trigram into the string value

2 - create an index on this table

3 - rewrite your query to use this new table

答案2

得分: 0

我不确定性能如何,但将表A按[id]列左连接到其他表应该是有效的,然后将所有可搜索的列连接在一起并执行单个字符串搜索。 CONCAT()会移除NULL值,所以我不确定您查询的数据中是否有NULL,但它会从不匹配的LEFT JOIN生成。类似这样:

SELECT A.[id]
  FROM [Table A] A
    LEFT JOIN [Table B] B ON A.id = B.A_id
    LEFT JOIN [Table C] C ON B.id = C.B_id
 WHERE CONCAT(A.[extraone], '|', A.[extratwo], '|'
             ,B.[name], '|', B.[extraone], '|', B.[extratwo], '|', B.[extrathree], '|'
             ,B.[extrafour], '|', B.[extrafive], '|', B.[extrasix], '|'
             ,C.[name], '|', C.[extraone], '|', C.[extratwo], C.[extrathree], '|'
             ,C.[extrafour], '|', C.[extrafive], '|', C.[extrasix], '|'
             ,C.[extraseven], '|', C.[extraeight], '|'
             ) LIKE '%searchText%'

如果返回了重复的id,则可以编写SELECT DISTINCT A.[id]

左连接可以使用索引来在ID值上与表A进行左连接,但是WHERE子句条件不可搜索。

英文:

I'm not sure about performance but it should be valid to left join table A by [id] column to these other tables, then concatenate all searchable columns together and perform a single string search. CONCAT() removes NULLs so not sure if that's in the data you're querying, but it will be produced from non-matching LEFT JOINs. Something like:

SELECT A.[id]
FROM [Table A] A
LEFT JOIN [Table B] B ON A.id = B.A_id
LEFT JOIN [Table C] C ON B.id = C.B_id
WHERE CONCAT(A.[extraone], '|', A.[extratwo], '|'
,B.[name], '|', B.[extraone], '|', B.[extratwo], '|', B.[extrathree], '|'
,B.[extrafour], '|', B.[extrafive], '|', B.[extrasix], '|'
,C.[name], '|', C.[extraone], '|', C.[extratwo], C.[extrathree], '|'
,C.[extrafour], '|', C.[extrafive], '|', C.[extrasix], '|'
,C.[extraseven], '|', C.[extraeight], '|'
) LIKE '%searchText%'

If there are duplicate id's returned then you could write SELECT DISTINCT A.[id].

The left joins can use an index to left join on ID value to table A, but the where-clause condition is not sargable.

huangapple
  • 本文由 发表于 2023年6月1日 15:49:24
  • 转载请务必保留本文链接:https://go.coder-hub.com/76379730.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定