英文:
Optimizing SQL query with multiple LIKE operations on nvarchar columns
问题
我正面临一个涉及到nvarchar列上的多个LIKE操作的SQL查询性能问题。该查询旨在从extraOne到extraEight的几乎每一列中搜索子字符串,涉及三个表(除了表A,它只有三个可搜索列)。
如果在这三个表中找到匹配项,我需要从表A检索记录(C -> B -> A)。
表结构如下:
表A
- Id
- ExtraOne
...
- ExtraThree
表B
- Id
- ExtraOne
...
- ExtraEight
- A_Id(指向表A的外键)
表C
- Id
- ExtraOne
...
- ExtraEight
- B_Id(指向表B的外键)
目前,我正在使用以下查询:
SELECT [t1].[id]
FROM
(
SELECT DISTINCT [t0].[id]
FROM
(
SELECT [b].[id]
FROM [Table A] AS [b]
WHERE
(
(
([b].[id] LIKE '%searchText%')
OR ([b].[extraone] LIKE '%searchText%')
)
OR ([b].[extratwo] LIKE '%searchText%')
)
UNION
SELECT [b0].[id]
FROM [Table A] AS [b0]
INNER JOIN [Table B] AS [c] ON [b0].[id] = [c].[A_Id]
WHERE
(
(
(
(
(
(
(
[c].[id] LIKE '%searchText%')
OR ([c].[name] LIKE '%searchText%')
)
OR ([c].[extraone] LIKE '%searchText%')
)
OR ([c].[extratwo] LIKE '%searchText%')
)
OR ([c].[extrathree] LIKE '%searchText%')
)
OR ([c].[extrafour] LIKE '%searchText%')
)
OR ([c].[extrafive] LIKE '%searchText%')
)
OR ([c].[extrasix] LIKE '%searchText%')
)
UNION
SELECT [b1].[id]
FROM [Table A] AS [b1]
INNER JOIN [Table B] AS [c0] ON [b1].[id] = [c0].[A_id]
INNER JOIN [Table C] AS [d] ON [c0].[id] = [d].[B_id]
WHERE
(
(
(
(
(
(
(
[d].[id] LIKE '%searchText%')
OR ([d].[name] LIKE '%searchText%')
)
OR ([d].[extraone] LIKE '%searchText%')
)
OR ([d].[extratwo] LIKE '%searchText%')
)
OR ([d].[extrathree] LIKE '%searchText%')
)
OR ([d].[extrafour] LIKE '%searchText%')
)
OR ([d].[extrafive] LIKE '%searchText%')
)
OR ([d].[extrasix] LIKE '%searchText%')
)
) AS [t0]
) AS [t1]
ORDER BY [t1].[id]
OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY
我想优化这个查询以提高性能。以下是我有的一些问题:
-
是否有任何可供应用的替代技术或优化,以改善查询中的LIKE操作的性能?
-
是否应该考虑使用全文搜索来替代LIKE,在nvarchar列中搜索子字符串?
-
如何确保在与搜索条件相关的列上创建适当的索引以提高查询性能?
到目前为止,通过使用UNION而不是简单的左连接,我已经实现了性能提升,但仍然需要一些优化。
此外,我注意到当fetch next的行数为100时,速度要快得多(3倍),而当限制为10时则速度较慢。
英文:
I am facing performance issues with an SQL query that involves multiple LIKE operations on nvarchar columns. The query aims to search for a substring in almost every column, from extraOne to extraEight, in three tables (except table A that has only three searchable columns).
And if match is found in any of these tree tables I need to retrieve records from Table A (C -> B -> A).
Tables structure
Table A
- Id
- ExtraOne
...
- ExtraThree
Table B
- Id
- ExtraOne
...
- ExtraEight
- A_Id (FK to table A)
Table C
- Id
- ExtraOne
...
- ExtraEight
- B_Id (FK to table B)
Currently, this is the query I am using:
SELECT [t1].[id]
FROM
(
SELECT DISTINCT [t0].[id]
FROM
(
SELECT [b].[id]
FROM [Table A] AS [b]
WHERE
(
(
([b].[id] LIKE '%searchText%')
OR ([b].[extraone] LIKE '%searchText%')
)
OR ([b].[extratwo] LIKE '%searchText%')
)
UNION
SELECT [b0].[id]
FROM [Table A] AS [b0]
INNER JOIN [Table B] AS [c] ON [b0].[id] = [c].[A_Id]
WHERE
(
(
(
(
(
(
(
(
[c].[id] LIKE '%searchText%'
)
OR ([c].[name] LIKE '%searchText%')
)
OR ([c].[extraone] LIKE '%searchText%')
)
OR ([c].[extratwo] LIKE '%searchText%')
)
OR ([c].[extrathree] LIKE '%searchText%')
)
OR ([c].[extrafour] LIKE '%searchText%')
)
OR ([c].[extrafive] LIKE '%searchText%')
)
OR ([c].[extrasix] LIKE '%searchText%')
)
UNION
SELECT [b1].[id]
FROM [Table A] AS [b1]
INNER JOIN [Table B] AS [c0] ON [b1].[id] = [c0].[A_id]
INNER JOIN [Table C] AS [d] ON [c0].[id] = [d].[B_id]
WHERE
(
(
(
(
(
(
(
(
[d].[id] LIKE '%searchText%'
)
OR ([d].[name] LIKE '%searchText%')
)
OR ([d].[extraone] LIKE '%searchText%')
)
OR ([d].[extratwo] LIKE '%searchText%')
)
OR ([d].[extrathree] LIKE '%searchText%')
)
OR ([d].[extrafour] LIKE '%searchText%')
)
OR ([d].[extrafive] LIKE '%searchText%')
)
OR ([d].[extrasix] LIKE '%searchText%')
)
) AS [t0]
) AS [t1]
ORDER BY [t1].[id]
OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY
I would like to optimize this query for better performance. Here are a few questions I have:
-
Are there any alternative techniques or optimizations I can apply to improve the performance of the LIKE operations in the query?
-
Should I consider using Full-Text Search instead of LIKE for searching substrings in the nvarchar columns?
-
How can I ensure that appropriate indexes are created on the columns involved in the search conditions to improve query performance?
So far I achieved performance gain by using UNION instead of making simply left joins but still needs some optimizations.
Also I have noticed when fetch next rows has number 100 it is much more faster (3x) then when limit is 10.
答案1
得分: 2
请按以下可读的结构重写:
WITH
t0 AS
(
SELECT b.id
FROM TableA AS b
WHERE b.id LIKE '%searchText%'
OR b.extraone LIKE '%searchText%'
OR b.extratwo LIKE '%searchText%'
UNION
SELECT b0.id
FROM TableA AS b0
INNER JOIN TableB AS c
ON b0.id = c.A_Id
WHERE c.id LIKE '%searchText%'
OR (c.name LIKE '%searchText%')
OR (c.extraone LIKE '%searchText%')
OR (c.extratwo LIKE '%searchText%')
OR (c.extrathree LIKE '%searchText%')
OR (c.extrafour LIKE '%searchText%')
OR (c.extrafive LIKE '%searchText%')
OR (c.extrasix LIKE '%searchText%')
UNION
SELECT b1.id
FROM TableA AS b1
INNER JOIN TableB AS c0 ON b1.id = c0.A_id
INNER JOIN TableC AS d ON c0.id = d.B_id
WHERE d.id LIKE '%searchText%'
OR (d.name LIKE '%searchText%')
OR (d.extraone LIKE '%searchText%')
OR (d.extratwo LIKE '%searchText%')
OR (d.extrathree LIKE '%searchText%')
OR (d.extrafour LIKE '%searchText%')
OR (d.extrafive LIKE '%searchText%')
OR (d.extrasix LIKE '%searchText%')
)
SELECT DISTINCT t0.id
FROM t0
ORDER BY t0.id OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;
如果您需要性能,不要使用经典索引。您需要使用自己的结构来实现性能优化。
1 - 对于要搜索的每个列,将值拆分为三字母组并存储在另一个表中,该表引用表名、键值和三字母组在字符串值中的第一个字符的位置。
2 - 在此表上创建索引。
3 - 重写您的查询以使用这个新表。
英文:
Please rewrite with a readbale structure like this :
WITH
t0 AS
(
SELECT b.id
FROM TableA AS b
WHERE b.id LIKE '%searchText%'
OR b.extraone LIKE '%searchText%'
OR b.extratwo LIKE '%searchText%'
UNION
SELECT b0.id
FROM TableA AS b0
INNER JOIN TableB AS c
ON b0.id = c.A_Id
WHERE c.id LIKE '%searchText%'
OR (c.name LIKE '%searchText%')
OR (c.extraone LIKE '%searchText%')
OR (c.extratwo LIKE '%searchText%')
OR (c.extrathree LIKE '%searchText%')
OR (c.extrafour LIKE '%searchText%')
OR (c.extrafive LIKE '%searchText%')
OR (c.extrasix LIKE '%searchText%')
UNION
SELECT b1.id
FROM TableA AS b1
INNER JOIN TableB AS c0 ON b1.id = c0.A_id
INNER JOIN TableC AS d ON c0.id = d.B_id
WHERE d.id LIKE '%searchText%'
OR (d.name LIKE '%searchText%')
OR (d.extraone LIKE '%searchText%')
OR (d.extratwo LIKE '%searchText%')
OR (d.extrathree LIKE '%searchText%')
OR (d.extrafour LIKE '%searchText%')
OR (d.extrafive LIKE '%searchText%')
OR (d.extrasix LIKE '%searchText%')
)
SELECT DISTINCT t0.id
FROM t0
ORDER BY t0.id OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;
If you want performances, not classical indexes will be used. You need your own structure to do so...
1 - for each colums to be search split the value into trigrams in another table that references the table name, the key value, the position of first char of the trigram into the string value
2 - create an index on this table
3 - rewrite your query to use this new table
答案2
得分: 0
我不确定性能如何,但将表A按[id]列左连接到其他表应该是有效的,然后将所有可搜索的列连接在一起并执行单个字符串搜索。 CONCAT()
会移除NULL值,所以我不确定您查询的数据中是否有NULL,但它会从不匹配的LEFT JOIN生成。类似这样:
SELECT A.[id]
FROM [Table A] A
LEFT JOIN [Table B] B ON A.id = B.A_id
LEFT JOIN [Table C] C ON B.id = C.B_id
WHERE CONCAT(A.[extraone], '|', A.[extratwo], '|'
,B.[name], '|', B.[extraone], '|', B.[extratwo], '|', B.[extrathree], '|'
,B.[extrafour], '|', B.[extrafive], '|', B.[extrasix], '|'
,C.[name], '|', C.[extraone], '|', C.[extratwo], C.[extrathree], '|'
,C.[extrafour], '|', C.[extrafive], '|', C.[extrasix], '|'
,C.[extraseven], '|', C.[extraeight], '|'
) LIKE '%searchText%'
如果返回了重复的id,则可以编写SELECT DISTINCT A.[id]
。
左连接可以使用索引来在ID值上与表A进行左连接,但是WHERE子句条件不可搜索。
英文:
I'm not sure about performance but it should be valid to left join table A by [id] column to these other tables, then concatenate all searchable columns together and perform a single string search. CONCAT()
removes NULLs so not sure if that's in the data you're querying, but it will be produced from non-matching LEFT JOINs. Something like:
SELECT A.[id]
FROM [Table A] A
LEFT JOIN [Table B] B ON A.id = B.A_id
LEFT JOIN [Table C] C ON B.id = C.B_id
WHERE CONCAT(A.[extraone], '|', A.[extratwo], '|'
,B.[name], '|', B.[extraone], '|', B.[extratwo], '|', B.[extrathree], '|'
,B.[extrafour], '|', B.[extrafive], '|', B.[extrasix], '|'
,C.[name], '|', C.[extraone], '|', C.[extratwo], C.[extrathree], '|'
,C.[extrafour], '|', C.[extrafive], '|', C.[extrasix], '|'
,C.[extraseven], '|', C.[extraeight], '|'
) LIKE '%searchText%'
If there are duplicate id's returned then you could write SELECT DISTINCT A.[id]
.
The left joins can use an index to left join on ID value to table A, but the where-clause condition is not sargable.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论