如何优化查询以返回每种类型的最高得分结果

huangapple go评论55阅读模式
英文:

How to optimize query to return the highest-scoring results for each type

问题

以下是您的查询的翻译,代码部分已被省略:

(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'address' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 4)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'mls_address' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 4)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'state' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'county' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'city' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'zip_code' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 15)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'high_school' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id, type, place_id, name, full_name, score FROM property_place_metas WHERE type = 'middle_junior_school' AND name RLIKE '^" . $_GET['keyword'] . "' AND status = 'active' ORDER BY score DESC LIMIT 5)

这是您的查询的翻译,没有其他内容。

英文:

Here is my query

(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'address' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 4)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'mls_address' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 4)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'state' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'county' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'city' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'zip_code' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 15)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'high_school' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'middle_junior_school' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)

Its working for me at the start, but after a while it became slow, and i want to optimize it with the same data model returned, like i want to select the whole data with a single query without using the UNION method

i cant do WHERE type In array, but it gives me like 160 rows for type high_school, and i want to select only 5 rows with the highest score

my table got like 800k records, so i want to optimise it to the utmost level

also here is SHOW CREATE TABLE property_place_metas; result

 CREATE TABLE `property_place_metas` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`status` enum('active','suspended') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`type` enum('address','mls_address','state','county','city','zip_code','high_school','middle_junior_school') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`place_id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`place_geo_id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`full_name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`score` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `property_place_metas_status_index` (`status`),
KEY `property_place_metas_type_index` (`type`),
KEY `property_place_metas_place_id_index` (`place_id`),
KEY `property_place_metas_place_geo_id_index` (`place_geo_id`),
KEY `property_place_metas_name_index` (`name`),
KEY `property_place_metas_full_name_index` (`full_name`),
KEY `property_place_metas_score_index` (`score`)
) ENGINE=InnoDB AUTO_INCREMENT=1406600 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

this query is used for auto complete in search bar

答案1

得分: 1

索引优化

根据给定的标准,需要优化表索引,因为单独的索引没有被使用。例如,WHERE type and name and status 可以从多列索引中受益。

CREATE INDEX type_status_name ON property_place_metas(`type`, `status`, `name`)

索引中列的顺序影响了标准和性能。例如,(type, status, name) 仍然适用于 WHERE typeWHERE type AND status,但不适用于 WHERE statusWHERE nameWHERE name AND status

因此,使用 type_status_name(type, status, name) 索引,可以删除 property_place_metas_type_index(type) 索引,因为它将变得多余。


如果您使用的是 MySQL 8.0,并且为标准定义了适当的索引,优化查询的一种方法是使用带有 ROW_NUMBER() 窗口函数 的 CTE(公共表达式)。

CTE

WITH cte(id, type, place_id, name, full_name, score, rn) AS (
    SELECT 
      id, type, place_id, name, full_name, score, 
      ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE `type` IN('address', 'mls_address', 'state', 'county', 'city', 'zip_code', 'high_school', 'middle_junior_school')
    AND `status` = 'active'
    AND `name` RLIKE '^" . $_GET['keyword'] . "'
)

CTE 将生成一个按分数降序排序的所需 type 列分组的单一结果集,然后可以用于检索前 n 个值。

简单查询 - 后续解析

为了查询简单,您可以检索每个分组的记录的最高限制,例如每个分组的前 15 条记录,然后遍历结果并按 type 进行解析,限制视图或 DTO 中的结果。

然而,由于没有为每个特定的 type 值使用 UNION,原始的排序和不同的限制将无法在查询中保留。

示例:db<>fiddle

WITH cte(id, type, place_id, name, full_name, score, rn) AS (
    SELECT 
      id, type, place_id, name, full_name, score, 
      ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE `type` IN('address', 'mls_address', 'state', 'county', 'city', 'zip_code', 'high_school', 'middle_junior_school')
    AND `status` = 'active'
    AND `name` RLIKE '^" . $_GET['keyword'] . "'
)
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score
FROM cte
WHERE cte.rn <= 15

保留 type 列的顺序和限制

如果需要相同的 type 列的顺序和限制,仍然需要使用 UNION 来保留顺序和不同的限制,但由于 CTE 查询仅运行一次,其余由内部查询优化器处理,因此不会有相同的性能问题。

WITH cte(id, type, place_id, name, full_name, score, rn) AS (
    SELECT 
      id, type, place_id, name, full_name, score, 
      ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE `type` IN('address', 'mls_address', 'state', 'county', 'city', 'zip_code', 'high_school', 'middle_junior_school')
    AND `status` = 'active'
    AND `name` RLIKE '^" . $_GET['keyword'] . "'
)
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 4 AND cte.type = 'address'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 4 AND cte.type = 'mls_address'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'state'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'county'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'city'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 15 AND cte.type = 'zip_code'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'high_school'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'middle_junior_school';

结束语

为了进一步优化应用程序性能,建议在 MySQL 中创建一个事件,定期在特定间隔运行,生成每个关键字/类型分组的前限制排名,并将它们插入到一个专用表中,可以从中查询。这将显著减少需要在运行时解析的记录数量,并在后续调用时充当伪缓存。

英文:

Index Optimization

With the given criteria being used the table indexes need to be optimized, as the individual indexes are not being utilized. For example WHERE type and name and status would benefit from a multi-column index.

CREATE INDEX type_status_name ON property_place_metas(`type`, `status`, `name`)

The order of the columns in the index plays a role on the criteria and their performance. eg (type, status, name) would still work with WHERE type, WHERE type AND status but not WHERE status, WHERE name, or WHERE name AND status.

So with a type_status_name(type, status, name) index, the property_place_metas_type_index(type) index can be removed as it would become redundant.


Provided you are using MySQL 8.0 and have appropriate indexes defined for the criteria, one approach to optimize the query is to utilize a CTE with the ROW_NUMBER() Window function.

CTE

WITH cte(id, type, place_id, name, full_name, score, rn) AS (
SELECT 
id, type, place_id, name, full_name, score, 
ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
FROM property_place_metas
WHERE `type` IN(&#39;address&#39;, &#39;mls_address&#39;, &#39;state&#39;, &#39;county&#39;, &#39;city&#39;, &#39;zip_code&#39;, &#39;high_school&#39;, &#39;middle_junior_school&#39;)
AND `status` = &#39;active&#39;
AND `name` RLIKE &#39;^&quot;. $_GET[&#39;keyword&#39;] .&quot;&#39;
)

The CTE will produce a singular result set with the desired type column groupings ordered by score, that can then be used to retrieve the top n values from.

Simple Query - Parse later

For query simplicity, you can retrieve the highest limit of records of the grouping, such as the top 15 of each grouping, then iterate over the results and parse by type, limiting the results in your view or DTO. <br>
However, the original ordering and varying limits by the type column will not be able to be retained in the query since a UNION is not being used for each specific type value.

Example: db<>fiddle

WITH cte(id, type, place_id, name, full_name, score, rn) AS (
SELECT 
id, type, place_id, name, full_name, score, 
ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
FROM property_place_metas
WHERE `type` IN(&#39;address&#39;, &#39;mls_address&#39;, &#39;state&#39;, &#39;county&#39;, &#39;city&#39;, &#39;zip_code&#39;, &#39;high_school&#39;, &#39;middle_junior_school&#39;)
AND `status` = &#39;active&#39;
AND `name` RLIKE &#39;^&quot;. $_GET[&#39;keyword&#39;] .&quot;&#39;
)
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score
FROM cte
WHERE cte.rn &lt;= 15

Preserve Order and Limit by type column

If the same order and limit by the type column is desired, a UNION would still need to be used to preserve the order and varying limits but will not have the same performance issues, since the CTE query is only being run a single time and the rest is handled by the internal query optimizer.

WITH cte(id, type, place_id, name, full_name, score, rn) AS (
SELECT 
id, type, place_id, name, full_name, score, 
ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
FROM property_place_metas
WHERE `type` IN(&#39;address&#39;, &#39;mls_address&#39;, &#39;state&#39;, &#39;county&#39;, &#39;city&#39;, &#39;zip_code&#39;, &#39;high_school&#39;, &#39;middle_junior_school&#39;)
AND `status` = &#39;active&#39;
AND `name` RLIKE &#39;^&quot;. $_GET[&#39;keyword&#39;] .&quot;&#39;
)
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 4 AND cte.type = &#39;address&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 4 AND cte.type = &#39;mls_address&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 5 AND cte.type = &#39;state&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 5 AND cte.type = &#39;county&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 5 AND cte.type = &#39;city&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 15 AND cte.type = &#39;zip_code&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 5 AND cte.type = &#39;high_school&#39;
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn &lt;= 5 AND cte.type = &#39;middle_junior_school&#39;;

Closing Notes

To optimize the application performance further, I suggest creating an event in MySQL that runs at specific intervals to generate the top limit of rankings for each keyword/type groupings and inserts them into a dedicated table that can be queried from. This will significantly reduce the number of records that need to be parsed at runtime and act as a pseudo cache upon subsequent calls.

huangapple
  • 本文由 发表于 2023年7月11日 00:42:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76655755.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定