How to optimize query to return the highest-scoring results for each type
Question
Here is my query:
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'address' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 4)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'mls_address' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 4)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'state' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'county' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'city' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'zip_code' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 15)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'high_school' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
UNION
(SELECT id,type,place_id,name,full_name,score FROM property_place_metas WHERE type = 'middle_junior_school' AND name RLIKE '^".$_GET['keyword']."' AND status = 'active' ORDER BY score DESC LIMIT 5)
It worked well at first, but it has become slow over time. I want to optimize it while returning the same data model, ideally selecting all the data with a single query instead of using UNION.
I can use WHERE type IN (array), but that gives me around 160 rows for type high_school, and I only want the 5 rows with the highest score.
My table has about 800k records, so I want to optimize this as much as possible.
Here is the result of SHOW CREATE TABLE property_place_metas;
CREATE TABLE `property_place_metas` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `status` enum('active','suspended') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `type` enum('address','mls_address','state','county','city','zip_code','high_school','middle_junior_school') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `place_id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `place_geo_id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `full_name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `score` int NOT NULL DEFAULT '0',
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `property_place_metas_status_index` (`status`),
  KEY `property_place_metas_type_index` (`type`),
  KEY `property_place_metas_place_id_index` (`place_id`),
  KEY `property_place_metas_place_geo_id_index` (`place_geo_id`),
  KEY `property_place_metas_name_index` (`name`),
  KEY `property_place_metas_full_name_index` (`full_name`),
  KEY `property_place_metas_score_index` (`score`)
) ENGINE=InnoDB AUTO_INCREMENT=1406600 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
This query is used for autocomplete in a search bar.
Answer 1
Score: 1
Index Optimization
With the given criteria, the table indexes need to be optimized, because the individual single-column indexes cannot serve the combined filter efficiently. For example, a query that filters on type, name, and status together would benefit from a multi-column index.
CREATE INDEX type_status_name ON property_place_metas(`type`, `status`, `name`)
The order of the columns in the index determines which criteria can use it. For example, an index on (type, status, name) can still serve WHERE type and WHERE type AND status, but not WHERE status, WHERE name, or WHERE name AND status, because those filters skip the leading type column.
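The column-order rule described above can be observed with a small experiment. The following is only a minimal sketch, not the original setup: it assumes an in-memory SQLite database as a stand-in for MySQL, a cut-down hypothetical `places` table, and a helper named `plan` introduced purely for illustration. SQLite's planner is not MySQL's, so on the real table the check to trust is MySQL's own EXPLAIN.

```python
import sqlite3

# In-memory SQLite stand-in (assumption: the real table lives in MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE places ("
    " id INTEGER PRIMARY KEY, type TEXT, status TEXT, name TEXT, score INTEGER)"
)
# Same column order as the suggested type_status_name index.
conn.execute("CREATE INDEX type_status_name ON places(type, status, name)")


def plan(where_clause):
    """Return SQLite's query-plan text for a SELECT with the given WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM places WHERE " + where_clause
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


for clause in (
    "type = 'city'",
    "type = 'city' AND status = 'active'",
    "status = 'active'",
    "name = 'Austin'",
):
    print(f"{clause:40} -> {plan(clause)}")
```

In this sketch the filters that start with type should be reported as an index search on type_status_name, while the filters that skip type should fall back to a full table scan.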
So with a type_status_name(type, status, name) index in place, the property_place_metas_type_index(type) index becomes redundant and can be removed.
Provided you are using MySQL 8.0 and have appropriate indexes defined for the criteria, one approach to optimizing the query is to use a CTE with the ROW_NUMBER() window function.
CTE
WITH cte(id, type, place_id, name, full_name, score, rn) AS (
    SELECT
        id, type, place_id, name, full_name, score,
        ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE `type` IN('address', 'mls_address', 'state', 'county', 'city', 'zip_code', 'high_school', 'middle_junior_school')
        AND `status` = 'active'
        AND `name` RLIKE '^". $_GET['keyword'] ."'
)
The CTE produces a single result set in which the rows of each type are numbered by descending score, so the top n rows per type can then be selected from it.
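One caveat about the name filter in the CTE: in MySQL a regular-expression match such as RLIKE '^keyword' cannot use the name part of an index, whereas a prefix match with LIKE 'keyword%' can. In addition, concatenating $_GET['keyword'] into the SQL string leaves the query open to SQL injection. The sketch below shows the idea of binding the keyword as a parameter to a prefix LIKE. It is written in Python against an in-memory SQLite table as a stand-in (an assumption, since the original uses PHP and MySQL), the table is cut down to three columns with made-up rows, and `like_prefix` and `search` are hypothetical helper names.

```python
import sqlite3


def like_prefix(keyword):
    """Escape LIKE wildcards in user input and append % for a prefix match."""
    escaped = (
        keyword.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    return escaped + "%"


def search(conn, keyword):
    """Return active names starting with keyword, highest score first."""
    # The keyword is passed as a bound parameter, never concatenated into SQL.
    sql = (
        "SELECT name FROM property_place_metas "
        "WHERE status = 'active' AND name LIKE ? ESCAPE '\\' "
        "ORDER BY score DESC"
    )
    return [row[0] for row in conn.execute(sql, (like_prefix(keyword),))]


# Cut-down stand-in table with made-up rows (assumption, not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE property_place_metas (name TEXT, status TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO property_place_metas VALUES (?, ?, ?)",
    [
        ("Austin", "active", 10),
        ("Aurora", "active", 20),
        ("Boston", "active", 30),
        ("Augusta", "suspended", 40),
    ],
)
print(search(conn, "Au"))
```

Note that a prefix LIKE treats the keyword as literal text, so it is not a drop-in replacement if users are expected to type regular expressions.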
Simple Query - Parse later
For query simplicity, you can retrieve the highest limit needed by any grouping, such as the top 15 of each type, then iterate over the results and parse them by type, limiting the results in your view or DTO.
However, the original ordering and the varying limits per type cannot be retained in the query itself, since a UNION is not being used for each specific type value.
Example: db<>fiddle
WITH cte(id, type, place_id, name, full_name, score, rn) AS (
    SELECT
        id, type, place_id, name, full_name, score,
        ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE `type` IN('address', 'mls_address', 'state', 'county', 'city', 'zip_code', 'high_school', 'middle_junior_school')
        AND `status` = 'active'
        AND `name` RLIKE '^". $_GET['keyword'] ."'
)
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score
FROM cte
WHERE cte.rn <= 15
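The "parse later" step could look roughly like the following. This is a sketch in Python rather than the PHP of the original application (an assumption made to keep the example self-contained), and `TYPE_LIMITS` and `limit_by_type` are hypothetical names. The limits themselves are the ones from the original UNION query.

```python
from collections import defaultdict

# Per-type limits taken from the original UNION query, in its type order.
TYPE_LIMITS = {
    "address": 4,
    "mls_address": 4,
    "state": 5,
    "county": 5,
    "city": 5,
    "zip_code": 15,
    "high_school": 5,
    "middle_junior_school": 5,
}


def limit_by_type(rows, limits=TYPE_LIMITS):
    """Group rows by type and keep the top-scoring rows up to each type's limit.

    rows: iterable of dicts with at least "type" and "score" keys.
    Returns a dict of type -> list of rows, in the key order of `limits`.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["type"]].append(row)

    result = {}
    for type_name, limit in limits.items():
        ranked = sorted(grouped[type_name], key=lambda r: r["score"], reverse=True)
        result[type_name] = ranked[:limit]
    return result


# Made-up rows standing in for the query result.
rows = [{"type": "state", "name": f"S{i}", "score": i} for i in range(8)]
rows += [{"type": "address", "name": f"A{i}", "score": i} for i in range(6)]
top = limit_by_type(rows)
print(len(top["state"]), len(top["address"]))
```

Because the dictionary keeps the key order of `TYPE_LIMITS`, iterating over the result also restores the type order of the original query.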
Preserve Order and Limit by type column
If the same order and limits by the type column are desired, a UNION over the CTE can be used to preserve the order and the varying limits. It will not have the same performance issues, since the CTE query is only run a single time and the rest is handled by the internal query optimizer.
WITH cte(id, type, place_id, name, full_name, score, rn) AS (
    SELECT
        id, type, place_id, name, full_name, score,
        ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE `type` IN('address', 'mls_address', 'state', 'county', 'city', 'zip_code', 'high_school', 'middle_junior_school')
        AND `status` = 'active'
        AND `name` RLIKE '^". $_GET['keyword'] ."'
)
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 4 AND cte.type = 'address'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 4 AND cte.type = 'mls_address'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'state'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'county'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'city'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 15 AND cte.type = 'zip_code'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'high_school'
UNION
SELECT cte.id, cte.type, cte.place_id, cte.name, cte.full_name, cte.score FROM cte WHERE cte.rn <= 5 AND cte.type = 'middle_junior_school';
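If avoiding UNION altogether is the goal, the per-type limits can also be expressed as a CASE expression over the row number. The following is a sketch under assumptions rather than a tested MySQL solution: it runs against an in-memory SQLite database (3.25 or later, for window functions) with a cut-down table and made-up rows, it uses a bound prefix LIKE instead of the concatenated RLIKE, and it only spells out three of the eight types. CASE and ROW_NUMBER() are both available in MySQL 8.0 as well.

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT id, type, name, score,
           ROW_NUMBER() OVER (PARTITION BY type ORDER BY score DESC) AS rn
    FROM property_place_metas
    WHERE status = 'active' AND name LIKE ?
)
SELECT id, type, name, score
FROM cte
WHERE rn <= CASE type          -- per-type limit, as in the UNION branches
              WHEN 'address' THEN 4
              WHEN 'zip_code' THEN 15
              ELSE 5
            END
ORDER BY CASE type             -- keep the type order of the original query
           WHEN 'address' THEN 1
           WHEN 'state' THEN 2
           WHEN 'zip_code' THEN 3
           ELSE 4
         END,
         score DESC
"""

# Cut-down stand-in table (assumption: the real table lives in MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE property_place_metas ("
    " id INTEGER PRIMARY KEY, type TEXT, status TEXT, name TEXT, score INTEGER)"
)
# Made-up sample rows: 6 addresses, 7 states, 3 zip codes, all matching 'a%'.
sample = (
    [("address", "active", f"a-street {i}", i) for i in range(6)]
    + [("state", "active", f"a-state {i}", i) for i in range(7)]
    + [("zip_code", "active", f"a{i}", i) for i in range(3)]
)
conn.executemany(
    "INSERT INTO property_place_metas (type, status, name, score)"
    " VALUES (?, ?, ?, ?)",
    sample,
)

rows = conn.execute(QUERY, ("a%",)).fetchall()
for row in rows:
    print(row)
```

With this shape the table is read once, and the limits and the ordering live in one place, so adding a type means adding one WHEN branch to each CASE instead of another UNION branch.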
Closing Notes
To optimize application performance further, I suggest creating an event in MySQL that runs at set intervals, generates the top rankings up to the limit for each keyword/type grouping, and inserts them into a dedicated table that can be queried directly. This will significantly reduce the number of records that need to be parsed at runtime and act as a pseudo cache for subsequent calls.