Retrieve latest update time for male and female members with low selectivity?
Question
This is SQL Server, and I want to find the latest update time for male and female members. What is the most efficient index to create for this, given such low selectivity? Thanks for all your help.
SELECT sex, COUNT(1) FROM membership GROUP BY sex
Produces:
| sex | COUNT(1) |
| --- | --- |
| M | 887409 |
| F | 679003 |
And
SELECT sex, MAX(UpdateTime) FROM membership GROUP BY sex
Produces:
| sex | MAX(UpdateTime) |
| --- | --- |
| M | 2023-05-29 18:30:09 |
| F | 2023-05-27 12:23:55 |
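Answer 1
Score: 2
The first query can be supported by a single-column index on (sex) alone. Both queries, however, can be supported by either of the following indexes: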
INDEX IX NONCLUSTERED (sex ASC, UpdateTime DESC)
-- alternatively
INDEX IX CLUSTERED (sex ASC, UpdateTime DESC)
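The definitions above use the inline form that appears inside a `CREATE TABLE` statement. For a table that already exists, a standalone statement along these lines should be equivalent. The index name `IX_membership_sex_UpdateTime` and the `dbo` schema are assumptions made for illustration:

```sql
-- Hypothetical index name; assumes the table lives in the dbo schema.
-- sex is the leading key so that each value forms one contiguous range,
-- and UpdateTime DESC puts the latest row first within that range.
CREATE NONCLUSTERED INDEX IX_membership_sex_UpdateTime
    ON dbo.membership (sex ASC, UpdateTime DESC);
```

Since both columns are index keys, this index contains every column that the two queries in the question read.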
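With such an index in place, the second query can be improved by running a separate TOP (1) query for each value, so that each value can be answered by a seek to the start of its range in the index rather than a scan of every row: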
SELECT
m.*
FROM (VALUES
('M'),
('F')
) v(sex)
CROSS APPLY (
SELECT TOP (1)
m.sex,
m.UpdateTime
FROM membership m
WHERE m.sex = v.sex
ORDER BY UpdateTime DESC
) m;
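One way to check whether the rewrite actually helps on your data is to compare the logical reads reported for both forms. The following is a minimal sketch, assuming the `membership` table and the composite index described above already exist:

```sql
-- Report logical reads per statement in the Messages output.
SET STATISTICS IO ON;

-- Form 1: plain aggregate over the whole table or index.
SELECT sex, MAX(UpdateTime) AS LatestUpdate
FROM membership
GROUP BY sex;

-- Form 2: one TOP (1) lookup per known value.
SELECT m.sex, m.UpdateTime
FROM (VALUES ('M'), ('F')) v(sex)
CROSS APPLY (
    SELECT TOP (1) m.sex, m.UpdateTime
    FROM membership m
    WHERE m.sex = v.sex
    ORDER BY m.UpdateTime DESC
) m;

SET STATISTICS IO OFF;
```

If the index is being used as intended, the second form would be expected to report far fewer logical reads than the first.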
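If you do not know the column's exact values in advance but still want a similar skip-scan technique, you can use a recursive CTE together with a function: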
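CREATE FUNCTION dbo.GetTopValue (@sexPrevious char(1))
RETURNS TABLE
AS RETURN
SELECT TOP (1)
sex,
UpdateTime
FROM membership
WHERE sex > @sexPrevious -- skip every value already returned
ORDER BY sex ASC, UpdateTime DESC; -- next value first, then its latest row
Then run the query as follows: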
WITH cte AS (
SELECT TOP (1)
sex,
UpdateTime
FROM membership
ORDER BY sex ASC, UpdateTime DESC
UNION ALL
SELECT
t.sex,
t.UpdateTime
FROM cte
CROSS APPLY dbo.GetTopValue(cte.sex) t
)
SELECT *
FROM cte;
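Each distinct value costs one level of recursion, and SQL Server stops a recursive CTE after 100 levels by default. If the column might hold more distinct values than that, the limit can be raised with a query hint. This is a sketch that assumes the `dbo.GetTopValue` function above has already been created:

```sql
WITH cte AS (
    -- Anchor: the lowest value and its latest row.
    SELECT TOP (1)
        sex,
        UpdateTime
    FROM membership
    ORDER BY sex ASC, UpdateTime DESC
    UNION ALL
    -- Recursive step: jump to the next distinct value.
    SELECT
        t.sex,
        t.UpdateTime
    FROM cte
    CROSS APPLY dbo.GetTopValue(cte.sex) t
)
SELECT sex, UpdateTime
FROM cte
OPTION (MAXRECURSION 0); -- 0 removes the default limit of 100 levels
```

The recursion ends on its own once the function returns no row, which happens when no larger value remains.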