优化递归CTE以计算总数的性能。

huangapple go评论108阅读模式
英文:

Improving Performance of Recursive CTE for calculating totals

问题

我有一个SQL Server数据库,其中包含一个名为Employee的表,其结构如下:

CREATE TABLE Employee 
(
    EmployeeID INT PRIMARY KEY,
    ManagerID INT,
    EmployeeName NVARCHAR(50),
    Salary DECIMAL(18, 2)
);

我正在使用递归的公共表达式(CTE)来计算每个员工及其下属的总薪水。然而,当有大量员工时,查询性能变得缓慢。

WITH RecursiveSalaryCTE AS 
(
    SELECT 
        EmployeeID, ManagerID, EmployeeName, Salary
    FROM 
        Employee
    WHERE 
        ManagerID IS NULL

    UNION ALL

    SELECT 
        e.EmployeeID, e.ManagerID, e.EmployeeName, e.Salary
    FROM 
        Employee e
    JOIN 
        RecursiveSalaryCTE r ON e.ManagerID = r.EmployeeID
)
SELECT 
    EmployeeID, EmployeeName, Salary,
    (SELECT SUM(Salary) FROM RecursiveSalaryCTE 
     WHERE ManagerID = e.EmployeeID) AS TotalSubordinateSalaries
FROM 
    RecursiveSalaryCTE e;

是否有更好的方法来优化这个查询?任何提供更快结果的替代方法或修改查询的建议都将不胜感激。

Employee表的示例数据:

INSERT INTO Employee (EmployeeID, ManagerID, EmployeeName, Salary)
VALUES (1, NULL, 'John', 10000.00),
       (2, 1, 'Alice', 7500.00),
       (3, 1, 'Bob', 8000.00),
       -- ... 更多数据 ...
       (1000, 999, 'Eve', 6000.00);
英文:

I have a SQL Server database with an Employee table containing the following structure:

CREATE TABLE Employee 
(
    EmployeeID INT PRIMARY KEY,
    ManagerID INT,
    EmployeeName NVARCHAR(50),
    Salary DECIMAL(18, 2)
);

I'm using a recursive Common Table Expression (CTE) to calculate the total salaries of each employee and their subordinates. However, with a large number of employees, the query performance becomes sluggish.

WITH RecursiveSalaryCTE AS 
(
    SELECT 
        EmployeeID, ManagerID, EmployeeName, Salary
    FROM 
        Employee
    WHERE 
        ManagerID IS NULL

    UNION ALL

    SELECT 
        e.EmployeeID, e.ManagerID, e.EmployeeName, e.Salary
    FROM 
        Employee e
    JOIN 
        RecursiveSalaryCTE r ON e.ManagerID = r.EmployeeID
)
SELECT 
    EmployeeID, EmployeeName, Salary,
    (SELECT SUM(Salary) FROM RecursiveSalaryCTE 
     WHERE ManagerID = e.EmployeeID) AS TotalSubordinateSalaries
FROM 
    RecursiveSalaryCTE e;

Is there a better way to optimize this query? Any alternative approaches or modifications to the query that could provide faster results would greatly appreciated.

Example data for the Employee table:

INSERT INTO Employee (EmployeeID, ManagerID, EmployeeName, Salary)
VALUES (1, NULL, 'John', 10000.00),
       (2, 1, 'Alice', 7500.00),
       (3, 1, 'Bob', 8000.00),
       -- ... more data ...
       (1000, 999, 'Eve', 6000.00);

答案1

得分: 1

避免在最终的SELECT语句中使用子查询应该能提升性能,如果您实际上希望得到所有依赖于任何深度级别的经理的人员总数,您必须按通用路径表达式分组(因此您必须在递归CTE中收集路径)...按managerID分组只会给您直接下属的总数。

with RecursiveSalaryCTE(EmployeeID, managerid, salary, path) as (
    SELECT EmployeeID, managerid, Salary, cast(concat('/', EmployeeID) as varchar(1000))
    FROM employee
    WHERE managerid IS NULL
    
    UNION ALL
    
    SELECT e.EmployeeID, e.managerid, e.salary, cast(concat(r.path, '/', e.EmployeeID) as varchar(1000))
    FROM employee e
    JOIN RecursiveSalaryCTE r ON e.managerid = r.EmployeeID

)
select EmployeeID, managerid, salary, Total_all_sub, 
    sum(salary) over(partition by managerid) as total_same_manager
from (
    select r.EmployeeID, r.managerid, r.salary,
        SUM(r1.Salary) Total_all_sub
    from RecursiveSalaryCTE r
    join RecursiveSalaryCTE r1 on substring(r1.path,1,len(r.path)) = r.path 
    group by r.EmployeeID, r.managerid, r.salary
) d
order by employeeid
;

链接:https://dbfiddle.uk/O1pMAy3p

英文:

Avoid using a sub-query as part of the final SELECT should improve performance and if you actually want the total of all people depending of a manager at any depth level you have to group by common path expression (thus you have to collect the path in the recursive CTE) ..., grouping by managerID gives you the total of direct subordinates only.

with RecursiveSalaryCTE(EmployeeID, managerid, salary, path) as (
    SELECT EmployeeID, managerid, Salary, cast(concat('/', EmployeeID) as varchar(1000))
    FROM employee
    WHERE managerid IS NULL
    
    UNION ALL
    
    SELECT e.EmployeeID, e.managerid, e.salary, cast(concat(r.path, '/', e.EmployeeID) as varchar(1000))
    FROM employee e
    JOIN RecursiveSalaryCTE r ON e.managerid = r.EmployeeID

)
select EmployeeID, managerid, salary, Total_all_sub, 
    sum(salary) over(partition by managerid) as total_same_manager
from (
    select r.EmployeeID, r.managerid, r.salary,
        SUM(r1.Salary) Total_all_sub
    from RecursiveSalaryCTE r
    join RecursiveSalaryCTE r1 on substring(r1.path,1,len(r.path)) = r.path 
    group by r.EmployeeID, r.managerid, r.salary
) d
order by employeeid
;

https://dbfiddle.uk/O1pMAy3p

huangapple
  • 本文由 发表于 2023年8月11日 00:20:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/76877611.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定