2023年3月7日 23:46:04go评论71阅读模式

英文:

Return 1 row from multiple joins to same table - SQL Server

问题

我有这些表格:

CREATE TABLE staff
(
    [case_id] int, 
    [staff] varchar(11), 
    [stafftype] varchar(10)
);

INSERT INTO staff
    ([case_id], [staff], [stafftype])
VALUES
    (1, 'Daffy, Duck', 'Primary'),
    (1, 'Bugs, Bunny', 'Additional'),
    (1, 'Elmer, Fudd', 'Additional'),
    (2, 'Daffy, Duck', 'Primary'),
    (2, 'Bugs, Bunny', 'Additional');

CREATE TABLE cases
(
    [case_id] int, 
    [casedate] datetime, 
    [caselocation] varchar(4)
);

INSERT INTO cases
    ([case_id], [casedate], [caselocation])
VALUES
    (1, '2023-01-01 00:00:00', 'Home'),
    (2, '2023-01-03 00:00:00', 'Away');

我想要为每个案例返回一行，每个案例只能有1个 Primary stafftype 和最多2个 Additional stafftypes。

例如，对于 case_id = 1 的结果集:

case_id	caselocation	PrimaryStaff	AdditionalStaff1	AdditionalStaff2
1	Home	Daffy, Duck	Bugs, Bunny	Elmer, Fudd

SQL Fiddle 演示: http://sqlfiddle.com/#!18/83bbc9/6

英文:

I have these tables:

CREATE TABLE staff
(
    [case_id] int, 
    [staff] varchar(11), 
    [stafftype] varchar(10)
);
    
INSERT INTO staff
    ([case_id], [staff], [stafftype])
VALUES
    (1, &#39;Daffy, Duck&#39;, &#39;Primary&#39;),
    (1, &#39;Bugs, Bunny&#39;, &#39;Additional&#39;),
    (1, &#39;Elmer, Fudd&#39;, &#39;Additional&#39;),
    (2, &#39;Daffy, Duck&#39;, &#39;Primary&#39;),
    (2, &#39;Bugs, Bunny&#39;, &#39;Additional&#39;);

CREATE TABLE cases
(
    [case_id] int, 
    [casedate] datetime, 
    [caselocation] varchar(4)
);
    
INSERT INTO cases
    ([case_id], [casedate], [caselocation])
VALUES
    (1, &#39;2023-01-01 00:00:00&#39;, &#39;Home&#39;),
    (2, &#39;2023-01-03 00:00:00&#39;, &#39;Away&#39;);

And I want to return a single row per case, there can only be 1 Primary stafftype and a max of 2 Additional stafftypes.

e.g. Result set for case_id = 1

case_id	caselocation	PrimaryStaff	AdditionalStaff1	AdditionalStaff2
1	Home	Daffy, Duck	Bugs, Bunny	Elmer, Fudd

SQL Fiddle Demo: http://sqlfiddle.com/#!18/83bbc9/6

答案1

得分: 1

这种方法是使用条件聚合来完成的。

Fiddle链接: http://sqlfiddle.com/#!18/83bbc9/10/0

with cte as (
select a.case_id
       ,casedate
       ,caselocation
       ,b.staff
       ,ROW_NUMBER() OVER (PARTITION By a.case_id ORDER BY stafftype DESC) RN
from cases a
inner join staff b on a.case_id = b.case_id
  )
 SELECT case_id
        ,casedate
        ,caselocation
        ,MAX(CASE WHEN RN = 1 THEN staff END) PrimaryStaff
        ,MAX(CASE WHEN RN = 2 THEN staff END) AdditionalStaff1
        ,MAX(CASE WHEN RN = 3 THEN staff END) AdditionalStaff2
 FROM cte
 GROUP BY case_id
        ,casedate
        ,caselocation

这个CTE为“主要”stafftype分配了行号“RN”为1，然后将2和3分配给次要stafftypes。然后，最终的选择使用“RN”来组织数据，而“MAX()”和“GROUP BY”组合在一起将数据合并为单行。

您可以选择CTE的内部部分，然后在底部的查询中去掉“MAX()”和“GROUP BY”，然后运行它，以逐步查看发生的情况。

例如，首先运行这个：

select a.case_id
       ,casedate
       ,caselocation
       ,b.staff
       ,ROW_NUMBER() OVER (PARTITION By a.case_id ORDER BY stafftype DESC) RN
from cases a
inner join staff b on a.case_id = b.case_id

然后运行这个：

with cte as (
select a.case_id
       ,casedate
       ,caselocation
       ,b.staff
       ,ROW_NUMBER() OVER (PARTITION By a.case_id ORDER BY stafftype DESC) RN
from cases a
inner join staff b on a.case_id = b.case_id
  )
 SELECT case_id
        ,casedate
        ,caselocation
        ,CASE WHEN RN = 1 THEN staff END PrimaryStaff
        ,CASE WHEN RN = 2 THEN staff END AdditionalStaff1
        ,CASE WHEN RN = 3 THEN staff END AdditionalStaff2
 FROM cte

英文:

One way to do this is conditional aggregation.

Fiddle: http://sqlfiddle.com/#!18/83bbc9/10/0

with cte as (
select a.case_id
       ,casedate
       ,caselocation
       ,b.staff
       ,ROW_NUMBER() OVER (PARTITION By a.case_id ORDER BY stafftype DESC) RN
from cases a
inner join staff b on a.case_id = b.case_id
  )
 SELECT case_id
        ,casedate
        ,caselocation
        ,MAX(CASE WHEN RN = 1 THEN staff END) PrimaryStaff
        ,MAX(CASE WHEN RN = 2 THEN staff END) AdditionalStaff1
        ,MAX(CASE WHEN RN = 3 THEN staff END) AdditionalStaff2
 FROM cte
 GROUP BY case_id
        ,casedate
        ,caselocation

The cte assigns the 'Primary' stafftype a row number RN of 1, then assigns 2 and 3 to the secondary stafftypes. Then the final select uses the RN to organize the data, and the MAX() and GROUP BY combine to roll the data up into single rows.

You can select only the inner portion of the CTE, then remove the MAX() and GROUP BY in the bottom query and run that to see what is happening step by step.

ie. Run this:

select a.case_id
       ,casedate
       ,caselocation
       ,b.staff
       ,ROW_NUMBER() OVER (PARTITION By a.case_id ORDER BY stafftype DESC) RN
from cases a
inner join staff b on a.case_id = b.case_id

Then this:

with cte as (
select a.case_id
       ,casedate
       ,caselocation
       ,b.staff
       ,ROW_NUMBER() OVER (PARTITION By a.case_id ORDER BY stafftype DESC) RN
from cases a
inner join staff b on a.case_id = b.case_id
  )
 SELECT case_id
        ,casedate
        ,caselocation
        ,CASE WHEN RN = 1 THEN staff END PrimaryStaff
        ,CASE WHEN RN = 2 THEN staff END AdditionalStaff1
        ,CASE WHEN RN = 3 THEN staff END AdditionalStaff2
 FROM cte

答案2

得分: 1

提供的只需要2个任意的 'Additional' 员工类型，一个简单的分组查询就足够了。

select 
   a.case_id
   ,casedate
   ,caselocation
   ,b.staff as PrimaryStaff
   ,AdditionalStaff1, AdditionalStaff2
from cases a
inner join staff b on a.case_id = b.case_id and b.stafftype = 'Primary'
inner join (
  select case_id, min([staff]) AdditionalStaff1
     , case when count(*) > 1 then max([staff]) end AdditionalStaff2
  from staff
  where stafftype = 'Additional'
  group by case_id
) c on c.case_id = a.case_id

英文:

Provided only 2 arbitrary 'Additional' stafftypes are required a plain group by will do.

select 
   a.case_id
   ,casedate
   ,caselocation
   ,b.staff as PrimaryStaff
   ,AdditionalStaff1, AdditionalStaff2
from cases a
inner join staff b on a.case_id = b.case_id and b.stafftype = &#39;Primary&#39;
inner join (
  select case_id, min([staff]) AdditionalStaff1
     , case when count(*) &gt; 1 then max([staff]) end AdditionalStaff2
  from staff
  where stafftype = &#39;Additional&#39;
  group by case_id
) c on c.case_id = a.case_id

答案3

得分: 1

以下是代码部分的翻译：

Option 1:

/* 使用 STRING_AGG() 和 "substring parsing" 解决方案 */
SELECT 
c.case_id,
c.caselocation,
f1.PrimaryStaff,
f1.AdditionalStaffList,
f2.*,
IIF(ISNULL(f2.AdditionalListLength,0) > 0, LEFT(f1.AdditionalStaffList,ISNULL(NULLIF(f2.AdditionalListSubIndex,0)-1,f2.AdditionalListLength)) , NULL) [AdditionalStaff1],
IIF(ISNULL(f2.AdditionalListSubIndex,0) > 0, RIGHT(f1.AdditionalStaffList,f2.AdditionalListLength-f2.AdditionalListSubIndex) , NULL) [AdditionalStaff2]
FROM
[cases] c
CROSS APPLY (
    SELECT STRING_AGG(s.staff,';') WITHIN GROUP (ORDER BY IIF(s.stafftype='Primary',1,2) ASC) StaffList,
    MAX(IIF(s.stafftype='Primary',LEN(s.staff)+1,0)) AdditionalStaffSubIndex
    FROM [staff] s WHERE s.[case_id] = c.[case_id] ) s12
OUTER APPLY (
    SELECT LEFT(s12.StaffList,s12.AdditionalStaffSubIndex-1) PrimaryStaff,
    SUBSTRING(s12.StaffList,s12.AdditionalStaffSubIndex+1,LEN(s12.StaffList)) AdditionalStaffList
) f1
OUTER APPLY (
    SELECT PATINDEX('%;%',f1.AdditionalStaffList) AdditionalListSubIndex,
    LEN(f1.AdditionalStaffList) AdditionalListLength
) f2

Option 2:

/* 基于 XML 的查询解决方案 */
SELECT 
c.case_id,
c.caselocation,
s1.staff [PrimaryStaff],
s2.staffList.value('(/s/staff)[1]','varchar(11)') [AdditionalStaff1],
s2.staffList.value('(/s/staff)[2]','varchar(11)') [AdditionalStaff2]
FROM
[cases] c
LEFT JOIN [staff] s1 ON s1.[case_id] = c.[case_id] AND s1.[stafftype] = 'Primary'
OUTER APPLY (SELECT CAST((SELECT s.staff FROM [staff] s WHERE s.[case_id] = c.[case_id] AND s.[stafftype] = 'Additional' FOR XML AUTO,ELEMENTS) AS XML) ) s2 (staffList)

感谢你的理解，如果有其他需要，请随时告诉我。

英文:

There are other great answers in this thread, and in many engineering decisions what is the "best" solution will often go by an "it depends" rule. However, I also think this is a small enough an example for a common enough architectural decision point that I think it's worth showing some other "ways" it can be solved. I had considered using STRING_AGG and then splitting back out as a sample of one way to do this. This requires some SUBSTRING+PATINDEX logic:

Option 1:

/* Use a STRING_AGG() and &quot;substring parsing&quot; solution */
SELECT 
c.case_id,
c.caselocation,
f1.PrimaryStaff,
f1.AdditionalStaffList,
f2.*
, IIF(ISNULL(f2.AdditionalListLength,0)   &gt;0,  LEFT(f1.AdditionalStaffList,ISNULL(NULLIF(f2.AdditionalListSubIndex,0)-1,f2.AdditionalListLength)) , NULL) [AdditionalStaff1]
, IIF(ISNULL(f2.AdditionalListSubIndex,0) &gt;0,  RIGHT(f1.AdditionalStaffList,f2.AdditionalListLength-f2.AdditionalListSubIndex) , NULL) [AdditionalStaff2]
FROM
[cases] c
CROSS APPLY (
    SELECT STRING_AGG(s.staff,&#39;;&#39;) WITHIN GROUP (ORDER BY IIF(s.stafftype=&#39;Primary&#39;,1,2) ASC) StaffList ,
    MAX(IIF(s.stafftype=&#39;Primary&#39;,LEN(s.staff)+1,0)) AdditionalStaffSubIndex
    FROM [staff] s WHERE s.[case_id] = c.[case_id] ) s12
OUTER APPLY (
    SELECT LEFT(s12.StaffList,s12.AdditionalStaffSubIndex-1) PrimaryStaff,
    SUBSTRING(s12.StaffList,s12.AdditionalStaffSubIndex+1,LEN(s12.StaffList)) AdditionalStaffList
) f1
OUTER APPLY (
    SELECT PATINDEX(&#39;%;%&#39;,f1.AdditionalStaffList) AdditionalListSubIndex,
    LEN(f1.AdditionalStaffList) AdditionalListLength
) f2

But I also realized this is a great example to show some of the power of built in XML processing that is available on all supported versions of SQL server. What this can do is abstract the need for a "row numbering" scheme away from the data query itself. XML ,and - JSON, both provide ways to take "a subset of data from a join or subquery" that may contain "0 or more rows" and turn that subset into a "logically single entity/expression" that can be parsed to get specific information "back out" at the end of (or later in) the query pipeline. This can be sometimes be useful in more complex query scenarios when the underlying "data architecture" is 'fixed'.

A snippet showing the basic premise of what the XML-based query will do "per case id":

--APPLY this:
SELECT s.staff FROM [staff] s WHERE s.[case_id] = 1 AND s.[stafftype] = &#39;Additional&#39; FOR XML AUTO,ELEMENTS

It's also possible that your data constraints may also look something like this as a way to force "only one primary staff per case" and ensuring that "each staff can only be on a given case_id once":

--Each staff is only ONCE per case_id:
CREATE UNIQUE NONCLUSTERED INDEX UQ_StaffCase ON [staff] ([case_id],[stafftype],[staff])
--Only ONE primary staff per case_id:
CREATE UNIQUE NONCLUSTERED INDEX UQ_StaffCase_Primary ON [staff] ([case_id]) INCLUDE ([staff]) WHERE [stafftype] = &#39;Primary&#39;

And so, the following XML-based query is straightforward to read, and could be adjusted to only join once. This gives you the option to change to an INNER join on the primary staff:

Option 2:

/* An XML-based query solution */
SELECT 
c.case_id,
c.caselocation,
s1.staff [PrimaryStaff],
s2.staffList.value(&#39;(/s/staff)[1]&#39;,&#39;varchar(11)&#39;) [AdditionalStaff1],
s2.staffList.value(&#39;(/s/staff)[2]&#39;,&#39;varchar(11)&#39;) [AdditionalStaff2]
FROM
[cases] c
LEFT JOIN [staff] s1 ON s1.[case_id] = c.[case_id] AND s1.[stafftype] = &#39;Primary&#39;
OUTER APPLY (SELECT CAST((SELECT s.staff FROM [staff] s WHERE s.[case_id] = c.[case_id] AND s.[stafftype] = &#39;Additional&#39; FOR XML AUTO,ELEMENTS) AS XML) ) s2 (staffList)

Further Reading:
Earlier above, I mentioned cases where you need to query a "fixed data architecture". If the architecture is still being designed/developed, you may have options to provide input or adjust how the schema is represented so that queries may be able to be constructed differently in the first place, or to take advantage of additional features of the storage and processing engines of sql server. This shows one possibility of initial schema design (where i have added "zz" and "yy" to names to :a,: prevent naming conflicts against the OP schema if testing on the same database, and :b,: clearly show these objects as testing and development schema items).

As additional context not directly presenting a solution to OP, i'll leave the below without further explanation:

/* Alternative Schema Scenario:*/
CREATE SCHEMA yyStaff
GO
CREATE SCHEMA yyCases
GO

CREATE TABLE yyStaff.zz_staff
(
    [staffid]    INT IDENTITY(1,1) NOT NULL CONSTRAINT CK_Staff UNIQUE      CLUSTERED    
  , [staff_name]       VARCHAR(11) NOT NULL CONSTRAINT PK_Staff PRIMARY KEY NONCLUSTERED WITH (DATA_COMPRESSION = ROW) --eg. &#39;Daffy, Duck&#39;
) WITH(DATA_COMPRESSION = ROW);

CREATE TABLE yyCases.zz_locationtype
(
    [locationtypeid] INT IDENTITY(1,1) NOT NULL CONSTRAINT CK_CaseLocationType UNIQUE      CLUSTERED    
  , [location]              VARCHAR(4) NOT NULL CONSTRAINT PK_CaseLocationType PRIMARY KEY NONCLUSTERED WITH (DATA_COMPRESSION = ROW) --eg. (&#39;Home&#39;),(&#39;Away&#39;)
) WITH(DATA_COMPRESSION = ROW);

CREATE TABLE yyCases.zz_case
(
    [caseid]            INT IDENTITY(1,1) NOT NULL   CONSTRAINT CK_Case UNIQUE CLUSTERED  
  , [casedate]          DATETIMEOFFSET(0) NOT NULL                                         
  , [reflocationtype]                 INT NOT NULL   CONSTRAINT FK_Case_CaseLocationType FOREIGN KEY REFERENCES yyCases.zz_locationtype (locationtypeid)
);

CREATE NONCLUSTERED INDEX IX_Case__CaseLocationType_Date ON yyCases.zz_case (reflocationtype,casedate)           WITH (DATA_COMPRESSION  = ROW);
CREATE NONCLUSTERED INDEX IX_Case__DateICaseLocationType ON yyCases.zz_case (casedate) INCLUDE (reflocationtype) WITH (DATA_COMPRESSION  = ROW);

/* Alternatively, since there are only two, just have an [IsPrimary] column on zz_casestaff ? */
--CREATE TABLE yyCases.zz_stafftype
--(
--    [stafftypeid] INT IDENTITY(1,1) NOT NULL CONSTRAINT CK_CaseStaffType UNIQUE      CLUSTERED    
--  , [stafftype]         VARCHAR(10) NOT NULL CONSTRAINT PK_CaseStaffType PRIMARY KEY NONCLUSTERED WITH (DATA_COMPRESSION = ROW) --eg. (&#39;Primary&#39;),(&#39;Additional&#39;)
--) WITH(DATA_COMPRESSION = ROW);
/* yes, see below */

CREATE TABLE yyCases.zz_casestaff
(
    [casestaffid] INT IDENTITY(1,1) NOT NULL CONSTRAINT CK_CaseStaff        UNIQUE      CLUSTERED 
  , [refcase]                   INT NOT NULL CONSTRAINT FK_CaseStaff_Case   FOREIGN KEY REFERENCES yyCases.zz_case  (caseid)
  , [refstaff]                  INT NOT NULL CONSTRAINT FK_CaseStaff_Staff  FOREIGN KEY REFERENCES yyStaff.zz_staff (staffid)
  , [isprimary]                 BIT NOT NULL
);

--Each staff is only ONCE per case_id:
CREATE UNIQUE NONCLUSTERED INDEX UQ_CaseStaff__Case_Staff    ON yyCases.zz_casestaff (refcase,refstaff) INCLUDE (IsPrimary);
--Only ONE primary staff per case_id:
CREATE UNIQUE NONCLUSTERED INDEX UQ_CaseStaff__CaseFPrimary  ON yyCases.zz_casestaff (refcase) INCLUDE (refstaff) WHERE IsPrimary = 1;
--For query processing: Sort cases by staff, and if primary:
CREATE NONCLUSTERED INDEX IX_CaseStaff_Staff ON yyCases.zz_casestaff (refstaff,isprimary) INCLUDE (refcase)

答案4

得分: 0

以下是代码的中文翻译部分：

WITH
  staff_sorted
AS
(
  SELECT
    *,
    ROW_NUMBER()
      OVER (
        PARTITION BY case_id, stafftype
            ORDER BY staff
    )
      AS ordinal
  FROM
    staff
),
  staff_pivotted
AS
(
  SELECT
    case_id,
    MAX(CASE WHEN stafftype = 'Primary'                    THEN staff END) AS staff_primary,
    MAX(CASE WHEN stafftype = 'Additional' AND ordinal = 1 THEN staff END) AS staff_additional_1,
    MAX(CASE WHEN stafftype = 'Additional' AND ordinal = 2 THEN staff END) AS staff_additional_2
  FROM
    staff_sorted
  GROUP BY
    case_id
)  
SELECT
  *
FROM
  cases            AS c
INNER JOIN
  staff_pivotted   AS s
    ON c.case_id = s.case_id

英文:

WITH
staff_sorted
AS
(
SELECT
*,
ROW_NUMBER()
OVER (
PARTITION BY case_id, stafftype
ORDER BY staff
)
AS ordinal
FROM
staff
),
staff_pivotted
AS
(
SELECT
case_id,
MAX(CASE WHEN stafftype = &#39;Primary&#39;                    THEN staff END) AS staff_primary,
MAX(CASE WHEN stafftype = &#39;Additional&#39; AND ordinal = 1 THEN staff END) AS staff_additional_1,
MAX(CASE WHEN stafftype = &#39;Additional&#39; AND ordinal = 2 THEN staff END) AS staff_additional_2
FROM
staff_sorted
GROUP BY
case_id
)  
SELECT
*
FROM
cases            AS c
INNER JOIN
staff_pivotted   AS s
ON c.case_id = s.case_id

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

返回相同表的多个连接中的1行 – SQL Server

问题

答案1

答案2

答案3

答案4

可以共享CTE名称的SQL脚本并行运行吗？

Generate JSON key-value dictionary using SQL Server with multiple rows.

在一个varchar列中，移除前导零和特殊字符后面的字符存在问题。

Laravel addSelect: Laravel 添加选择

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论