Looking for an SQL query that does an ORDER BY first and then a GROUP BY

Question

I have a single table (called people) whose columns include, among others, ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY and IsMember. The ID and IsMember columns are smallints and the rest are varchars. There are more columns, but they are irrelevant to this question.

My goal is to produce address labels that we can stick onto envelopes to send a newsletter to family members. An address can have a single family member or a whole family: father + mother + kid + kid, for example. Not every person is a member (for one, you need to be older than 18). Members have an IsMember of 1; otherwise IsMember is 0. The code represents the lineage of the person: children have longer codes than their parents, and in-law spouses (who married into the family) have longer codes than the person who was in the family from birth.

The problem is getting the correct SQL SELECT query that produces the 7 columns (I don't need IsMember) but only 1 line (= 1 label) per ADDRESS1. The code and FullName need to belong to the oldest person who was born into the family, not someone who married into it. In other words, for each address I want the person with the shortest code.

My first idea was to write a simple query that would select all the columns. Then Order By the result based on the length() of code. And then finally run that through a Group By of the ADDRESS1 column.

But you aren't allowed to use GROUP BY after ORDER BY. And the other way around doesn't produce the correct result: instead of returning the oldest original family member, it returns an arbitrary member of that family.

The data could be something like:

+----+------+----------+----------+----------+----------+----------+----------+
| ID | code | FullName | ADDRESS1 | ADDRESS2 | ADDRESS3 | COUNTRY  | IsMember |
+----+------+----------+----------+----------+----------+----------+----------+
| 1  | A1   | Alex     | Main 10  | 1234 AB  | New York | USA      | 1        |
| 2  | A1.2 | Fred     | Longstr 1| 5757 GE  | London   | UK       | 1        |
| 3  | A1a  | Alexa    | Main 10  | 1234 AB  | New York | USA      | 1        |
| 4  | J2   | John     | 2nd Str 7| 1970 BV  | The Hague| NL       | 1        |
| 5  | J2a  | Janet    | 2nd Str 7| 1970 BV  | The Hague| NL       | 1        |
| 6  | J2.1 | Alfred   | 2nd Str 7| 1970 BV  | The Hague| NL       | 1        |
| 7  | J2.2 | Eric     | 2nd Str 7| 1970 BV  | The Hague| NL       | 0        |
| 8  | J3   | Donald   | ObraStr 8| 8582 UT  | Toronto  | CA       | 0        |
+----+------+----------+----------+----------+----------+----------+----------+

And the expected result would be:

+----+------+----------+----------+----------+----------+----------+
| ID | code | FullName | ADDRESS1 | ADDRESS2 | ADDRESS3 | COUNTRY  |
+----+------+----------+----------+----------+----------+----------+
| 1  | A1   | Alex     | Main 10  | 1234 AB  | New York | USA      |
| 2  | A1.2 | Fred     | Longstr 1| 5757 GE  | London   | UK       |
| 4  | J2   | John     | 2nd Str 7| 1970 BV  | The Hague| NL       |
+----+------+----------+----------+----------+----------+----------+

In other words: we get only one result per ADDRESS1, and the corresponding data on that line belongs to the person with the shortest code. So Alex instead of Alexa on Main 10, and John instead of Janet on 2nd Str 7.
Only lines with IsMember = 1 are taken into account.
The order of the lines in the end result is irrelevant to me.

So I first tried:

Select ID
, code
, FullName
, ADDRESS1
, ADDRESS2
, ADDRESS3
, COUNTRY
from `people`
Where IsMember = 1
Order by Length(code)
Group By ADDRESS1
;

But that gives an error.

So I switched to nesting the query, like this:

SELECT id
, code
, FullName
, ADDRESS1
, ADDRESS2
, ADDRESS3
, COUNTRY
from (
  SELECT id
  , code
  , FullName
  , ADDRESS1
  , ADDRESS2
  , ADDRESS3
  , COUNTRY
  from `people`
  where IsMember = 1
) as tempvar
order by length(code)
; 

but that also fails when I try to add in the Group By somewhere.

When I try

SELECT id
, code
, FullName
, ADDRESS1
, ADDRESS2
, ADDRESS3
, COUNTRY
from (
  SELECT id
  , code
  , FullName
  , ADDRESS1
  , ADDRESS2
  , ADDRESS3
  , COUNTRY
  from `people`
  where IsMember = 1
  order by length(code)
) as tempvar
group by ADDRESS1
; 

then the ordering is lost, and the group comes up with a random member in the family again.

Can someone help me out here?

-Edited to add examples-

Answer 1

Score: 0

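You can try this query. Instead of GROUP BY, it uses ROW_NUMBER() with PARTITION BY ADDRESS1 and ORDER BY LENGTH(code) to number the members at each address from shortest code to longest, and then keeps only the first row per address: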

SELECT ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY
FROM (
  SELECT ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY,
         ROW_NUMBER() OVER (PARTITION BY ADDRESS1 ORDER BY LENGTH(code)) AS rn
  FROM people
  WHERE IsMember = 1
) AS tempvar
WHERE rn = 1;
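
To check this query against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in (the question does not name its database) and needs version 3.25 or later for window functions; on MySQL the same syntax requires 8.0 or later. The `ID` tie-breaker in the ORDER BY, the final `ORDER BY ID`, and the helper name `one_label_per_address` are assumptions added for illustration: without a tie-breaker, two members with equally short codes at one address would be picked arbitrarily.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (1, "A1", "Alex", "Main 10", "1234 AB", "New York", "USA", 1),
    (2, "A1.2", "Fred", "Longstr 1", "5757 GE", "London", "UK", 1),
    (3, "A1a", "Alexa", "Main 10", "1234 AB", "New York", "USA", 1),
    (4, "J2", "John", "2nd Str 7", "1970 BV", "The Hague", "NL", 1),
    (5, "J2a", "Janet", "2nd Str 7", "1970 BV", "The Hague", "NL", 1),
    (6, "J2.1", "Alfred", "2nd Str 7", "1970 BV", "The Hague", "NL", 1),
    (7, "J2.2", "Eric", "2nd Str 7", "1970 BV", "The Hague", "NL", 0),
    (8, "J3", "Donald", "ObraStr 8", "8582 UT", "Toronto", "CA", 0),
]

QUERY = """
SELECT ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY
FROM (
  SELECT ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY,
         ROW_NUMBER() OVER (
           PARTITION BY ADDRESS1
           ORDER BY LENGTH(code), ID  -- ID is an assumed tie-breaker
         ) AS rn
  FROM people
  WHERE IsMember = 1
) AS ranked
WHERE rn = 1
ORDER BY ID  -- only to make the printed order deterministic
"""


def one_label_per_address(rows):
    """Load rows into an in-memory table and return one row per ADDRESS1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (ID INTEGER, code TEXT, FullName TEXT, "
        "ADDRESS1 TEXT, ADDRESS2 TEXT, ADDRESS3 TEXT, COUNTRY TEXT, "
        "IsMember INTEGER)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


labels = one_label_per_address(ROWS)
for row in labels:
    print(row)
```

With the sample data this yields the rows for Alex, Fred and John, matching the expected result in the question. The filter on IsMember sits inside the subquery so that non-members never compete for row number 1.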

I hope this helps!

Answer 2

Score: 0

The query below returns, for each address, the member(s) whose code has the minimum length. I use the full address instead of ADDRESS1 alone because if you have people at these two addresses...

Main 10
1234 AB
New York
USA

Main 10
5678 CD
Chicago
USA

...you probably don't want all of the mail going to the first address.

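This solution is written for SQL Server (hence len() rather than length()), and the first CTE only mocks up the people table from the sample data. Your mileage may vary on other databases.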

with people (
  ID
  , code
  , FullName
  , ADDRESS1
  , ADDRESS2
  , ADDRESS3
  , COUNTRY 
  , IsMember
) as (
  select *
  from (
    values 
      (1 , 'A1'  , 'Alex'  , 'Main 10'  , '1234 AB', 'New York' , 'USA', 1)
    , (2 , 'A1.2', 'Fred'  , 'Longstr 1', '5757 GE', 'London'   , 'UK' , 1)
    , (3 , 'A1a' , 'Alexa' , 'Main 10'  , '1234 AB', 'New York' , 'USA', 1)
    , (4 , 'J2'  , 'John'  , '2nd Str 7', '1970 BV', 'The Hague', 'NL' , 1)
    , (5 , 'J2a' , 'Janet' , '2nd Str 7', '1970 BV', 'The Hague', 'NL' , 1)
    , (6 , 'J2.1', 'Alfred', '2nd Str 7', '1970 BV', 'The Hague', 'NL' , 1)
    , (7 , 'J2.2', 'Eric'  , '2nd Str 7', '1970 BV', 'The Hague', 'NL' , 0)
    , (8 , 'J3'  , 'Donald', 'ObraStr 8', '8582 UT', 'Toronto'  , 'CA' , 0)
  ) q (
      ID
    , code
    , FullName
    , ADDRESS1
    , ADDRESS2
    , ADDRESS3
    , COUNTRY 
    , IsMember
  )
),
a as (
  Select ID
  , code
  , FullName
  , ADDRESS1
  , ADDRESS2
  , ADDRESS3
  , COUNTRY
  , len(code) as codelen
  , min(len(code)) over (partition by ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY) as mincodelen
  from people
  Where IsMember = 1
)

select *
from a
where codelen = mincodelen
order by ID
;
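
One difference from a ROW_NUMBER() approach is worth seeing on data: comparing each row's code length to the minimum for its address keeps every member tied for the shortest code, so an address can produce more than one label. The sketch below is a hedged illustration, again using Python's sqlite3 with an in-memory SQLite database (3.25 or later) as a stand-in, with `LENGTH` in place of SQL Server's `LEN`. Rows 9 and 10 (Bea and Ben at the Chicago address) are hypothetical and not part of the question's data; they exist only to show a second "Main 10" address and a tie. The helper name `labels_by_full_address` is likewise an assumption.

```python
import sqlite3

# Rows 1-8 come from the question; rows 9 and 10 are hypothetical additions.
ROWS_FULL = [
    (1, "A1", "Alex", "Main 10", "1234 AB", "New York", "USA", 1),
    (2, "A1.2", "Fred", "Longstr 1", "5757 GE", "London", "UK", 1),
    (3, "A1a", "Alexa", "Main 10", "1234 AB", "New York", "USA", 1),
    (4, "J2", "John", "2nd Str 7", "1970 BV", "The Hague", "NL", 1),
    (5, "J2a", "Janet", "2nd Str 7", "1970 BV", "The Hague", "NL", 1),
    (6, "J2.1", "Alfred", "2nd Str 7", "1970 BV", "The Hague", "NL", 1),
    (7, "J2.2", "Eric", "2nd Str 7", "1970 BV", "The Hague", "NL", 0),
    (8, "J3", "Donald", "ObraStr 8", "8582 UT", "Toronto", "CA", 0),
    # Hypothetical: same ADDRESS1 as Alex but a different city, equal code lengths.
    (9, "B1", "Bea", "Main 10", "5678 CD", "Chicago", "USA", 1),
    (10, "B2", "Ben", "Main 10", "5678 CD", "Chicago", "USA", 1),
]

FULL_ADDRESS_QUERY = """
WITH a AS (
  SELECT ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY,
         LENGTH(code) AS codelen,
         MIN(LENGTH(code)) OVER (
           PARTITION BY ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY
         ) AS mincodelen
  FROM people
  WHERE IsMember = 1
)
SELECT ID, code, FullName, ADDRESS1, ADDRESS2, ADDRESS3, COUNTRY
FROM a
WHERE codelen = mincodelen
ORDER BY ID
"""


def labels_by_full_address(rows):
    """Return every member whose code length is minimal for their full address."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (ID INTEGER, code TEXT, FullName TEXT, "
        "ADDRESS1 TEXT, ADDRESS2 TEXT, ADDRESS3 TEXT, COUNTRY TEXT, "
        "IsMember INTEGER)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(FULL_ADDRESS_QUERY).fetchall()
    conn.close()
    return result


full_labels = labels_by_full_address(ROWS_FULL)
for row in full_labels:
    print(row)
```

Here the New York and Chicago "Main 10" addresses are treated separately, and both Bea and Ben are returned because their codes are equally short. If exactly one label per address is required, a ROW_NUMBER() ranking with an explicit tie-breaker is probably the safer choice.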

huangapple
  • Posted on 2023-07-11 06:12:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76657640.html