英文:
SQL - Identify if a user is present every month
问题
我正在对在过去三个月内进行了交易的用户进行一些数据分析。
我想要做的是识别那些在数据表中的这两年内每个月都进行了特定交易类型(信用卡)的客户。如下所示,User A 在1、2、3月进行了信用卡交易,我想要一个标志,显示为“频繁”。
然而,User B 没有每个月都进行信用卡交易(第2月是借记卡),所以我想要为他们使用不同的标志名称(例如“不频繁”)。
我该如何使用SQL来识别用户是否每个月都进行了特定交易类型的交易?
| 日期 | 用户 | 金额 | 交易类型 | **标志 ** |
| 2022-01-15 | A | $15.00 | 信用卡 | **标志 ** |
...
| 2022-02-15 | A | $15.00 | 信用卡 | **标志 ** |
...
| 2022-03-15 | A | $15.00 | 信用卡 | **标志 ** |
...
...
| 2022-01-15 | B | $15.00 | 信用卡 | **标志 ** |
...
| 2022-02-15 | B | $15.00 | 借记卡 | **标志 ** |
...
| 2022-03-15 | B | $15.00 | 信用卡 | **标志 ** |
我已经尝试了以下方法,希望有更好或更简单的方法。
SELECT
Date, User, Amount, Transaction_Type,
CASE WHEN Count(present) = 3 THEN '频繁' ELSE '不频繁' END AS '标志'
FROM Transactions
LEFT JOIN (
SELECT
User, Month(Date), Count(Transaction_Type) as present
FROM
Transactions
WHERE
Transaction_Type = '信用卡'
GROUP BY
User, Month(Date)
HAVING
Count(Transaction_Type) > 0
) subquery
ON subquery.User = Transactions.User
GROUP BY
Date, User, Amount, Transaction_Type
请注意,这只是SQL查询的一部分,我将根据您的需求提供的信息来翻译。
英文:
I am performing some data analysis on users who have made transactions over the course of three months.
What I would like to do is identify customers who made specific transaction types (Credit) in every single month present in the data table over those two years. As you can see in the data table below, User A has performed a Credit transaction in months 1,2,3 and I would like a flag saying "Frequent" applied to the customer.
User B, however, has not performed a credit transaction every month (month 2 was Debit), and so I would like them to have a different flag name (e.g. "Infrequent").
How can I use SQL to identify if a user has made a specific transaction type each month?
| Date | User | Amount | Transaction Type | **Flag ** |
| 2022-01-15 | A | $15.00 | Credit | **Flag ** |
...
| 2022-02-15 | A | $15.00 | Credit | **Flag ** |
...
| 2022-03-15 | A | $15.00 | Credit | **Flag ** |
...
...
| 2022-01-15 | B | $15.00 | Credit | **Flag ** |
...
| 2022-02-15 | B | $15.00 | Debit | **Flag ** |
...
| 2022-03-15 | B | $15.00 | Credit | **Flag ** |
I have tried the following - hoping there is a better or more simple way.
SELECT
Date, User, Amount, Transaction_Type,
CASE WHEN Count(present) = 3 THEN 'Frequent' ELSE 'Infrequent'
FROM Transactions
LEFT JOIN (
SELECT
User,Month(Date),Count(Transaction_Type) as present
FROM
Transactions
WHERE
Transaction_Type = 'Credit'
GROUP BY
User,Month(Date)
Having
Count(Transaction_Type) > 0
) subquery
ON subquery.User = Transaction.User
GROUP BY
Date,User,Amount,Transaction_Type
答案1
得分: 1
以下是翻译的部分:
- "That is the way I would approach it."(这是我会处理的方式。)
- "Assuming you are using T-SQL I would make the following changes."(假设你在使用T-SQL,我会做以下更改。)
- "Instead of having the LEFT JOIN be to a sub-query, I would make the sub-query a CTE and then joint to that."(不要将LEFT JOIN连接到子查询,我会将子查询变成CTE,然后连接到它。)
- "I find it easier to grok when the main query is not full of sub-queries and you can test the CTE on its own more easily, plus if performance becomes an issue is relatively trivial to convert the CTE to a temp table without affecting the main query too much."(我觉得当主查询中没有充满子查询时更容易理解,你也可以更容易地单独测试CTE,而且如果性能成为问题,将CTE相对容易地转换为临时表,而不会对主查询产生太大影响。)
希望这有所帮助!如果您有其他翻译需求,请随时告诉我。
英文:
That is the way I would approach it. Assuming you are using T-SQL I would make the following changes. Instead of having the LEFT JOIN be to a sub-query, I would make the sub-query a CTE and then joint to that. I find it easier to grok when the main query is not full of sub-queries and you can test the CTE on its own more easily, plus if performance becomes an issue is relatively trivial to convert the CTE to a temp table. without affecting the main query too much.
You have a couple of problems I think. the first is that your subquery is going to return you the count of the credits in each month. If I make 3 credits in January this is going to flag me as frequent because the total is more than 3. You probably want to do a
COUNT(DISTINCT Transaction_type) AS hasCredit
to identify if there is AT LEAST ONE credit transaction, then have another aggregation that
SUM(hasCredit)
to get the number of months in which a credit appears.
using nested sub-queries means your LEFT JOIN would now be two sub-queries deep and dissapearing off the right hand side of your screen. Writing them as CTEs keeps the main logic clean and script narrow.
I think this does what you need, but can't test it because I don't have any sample data.
WITH CTE_HasCredit AS
(
SELECT
User
,Month(Date) AS [TransactionMonth]
,Count(DISTINCT Transaction_Type) AS [hasCredit]
FROM
Transactions
WHERE
Transaction_Type = 'Credit'
GROUP BY
User
,Month(Date)
Having
Count(Transaction_Type) > 0
)
,
CTE_isFrequent AS
(
SELECT
User
,SUM(hasCredit) AS [TotalCredits]
FROM
CTE_HasCredit
GROUP BY
User
)
SELECT
TXN.Date
, TXN.User
, TXN.Amount
, TXN.Transaction_Type
,CASE
WHEN FRQ.TotalCredits >= 3 THEN 'Frequent'
ELSE 'Infrequent'
END AS [customerType]
FROM
Transactions AS TXN
LEFT JOIN
CTE_isFrequent AS FRQ ON FRQ.User = TXN.User
GROUP BY
TXN.Date
,TXN.User
,TXN.Amount
,TXN.Transaction_Type
I don't think you need the GROUP BY on the main query either; it would de-dupe transactions for the same day for the same amount.
You might also want to look at the syntax for COUNT() OVER(). These would allow you to do the calculations in the main query and would look something like.
,CASE
WHEN COUNT(DISTINCT TXN.Transaction_Type) OVER(PARTITION BY User, MONTH(TXN.Date),TXN.Transaction_Type) >=3 THEN 'Frequent'
ELSE 'Infrequent'
END AS [customerType2]
This second way would give you customer type for both the Debits and Credits. I am not aware of a way to filter the COUNT() OVER() to just Credits, for that you would need to use the CTE method.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论