SQL – 每个月识别用户是否存在

huangapple go评论70阅读模式
英文:

SQL - Identify if a user is present every month

问题

我正在对在过去三个月内进行了交易的用户进行一些数据分析。

我想要做的是识别那些在数据表中的这两年内每个月都进行了特定交易类型(信用卡)的客户。如下所示,User A 在1、2、3月进行了信用卡交易,我想要一个标志,显示为“频繁”。

然而,User B 没有每个月都进行信用卡交易(第2月是借记卡),所以我想要为他们使用不同的标志名称(例如“不频繁”)。

我该如何使用SQL来识别用户是否每个月都进行了特定交易类型的交易?

| 日期        | 用户 |  金额   | 交易类型  |  **标志 **  |
| 2022-01-15 |   A  | $15.00  |       信用卡   |  **标志 **  |
...
| 2022-02-15 |   A  | $15.00  |       信用卡   |  **标志 **  |
...
| 2022-03-15 |   A  | $15.00  |       信用卡   |  **标志 **  |
...
...
| 2022-01-15 |   B  | $15.00  |       信用卡   |  **标志 **  |
...
| 2022-02-15 |   B  | $15.00  |       借记卡   |  **标志 **  |
...
| 2022-03-15 |   B  | $15.00  |       信用卡   |  **标志 **  |

我已经尝试了以下方法,希望有更好或更简单的方法。

SELECT
   Date, User, Amount, Transaction_Type,
   CASE WHEN Count(present) = 3 THEN '频繁' ELSE '不频繁' END AS '标志'
FROM Transactions
LEFT JOIN (
    SELECT 
        User, Month(Date), Count(Transaction_Type) as present
    FROM 
        Transactions
    WHERE 
        Transaction_Type = '信用卡'
    GROUP BY 
        User, Month(Date)
    HAVING 
        Count(Transaction_Type) > 0
) subquery
ON subquery.User = Transactions.User
GROUP BY    
    Date, User, Amount, Transaction_Type

请注意,这只是SQL查询的一部分,我将根据您的需求提供的信息来翻译。

英文:

I am performing some data analysis on users who have made transactions over the course of three months.

What I would like to do is identify customers who made specific transaction types (Credit) in every single month present in the data table over those two years. As you can see in the data table below, User A has performed a Credit transaction in months 1,2,3 and I would like a flag saying "Frequent" applied to the customer.

User B, however, has not performed a credit transaction every month (month 2 was Debit), and so I would like them to have a different flag name (e.g. "Infrequent").

How can I use SQL to identify if a user has made a specific transaction type each month?

| Date       | User |  Amount |  Transaction Type   |  **Flag **  |
| 2022-01-15 |   A  | $15.00  |       Credit        |  **Flag **  |
  ...
| 2022-02-15 |   A  | $15.00  |       Credit        |  **Flag **  |
  ...
| 2022-03-15 |   A  | $15.00  |       Credit        |  **Flag **  |
  ...
  ...
| 2022-01-15 |   B  | $15.00  |       Credit        |  **Flag **  |
  ...
| 2022-02-15 |   B  | $15.00  |       Debit         |  **Flag **  |
  ...
| 2022-03-15 |   B  | $15.00  |       Credit        |  **Flag **  |

I have tried the following - hoping there is a better or more simple way.

SELECT
   Date, User, Amount, Transaction_Type,
   CASE WHEN Count(present) = 3 THEN 'Frequent' ELSE 'Infrequent'

FROM Transactions

LEFT JOIN (
				SELECT 
					User,Month(Date),Count(Transaction_Type) as present
				FROM 
					Transactions
				WHERE 
					Transaction_Type = 'Credit'
				GROUP BY 
					User,Month(Date)
				Having 
					Count(Transaction_Type) > 0
			) subquery
				ON subquery.User = Transaction.User
GROUP BY    
		Date,User,Amount,Transaction_Type

答案1

得分: 1

以下是翻译的部分:

  • "That is the way I would approach it."(这是我会处理的方式。)
  • "Assuming you are using T-SQL I would make the following changes."(假设你在使用T-SQL,我会做以下更改。)
  • "Instead of having the LEFT JOIN be to a sub-query, I would make the sub-query a CTE and then joint to that."(不要将LEFT JOIN连接到子查询,我会将子查询变成CTE,然后连接到它。)
  • "I find it easier to grok when the main query is not full of sub-queries and you can test the CTE on its own more easily, plus if performance becomes an issue is relatively trivial to convert the CTE to a temp table without affecting the main query too much."(我觉得当主查询中没有充满子查询时更容易理解,你也可以更容易地单独测试CTE,而且如果性能成为问题,将CTE相对容易地转换为临时表,而不会对主查询产生太大影响。)

希望这有所帮助!如果您有其他翻译需求,请随时告诉我。

英文:

That is the way I would approach it. Assuming you are using T-SQL I would make the following changes. Instead of having the LEFT JOIN be to a sub-query, I would make the sub-query a CTE and then joint to that. I find it easier to grok when the main query is not full of sub-queries and you can test the CTE on its own more easily, plus if performance becomes an issue is relatively trivial to convert the CTE to a temp table. without affecting the main query too much.

You have a couple of problems I think. the first is that your subquery is going to return you the count of the credits in each month. If I make 3 credits in January this is going to flag me as frequent because the total is more than 3. You probably want to do a

COUNT(DISTINCT Transaction_type) AS hasCredit

to identify if there is AT LEAST ONE credit transaction, then have another aggregation that

SUM(hasCredit)

to get the number of months in which a credit appears.

using nested sub-queries means your LEFT JOIN would now be two sub-queries deep and dissapearing off the right hand side of your screen. Writing them as CTEs keeps the main logic clean and script narrow.

I think this does what you need, but can't test it because I don't have any sample data.

WITH CTE_HasCredit AS
(
    SELECT 
			User
			,Month(Date) AS [TransactionMonth]
			,Count(DISTINCT Transaction_Type) AS [hasCredit]
    FROM 
			Transactions
    WHERE 
			Transaction_Type = 'Credit'
    GROUP BY 
			User
			,Month(Date)
    Having 
			Count(Transaction_Type) > 0
) 
,
CTE_isFrequent AS 
(

	SELECT 
			User
			,SUM(hasCredit) AS [TotalCredits]
	FROM
			CTE_HasCredit
	GROUP BY
			User
)

SELECT
	   TXN.Date
	   , TXN.User
	   , TXN.Amount
	   , TXN.Transaction_Type
	   ,CASE 
			WHEN FRQ.TotalCredits >= 3 THEN 'Frequent' 
			ELSE 'Infrequent'
		END AS [customerType]
FROM 
		Transactions AS TXN
	LEFT JOIN 
		CTE_isFrequent AS FRQ ON FRQ.User = TXN.User
GROUP BY    
        TXN.Date
		,TXN.User
		,TXN.Amount
		,TXN.Transaction_Type

I don't think you need the GROUP BY on the main query either; it would de-dupe transactions for the same day for the same amount.
You might also want to look at the syntax for COUNT() OVER(). These would allow you to do the calculations in the main query and would look something like.

		,CASE
			WHEN COUNT(DISTINCT TXN.Transaction_Type) OVER(PARTITION BY User, MONTH(TXN.Date),TXN.Transaction_Type) >=3 THEN 'Frequent'
			ELSE 'Infrequent' 
		END AS [customerType2]

This second way would give you customer type for both the Debits and Credits. I am not aware of a way to filter the COUNT() OVER() to just Credits, for that you would need to use the CTE method.

huangapple
  • 本文由 发表于 2023年2月14日 05:33:31
  • 转载请务必保留本文链接:https://go.coder-hub.com/75441389.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定