Count of unique days grouped by value - pandas
Question
I'm aiming to assign a cumulative count of unique days to a new column in a pandas DataFrame. It should count the number of unique days, taken from Date, grouped by Code and Item. Once a run of consecutive values in Code or Item is broken, the count should reset and start again from 1, as in the intended output below.
import pandas as pd
df = pd.DataFrame({"Date":['2023-03-01', '2023-03-01', '2023-03-01', '2023-03-04', '2023-03-06', '2023-03-06', '2023-03-07', '2023-03-08', '2023-03-09','2023-03-01', '2023-03-02', '2023-03-03', '2023-03-03', '2023-03-03','2023-03-03', '2023-03-04', '2023-03-05', '2023-03-06'],
"Code":['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y'],
"Item":['A', 'A', 'A', 'B', 'B', 'B', 'B', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
})
df['Date'] = pd.to_datetime(df['Date'])
df['Daily_Count'] = df.groupby(['Code', 'Item', df['Date'].dt.date]).cumcount()
Intended output:
         Date Code Item  Daily_Count
0  2023-03-01    X    A            1
1  2023-03-01    X    A            1
2  2023-03-01    X    A            1
3  2023-03-04    X    B            1
4  2023-03-06    X    B            2
5  2023-03-06    X    B            2
6  2023-03-07    X    B            3
7  2023-03-08    X    A            1
8  2023-03-09    X    A            2
9  2023-03-01    Y    A            1
10 2023-03-02    Y    A            2
11 2023-03-03    Y    A            3
12 2023-03-03    Y    A            3
13 2023-03-03    Y    A            3
14 2023-03-03    Y    A            3
15 2023-03-04    Y    A            4
16 2023-03-05    Y    A            5
17 2023-03-06    Y    A            6
Answer 1
Score: 2
You need to group your rows into runs of consecutive (Code, Item) values, which you can do by comparing each row's values against the previous row's and starting a new group whenever either of them changes:
g = (df[['Code','Item']] != df[['Code', 'Item']].shift()).any(axis=1).cumsum()
# 0 1
# 1 1
# 2 1
# 3 2
# 4 2
# 5 2
# 6 2
# 7 3
# 8 3
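To see the run-labelling step in isolation, here is a minimal sketch on a small made-up frame. The names toy, keys, changed and run_id are introduced only for this illustration and do not appear in the question.

```python
import pandas as pd

# Hypothetical toy data, not the frame from the question.
toy = pd.DataFrame({
    "Code": ["X", "X", "X", "Y", "Y"],
    "Item": ["A", "B", "B", "B", "A"],
})

keys = toy[["Code", "Item"]]

# True wherever either column differs from the row above. The first row is
# always True because shift() leaves a missing value there.
changed = (keys != keys.shift()).any(axis=1)

# The cumulative sum of the change flags gives a 1-based id per consecutive run.
run_id = changed.cumsum()

print(run_id.tolist())  # [1, 2, 2, 3, 4]
```

Rows 1 and 2 share an id because neither Code nor Item changes between them; every other row starts a new run.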
You can then group your dataframe using these values, and take a cumulative sum of the number of times the Date changes within each group:
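# Within each run, flag rows whose Date differs from the previous row, then cumulatively sum the flags.
# The lambda argument is named s so that it does not shadow the grouping key g.
df['Daily_Count'] = df.groupby(g)['Date'].transform(lambda s: (s != s.shift()).cumsum())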
Output:
         Date Code Item  Daily_Count
0  2023-03-01    X    A            1
1  2023-03-01    X    A            1
2  2023-03-01    X    A            1
3  2023-03-04    X    B            1
4  2023-03-06    X    B            2
5  2023-03-06    X    B            2
6  2023-03-07    X    B            3
7  2023-03-08    X    A            1
8  2023-03-09    X    A            2
9  2023-03-01    Y    A            1
10 2023-03-02    Y    A            2
11 2023-03-03    Y    A            3
12 2023-03-03    Y    A            3
13 2023-03-03    Y    A            3
14 2023-03-03    Y    A            3
15 2023-03-04    Y    A            4
16 2023-03-05    Y    A            5
17 2023-03-06    Y    A            6
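For anyone who wants to check this end to end, the two steps can be combined into one self-contained script. This is only a sketch: it rebuilds the frame from the question, and the names run_id and expected are introduced here for illustration. It assumes the rows are already in the order in which runs should be detected, as they are in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2023-03-01", "2023-03-01", "2023-03-01", "2023-03-04", "2023-03-06",
             "2023-03-06", "2023-03-07", "2023-03-08", "2023-03-09", "2023-03-01",
             "2023-03-02", "2023-03-03", "2023-03-03", "2023-03-03", "2023-03-03",
             "2023-03-04", "2023-03-05", "2023-03-06"],
    "Code": list("XXXXXXXXX" + "YYYYYYYYY"),
    "Item": list("AAABBBBAA" + "AAAAAAAAA"),
})
df["Date"] = pd.to_datetime(df["Date"])

# Step 1: label each run of consecutive, identical (Code, Item) pairs.
keys = df[["Code", "Item"]]
run_id = (keys != keys.shift()).any(axis=1).cumsum()

# Step 2: within each run, count how many times Date has changed so far.
df["Daily_Count"] = df.groupby(run_id)["Date"].transform(
    lambda s: (s != s.shift()).cumsum()
)

# Intended output from the question, used here as a sanity check.
expected = [1, 1, 1, 1, 2, 2, 3, 1, 2, 1, 2, 3, 3, 3, 3, 4, 5, 6]
assert df["Daily_Count"].tolist() == expected
print(df)
```

Note that this counts changes of Date between adjacent rows, so it matches the number of unique days only when equal dates sit next to each other within a run; if that is not guaranteed, sorting by Date inside each run first would be needed.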