groupby cumsum (or cumcount) with cyclical data
Question
I have a dataframe that looks like this:
ID | SWITCH |
---|---|
A | ON |
A | ON |
A | ON |
A | OFF |
A | OFF |
A | OFF |
A | ON |
A | ON |
A | ON |
... | ... |
B | ON |
B | ON |
B | ON |
B | OFF |
B | OFF |
B | OFF |
B | ON |
B | ON |
B | ON |
The 'SWITCH' column is cyclical, and I'd like a running count of the ON and OFF rows within each cycle, like this:
ID | SWITCH | Cum. Count |
---|---|---|
A | ON | 1 |
A | ON | 2 |
A | ON | 3 |
A | OFF | 1 |
A | OFF | 2 |
A | OFF | 3 |
A | ON | 1 |
A | ON | 2 |
A | ON | 3 |
... | ... | |
B | ON | 1 |
B | ON | 2 |
B | ON | 3 |
B | OFF | 1 |
B | OFF | 2 |
B | OFF | 3 |
B | ON | 1 |
B | ON | 2 |
B | ON | 3 |
I tried cumsum and cumcount, but neither resets the count when the next 'ON' cycle begins (each keeps counting on from the previous cycle).
What can I do?
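To make the problem concrete, here is a minimal sketch of the kind of attempt described above. The question does not show the exact code, so grouping directly on 'ID' and 'SWITCH' is an assumption. Because every 'ON' row of one ID lands in the same group, the second 'ON' cycle continues at 4 instead of restarting at 1.

```python
import pandas as pd

# Rows for ID 'A' as in the question: ON x3, OFF x3, ON x3
df = pd.DataFrame({
    'ID': ['A'] * 9,
    'SWITCH': ['ON'] * 3 + ['OFF'] * 3 + ['ON'] * 3,
})

# Assumed attempt: group on the values themselves, not on consecutive runs
naive = df.groupby(['ID', 'SWITCH']).cumcount() + 1
print(naive.tolist())  # [1, 2, 3, 1, 2, 3, 4, 5, 6]
```

The missing ingredient is a key that changes every time 'SWITCH' changes, so that each consecutive run becomes its own group.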
Answer 1
Score: 1
Try grouping on the cumulative sum of the changes in 'SWITCH' as well:
# A new block starts wherever SWITCH differs from the previous row
switch_blocks = df['SWITCH'].ne(df['SWITCH'].shift()).cumsum()
# Count rows within each (ID, block) pair, starting from 1
df['cum.count'] = df.groupby(['ID', switch_blocks]).cumcount().add(1)
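For reference, the same idea as a self-contained sketch on data shaped like the question's. The frame below is reconstructed from the question's tables with the elided rows left out, so it is an assumption about the real data, and the names switch_blocks and 'cum.count' are only illustrative.

```python
import pandas as pd

# Each ID goes through ON x3, OFF x3, ON x3
df = pd.DataFrame({
    'ID': ['A'] * 9 + ['B'] * 9,
    'SWITCH': (['ON'] * 3 + ['OFF'] * 3 + ['ON'] * 3) * 2,
})

# True wherever SWITCH differs from the row above; cumsum turns that into a run id
switch_blocks = df['SWITCH'].ne(df['SWITCH'].shift()).cumsum()

# Count rows within each (ID, run) pair, starting from 1
df['cum.count'] = df.groupby(['ID', switch_blocks]).cumcount().add(1)
print(df['cum.count'].tolist())
```

Keeping 'ID' in the groupby keys matters here: the last run of 'A' and the first run of 'B' are both 'ON', so they share a run id and only the ID separates them.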
Answer 2
Score: 1
You need to create a new column that flags each change in the 'SWITCH' column; then you can group by 'ID' and the running total of those flags to perform the cumulative count.
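import pandas as pd
# Create sample data (note that 'B' here runs ON, OFF, OFF, so its OFF run has 6 rows)
df = pd.DataFrame({
    'ID': ['A'] * 9 + ['B'] * 9,
    'SWITCH': ['ON'] * 3 + ['OFF'] * 3 + ['ON'] * 3 + ['ON'] * 3 + ['OFF'] * 3 + ['OFF'] * 3,
})
# 1 where SWITCH differs from the previous row, 0 otherwise
df['SWITCH_CHANGE'] = (df['SWITCH'] != df['SWITCH'].shift()).astype(int)
# The running total of the flags labels each run; count rows within each (ID, run) pair
df['Cum. Count'] = df.groupby(['ID', df['SWITCH_CHANGE'].cumsum()])['SWITCH'].cumcount() + 1
print(df)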
Result:
| | ID | SWITCH | SWITCH_CHANGE | Cum. Count |
---|---|---|---|---|
0 | A | ON | 1 | 1 |
1 | A | ON | 0 | 2 |
2 | A | ON | 0 | 3 |
3 | A | OFF | 1 | 1 |
4 | A | OFF | 0 | 2 |
5 | A | OFF | 0 | 3 |
6 | A | ON | 1 | 1 |
7 | A | ON | 0 | 2 |
8 | A | ON | 0 | 3 |
9 | B | ON | 0 | 1 |
10 | B | ON | 0 | 2 |
11 | B | ON | 0 | 3 |
12 | B | OFF | 1 | 1 |
13 | B | OFF | 0 | 2 |
14 | B | OFF | 0 | 3 |
15 | B | OFF | 0 | 4 |
16 | B | OFF | 0 | 5 |
17 | B | OFF | 0 | 6 |
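One detail in the result above: row 9 has SWITCH_CHANGE equal to 0 even though it is the first row of ID 'B', because the comparison runs over the whole column and the last row of 'A' is also 'ON'. The count is still correct, since 'ID' is one of the groupby keys. If the flag column itself should mark the start of every run per ID, one possible variant (a sketch, not part of the original answer) is to shift within each ID:

```python
import pandas as pd

# Same sample data as above
df = pd.DataFrame({
    'ID': ['A'] * 9 + ['B'] * 9,
    'SWITCH': ['ON'] * 3 + ['OFF'] * 3 + ['ON'] * 3 + ['ON'] * 3 + ['OFF'] * 3 + ['OFF'] * 3,
})

# Compare each row with the previous row of the same ID;
# the first row of every ID is compared with NaN and is therefore flagged
prev = df.groupby('ID')['SWITCH'].shift()
df['SWITCH_CHANGE'] = (df['SWITCH'] != prev).astype(int)

# Run ids are now distinct across IDs; grouping on ID as well does no harm
run_id = df['SWITCH_CHANGE'].cumsum()
df['Cum. Count'] = df.groupby(['ID', run_id]).cumcount() + 1

print(df['SWITCH_CHANGE'].tolist())
print(df['Cum. Count'].tolist())
```

With this variant the 'Cum. Count' column is unchanged; only the flag on the first row of 'B' differs.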