I want to get the 'False' Count for every segment
Question
I have a data set that I have split into segments based on certain criteria. Another column holds 'True' if a value is equal to the one before it and 'False' if it is not. I can get the total count of 'False' values for the entire data set, but I want the count of 'False' values per segment.
My code:
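from collections import Counter
# 1 where the value equals the previous row, 0 otherwise
df['cols2'] = df['cols1'].diff().eq(0).replace({False: 0, True: 1})
# Total number of 'False' (0) values across the whole data set
counter_obj = Counter(df['cols2'])
false_count = counter_obj[False]
# Per-segment attempt: this sums the 1s ('True'), not the 0s ('False')
seg = df.groupby('segID')['cols2'].sum()
print(seg)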
Answer 1
Score: 0
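import pandas as pd
df = pd.DataFrame({'cols1': [1, 2, 3, 3, 4, 5, 5, 6, 7]})
# 1 where the value equals the previous row, 0 otherwise
df['cols2'] = df['cols1'].diff().eq(0).replace({False: 0, True: 1})
# Running total of the 1s: it increases at each repeat, so it works as a group label
df['cs'] = df['cols2'].cumsum()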
I suggest creating a 'cs' column holding the cumulative sum of 'cols2' in order to divide the dataframe into groups. As far as I understand, you need to count only the zeros in each segment.
Input
   cols1  cols2  cs
0      1      0   0
1      2      0   0
2      3      0   0
3      3      1   1
4      4      0   1
5      5      0   1
6      5      1   2
7      6      0   2
8      7      0   2
To aggregate a count by group:
agr = df.groupby('cs').apply(lambda x: x.loc[x['cols2'] == 0, 'cols2'].count())
Output
cs
0    3
1    2
2    2
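The same per-group count can probably also be obtained without apply, by turning the condition into a boolean mask and summing it within each group. A minimal sketch on the sample data above; the name zeros_per_group is only illustrative, and astype(int) is used here in place of replace as an assumption about what reads more simply:

```python
import pandas as pd

df = pd.DataFrame({'cols1': [1, 2, 3, 3, 4, 5, 5, 6, 7]})
# 1 where the value equals the previous row, 0 otherwise
df['cols2'] = df['cols1'].diff().eq(0).astype(int)
# Group label: increases by one at every repeated value
df['cs'] = df['cols2'].cumsum()

# Boolean mask of the zeros, summed within each 'cs' group
zeros_per_group = df['cols2'].eq(0).groupby(df['cs']).sum()
print(zeros_per_group)
```

Summing a boolean mask counts its True entries, so this should give the same numbers as the apply version while avoiding a Python-level lambda per group.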
If you need the count on each row:
df['count'] = df.groupby('cs')['cols2'].transform(lambda x: x[x==0].count())
Output
   cols1  cols2  cs  count
0      1      0   0      3
1      2      0   0      3
2      3      0   0      3
3      3      1   1      2
4      4      0   1      2
5      5      0   1      2
6      5      1   2      2
7      6      0   2      2
8      7      0   2      2
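If the segments already live in a column such as the question's 'segID', the same idea can presumably be applied to that column directly instead of 'cs'. The sketch below assumes made-up 'segID' values, and the names is_repeat, false_per_seg and false_count are illustrative rather than taken from the question's data:

```python
import pandas as pd

# Hypothetical data: the 'segID' values are invented for illustration
df = pd.DataFrame({
    'segID': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b'],
    'cols1': [1, 2, 3, 3, 4, 5, 5, 6, 7],
})

# True where the value equals the previous row, False otherwise
is_repeat = df['cols1'].diff().eq(0)

# Number of 'False' values in each segment
false_per_seg = (~is_repeat).groupby(df['segID']).sum()
print(false_per_seg)

# The same count broadcast back onto every row of its segment
df['false_count'] = (~is_repeat).groupby(df['segID']).transform('sum')
print(df)
```

Note that diff() here runs over the whole column, so the first row of each segment is compared with the last row of the previous segment. If comparisons should stay within a segment, df.groupby('segID')['cols1'].diff() is one way to do that.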