2023-05-29 12:39:40

Pandas dataframe - groupby() blocks of constant value over multiple columns

Question

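The dataframe can be grouped into blocks of constant 'A' and 'B' as follows:

import pandas as pd
df = pd.DataFrame({
   'A': [1,1,1,1,2,2,2,1,1,3,3,3],
   'B': [0,0,1,1,0,0,0,1,1,0,0,0],
})
df.index.names = ['Index']
# Build a grouping key that increments whenever 'A' or 'B' differs from the previous row
group_key = (df['A'].ne(df['A'].shift()) | df['B'].ne(df['B'].shift())).cumsum()
# group_keys=True keeps the key as the outer index level
df = df.groupby(group_key, group_keys=True).apply(lambda x: x)
df.index.names = ['Block', 'Index']
df

This yields the expected result: the dataframe split into blocks in which both 'A' and 'B' are constant.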
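
The same idea can be wrapped in a small helper that accepts any list of columns. This is only a sketch: the name block_ids and its signature are hypothetical, not part of pandas. It compares each row with the previous one via shift(), so it should also work for non-numeric columns. Note that NaN never equals NaN, so every NaN row would open a new block.

```python
import pandas as pd


def block_ids(df: pd.DataFrame, cols: list[str]) -> pd.Series:
    """Label runs of consecutive rows whose values in `cols` are all unchanged.

    Hypothetical helper for illustration; not a pandas API.
    """
    # True where at least one of the columns differs from the previous row.
    # The first row is compared with NaN, so it always starts block 1.
    changed = df[cols].ne(df[cols].shift()).any(axis=1)
    # Each True starts a new block; the running sum turns that into block ids.
    return changed.cumsum().rename("Block")


df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 2, 1, 1, 3, 3, 3],
    'B': [0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0],
})
blocks = block_ids(df, ['A', 'B'])
print(blocks.tolist())
```

The returned Series can be passed straight to df.groupby(blocks).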

I have the following pandas dataframe:

import pandas as pd

df = pd.DataFrame({
   'A': [1,1,1,1,2,2,2,1,1,3,3,3],
   'B': [0,0,1,1,0,0,0,1,1,0,0,0],
})
df.index.names = ['Index']
df
       A  B
Index
0      1  0
1      1  0
2      1  1
3      1  1
4      2  0
5      2  0
6      2  0
7      1  1
8      1  1
9      3  0
10     3  0
11     3  0

I can group this dataframe into blocks of constant 'A' like so:

df = df.groupby(df['A'].diff().ne(0).cumsum()).apply(lambda x: x)
df.index.names = ['Block', 'Index']
df
             A  B
Block Index
1     0      1  0
      1      1  0
      2      1  1
      3      1  1
2     4      2  0
      5      2  0
      6      2  0
3     7      1  1
      8      1  1
4     9      3  0
      10     3  0
      11     3  0

How do I instead group this dataframe into blocks of constant 'A' AND constant 'B'? My desired result is:

             A  B
Block Index
1     0      1  0
      1      1  0
2     2      1  1
      3      1  1
3     4      2  0
      5      2  0
      6      2  0
4     7      1  1
      8      1  1
5     9      3  0
      10     3  0
      11     3  0

Answer 1

Score: 4

Use the same logic, adding any(axis=1) so that a change in either column starts a new block, and pass df.diff().ne(0).any(axis=1).cumsum() as the grouper:

out = df.groupby(df.diff().ne(0).any(axis=1).cumsum(), group_keys=True).apply(lambda x: x)
out.index.names = ['Block', 'Index']

Or:

out = (df.assign(Block=df.diff().ne(0).any(axis=1).cumsum())
         .groupby('Block', group_keys=True)
         .apply(lambda x: x)
       )

Output:

             A  B
Block Index      
1     0      1  0
      1      1  0
2     2      1  1
      3      1  1
3     4      2  0
      5      2  0
      6      2  0
4     7      1  1
      8      1  1
5     9      3  0
      10     3  0
      11     3  0
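
To see why this grouper works, it can help to build it one step at a time. The sketch below uses the dataframe from the question; the intermediate variable names (step, changed, new_block, block) are illustrative only. It then uses the resulting key for ordinary per-block aggregation, which is a common next step once the blocks are labeled.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 2, 1, 1, 3, 3, 3],
    'B': [0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0],
})
df.index.names = ['Index']

# 1. Row-to-row difference per column; the first row has no predecessor, so it is NaN.
step = df.diff()
# 2. True wherever a column changed. NaN != 0, so the first row counts as a change.
changed = step.ne(0)
# 3. A new block starts when at least one column changed.
new_block = changed.any(axis=1)
# 4. The running count of block starts is the block label.
block = new_block.cumsum().rename('Block')

# The key can be used like any other grouper, for example to summarize each block.
sizes = df.groupby(block).size()     # number of rows in each block
firsts = df.groupby(block).first()   # the constant 'A' and 'B' values of each block
print(sizes)
print(firsts)
```

Because diff() subtracts values, this form assumes numeric columns. For string or mixed columns, comparing against df.shift() with ne() is the usual substitute.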
