Does pandas groupby skip everything, including as_index, if it detects that the applied function does nothing?

Question

Does pandas groupby get 'smart' when it knows it doesn't need to do anything?

Given a DataFrame:

test = pd.DataFrame([{'a': 1, 'b': 15}, {'a':1, 'b': 14}])

If we run:

test.groupby('a').apply(lambda x: x.iloc[:-1])

then pandas processes everything and adds the group key to the index:

     a   b
a         
1 0  1  15

But if we naively apply a function that does nothing:

test.groupby('a').apply(lambda x: x)

it returns:

   a   b
0  1  15
1  1  14

The result is the DataFrame itself. Normally we would expect the index to change to 'a', since the default value of 'as_index' is True, but it doesn't.

Answer 1

Score: 2

Yes, pandas is "smart" in the way it processes the output, but not in the way you think.

The decision to add the grouping keys depends on the shape and index of the output. Pandas doesn't detect what the function is doing to the values.
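To make the index criterion concrete, here is a rough sketch of that kind of check. The helper `keeps_group_index` is hypothetical (it is not a pandas function and not pandas' internal code); it only tests whether the applied function returns an object with the same index as each group, which is what the warning below calls "like-indexed".

```python
import pandas as pd

test = pd.DataFrame([{'a': 1, 'b': 15}, {'a': 1, 'b': 14}])


def keeps_group_index(df, by, func):
    """Hypothetical helper: True if func returns a like-indexed
    object (same index as the input) for every group."""
    return all(
        func(group).index.equals(group.index)
        for _, group in df.groupby(by)
    )


# Only the index matters here, not what happens to the values.
identity_like = keeps_group_index(test, 'a', lambda x: x)
plus_one_like = keeps_group_index(test, 'a', lambda x: x + 1)
drop_last_like = keeps_group_index(test, 'a', lambda x: x.iloc[:-1])

print(identity_like, plus_one_like, drop_last_like)  # True True False
```

Both `lambda x: x` and `lambda x: x + 1` keep the index, so they fall into the same category even though one of them changes the values; `x.iloc[:-1]` drops a row and therefore changes the index.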

Let's use pandas 1.5.3 and just add 1:

test.groupby('a').apply(lambda x: x+1)

We get a FutureWarning:

FutureWarning: Not prepending group keys to the result index of transform-like apply. In the future, the group keys will be included in the index, regardless of whether the applied function returns a like-indexed object.
To preserve the previous behavior, use

	>>> .groupby(..., group_keys=False)

To adopt the future behavior and silence this warning, use 

	>>> .groupby(..., group_keys=True)
  test.groupby('a').apply(lambda x: x+1)

The warning spells it out: if the function returns a like-indexed object (the index is unchanged), the grouping keys are not added to the index.

If you want the keys in the index, pass group_keys=True explicitly:

test.groupby('a', group_keys=True).apply(lambda x: x+1)

Output:

     a   b
a         
1 0  2  16
  1  2  15
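As a side-by-side check, the sketch below passes group_keys explicitly in both directions so the result does not depend on the version's default. It assumes pandas 1.5 or newer. Selecting the column with `[['b']]` is my addition, not part of the original answer: it keeps the grouping column out of the applied function, since newer pandas releases handle grouping columns in apply differently.

```python
import pandas as pd

test = pd.DataFrame([{'a': 1, 'b': 15}, {'a': 1, 'b': 14}])

# Explicit group_keys=True: the key 'a' is prepended to the index,
# even though the function keeps each group's index unchanged.
with_keys = test.groupby('a', group_keys=True)[['b']].apply(lambda x: x + 1)

# Explicit group_keys=False: the original index is kept as-is.
without_keys = test.groupby('a', group_keys=False)[['b']].apply(lambda x: x + 1)

print(with_keys.index.nlevels)     # 2 (group key + original index)
print(without_keys.index.nlevels)  # 1 (original index only)
```

Being explicit about group_keys is the safer choice when the code has to run on more than one pandas version.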

huangapple
  • Published on May 29, 2023, 04:20:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76353450.html