2023-02-08 18:43:23

Is there a method to find each maximum value before changing to another index value?

Question

I want to get the maximum of column_b for each run of consecutive identical index_a values, i.e. before index_a changes to another value, as shown in the example dataframe below:

index_a	column_b	all_max
0	10	-
0	20	-
0	30	30
1	50	50
1	30	-
1	20	-
1	10	-
0	70	70
0	60	-
0	40	-

... (so on)

Instead, using the code shown further down, I get results like this:

index_a	column_b	all_max
0	10	70
0	20	70
0	30	70
1	50	70
1	30	70
1	20	70
1	10	70
0	70	70
0	60	70
0	40	70

... (so on)

The runs in index_a do not repeat with a fixed length; some runs have more 1s or 0s than others.

I have tried using the .max() function, but it only gives me the maximum of the whole column_b:

for i, row in df.iloc[:-1].iterrows():
    if df['index_a'][i] == 0:
        df['all_max'][i] = df['column_b'][i].max()
    else:
        df['all_max'][i] = df['column_b'][i].max()

Answer 1

Score: 1

Use groupby.transform on groups of successive identical values to get the max per group as a broadcasted Series, then identify the rows holding the max and assign their value with where, filling the other rows with -:

group = df['index_a'].ne(df['index_a'].shift()).cumsum()
m = df.groupby(group)['column_b'].transform('max').eq(df['column_b'])
df['all_max'] = df['column_b'].where(m, '-')

Output:

   index_a  column_b all_max
0        0        10       -
1        0        20       -
2        0        30      30
3        1        50      50
4        1        30       -
5        1        20       -
6        1        10       -
7        0        70      70
8        0        60       -
9        0        40       -
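
To see why the grouping key works, the one-liner can be split into its intermediate steps. This is only an illustrative sketch: the sample frame is rebuilt from the question's table, and the names `starts` and `run_max` are introduced here for readability and do not appear in the answer.

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame({
    "index_a":  [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    "column_b": [10, 20, 30, 50, 30, 20, 10, 70, 60, 40],
})

# True wherever index_a differs from the previous row. The first row is
# compared against NaN, so it also counts as the start of a run.
starts = df["index_a"].ne(df["index_a"].shift())

# A running count of run starts gives each consecutive run its own label,
# so the second block of 0s is not merged with the first one.
group = starts.cumsum()

# Broadcast each run's maximum back onto every row of that run.
run_max = df.groupby(group)["column_b"].transform("max")

# Keep column_b only where it equals its run's maximum.
m = run_max.eq(df["column_b"])
df["all_max"] = df["column_b"].where(m, "-")

print(df.assign(starts=starts, group=group, run_max=run_max))
```

Grouping by `index_a` directly would pool both blocks of 0s into one group, which is why the cumulative-sum label is needed.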

Answer 2

Score: 0

If you need all maximum values per consecutive group, build the groups by comparing index_a with its shifted values and taking the cumulative sum, use GroupBy.transform to get each group's max, compare it with the original column_b, and assign - where there is no match with Series.where:

m = (df.groupby(df['index_a'].ne(df['index_a'].shift()).cumsum())['column_b']
       .transform('max')
       .eq(df['column_b']))

df['all_max'] = df['column_b'].where(m, '-')
print(df)
   index_a  column_b all_max
0        0        10       -
1        0        20       -
2        0        30      30
3        1        50      50
4        1        30       -
5        1        20       -
6        1        10       -
7        0        70      70
8        0        60       -
9        0        40       -

But if you need to match only the first maximum value in each group, use:

i = df.groupby(df['index_a'].ne(df['index_a'].shift()).cumsum())['column_b'].idxmax()
df['all_max'] = '-'
df.loc[i, 'all_max'] = df['column_b']
print(df)
   index_a  column_b all_max
0        0        10       -
1        0        20       -
2        0        30      30
3        1        50      50
4        1        30       -
5        1        20       -
6        1        10       -
7        0        70      70
8        0        60       -
9        0        40       -

You can see the difference with changed data:

print(df)
   index_a  column_b
0        0        10
1        0        30 <- 2 maximums per first 0 group
2        0        30
3        1        50
4        1        30
5        1        20
6        1        10
7        0        70
8        0        60
9        0        40

m = (df.groupby(df['index_a'].ne(df['index_a'].shift()).cumsum())['column_b']
       .transform('max')
       .eq(df['column_b']))

df['all_max1'] = df['column_b'].where(m, '-')
i = df.groupby(df['index_a'].ne(df['index_a'].shift()).cumsum())['column_b'].idxmax()
df['all_max2'] = '-'
df.loc[i, 'all_max2'] = df['column_b']

print(df)
   index_a  column_b all_max1 all_max2
0        0        10        -        -
1        0        30       30       30
2        0        30       30        -
3        1        50       50       50
4        1        30        -        -
5        1        20        -        -
6        1        10        -        -
7        0        70       70       70
8        0        60        -        -
9        0        40        -        -
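
The two variants can also be wrapped in one function so the tie behaviour becomes an explicit choice. The following is a minimal sketch, not part of the original answer: `mark_run_max` and its parameters `first_only` and `fill` are hypothetical names, and the `first_only` branch assumes the DataFrame index is unique.

```python
import pandas as pd


def mark_run_max(df, key, value, first_only=False, fill="-"):
    """Hypothetical helper: keep `value` only on the row(s) holding the
    maximum of each run of consecutive identical `key` values."""
    # Label each run of consecutive identical key values.
    runs = df[key].ne(df[key].shift()).cumsum()
    if first_only:
        # idxmax returns the index label of the first maximum per run
        # (assumes a unique index).
        first_labels = df.groupby(runs)[value].idxmax()
        mask = df.index.isin(first_labels)
    else:
        # Every row equal to its run's maximum is kept, ties included.
        mask = df.groupby(runs)[value].transform("max").eq(df[value])
    return df[value].where(mask, fill)


# Sample data with a tie (two 30s) in the first run of 0s.
df = pd.DataFrame({
    "index_a":  [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    "column_b": [10, 30, 30, 50, 30, 20, 10, 70, 60, 40],
})

all_ties = mark_run_max(df, "index_a", "column_b")
first_tie = mark_run_max(df, "index_a", "column_b", first_only=True)
print(df.assign(all_max1=all_ties, all_max2=first_tie))
```

The function returns a new Series instead of writing into `df`, which leaves the choice of column name to the caller.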

Answer 3

Score: 0

Create virtual groups to distinguish runs of the same identifier in the index_a column:

df['all_max'] = (df.groupby(df['index_a'].ne(df['index_a'].shift()).cumsum())['column_b']
                   .transform('max'))
print(df)
# Output
   index_a  column_b  all_max
0        0        10       30
1        0        20       30
2        0        30       30
3        1        50       50
4        1        30       50
5        1        20       50
6        1        10       50
7        0        70       70
8        0        60       70
9        0        40       70
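
This output repeats the maximum on every row of a run, while the question wants it only on the row that holds it. One possible way to get that sparse layout and still keep the column numeric is to mask with NaN instead of the '-' string. This is a sketch under that assumption, not something the answer itself proposes; `runs` and `run_max` are names introduced here.

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame({
    "index_a":  [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    "column_b": [10, 20, 30, 50, 30, 20, 10, 70, 60, 40],
})

# Same virtual groups as above: one label per run of identical index_a values.
runs = df["index_a"].ne(df["index_a"].shift()).cumsum()
run_max = df.groupby(runs)["column_b"].transform("max")

# where() without a replacement value fills non-matching rows with NaN,
# so the result is a float64 column rather than a mixed object column.
df["all_max"] = df["column_b"].where(df["column_b"].eq(run_max))
print(df)
```

A numeric column with NaN can still be aggregated or plotted directly, whereas the '-' placeholder turns the column into object dtype.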
