How to get second max value of each row in a pandas dataframe

Question

The code below computes the maximum value of each row of a pandas DataFrame, stores it in a new column named 'max', and then builds a list named maxV from that column.

df["max"] = df1.max(axis=1)
maxV = df['max'].tolist()

How can I get the second-largest value of each row in a new column named 'sec_max'?

Answer 1

Score: 1

import pandas as pd
import numpy as np

# Function to get the second maximum value in each row
def get_second_max(row):
    # Use np.partition to efficiently find the second largest value
    second_largest = np.partition(row, -2)[-2]
    return second_largest

# Apply the function to each row and store the result in 'sec_max' column
df['sec_max'] = df.apply(get_second_max, axis=1)

# Convert the 'sec_max' column to a new list named sec_maxV
sec_maxV = df['sec_max'].tolist()
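The same np.partition idea can also be applied to the whole array at once instead of row by row through apply. The following is a minimal sketch, not part of the original answer; the sample data and the value_cols list are assumptions made for illustration. Restricting the computation to the original columns matters if a derived column such as 'max' has already been added to the same frame, because otherwise that column would take part in the ranking.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the original answer)
df = pd.DataFrame({"Col1": [1, 6, 7], "Col2": [3, 4, 8], "Col3": [2, 5, 8]})

# Assumed list of the original value columns; derived columns are left out
value_cols = ["Col1", "Col2", "Col3"]
values = df[value_cols].to_numpy()

# Partitioning along axis=1 puts each row's second-largest element at index -2
df["sec_max"] = np.partition(values, -2, axis=1)[:, -2]
sec_maxV = df["sec_max"].tolist()
```

With ties for the maximum, as in the last row, this returns the maximum itself as the second-largest value.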

Answer 2

Score: 0

Another solution, using np.sort:

df[['sec_max', 'maxV']] = np.sort(df, axis=1)[:, -2:]
print(df)

Prints:

   Col1  Col2  Col3  sec_max  maxV
0     1     3     2        2     3
1     6     4     5        5     6
2     7     8     8        8     8

df used:

   Col1  Col2  Col3
0     1     3     2
1     6     4     5
2     7     8     8
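For a reproducible version of this example, the frame shown above can be built explicitly. The sketch below is an illustration rather than the answer's exact code: it sorts a NumPy copy of the values and assigns the two result columns one at a time, which is an assumption made here to keep each assignment simple.

```python
import numpy as np
import pandas as pd

# Rebuild the frame shown under "df used"
df = pd.DataFrame({"Col1": [1, 6, 7], "Col2": [3, 4, 8], "Col3": [2, 5, 8]})

# Sort each row in ascending order; the last two positions hold the top two values
sorted_vals = np.sort(df.to_numpy(), axis=1)
df["sec_max"] = sorted_vals[:, -2]
df["maxV"] = sorted_vals[:, -1]
print(df)
```

Sorting every row costs more than a partition, but it returns both the maximum and the second-largest value in one step.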

Answer 3

Score: 0

Here you are:

import pandas as pd

data = {"col1": [420, 380, 390],
        "col2": [50, 40, 45],
        "col3": [102, 60, 700]}

df = pd.DataFrame(data)

# For each row, find the largest value strictly below the row maximum.
# Ties for the maximum are skipped, so a row [7, 8, 8] yields 7, not 8.
for i in range(len(df.index)):
    current_row = df.iloc[i].values
    max1 = current_row.max()
    max2 = current_row.min()
    for value in current_row:
        if max2 < value < max1:
            max2 = value
    print(max1, max2)
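The loop above only prints its results. To store them in a 'sec_max' column, as the question asks, the same logic can be wrapped in a function. This is a sketch under the same strict comparison as the loop; the function name second_distinct_max is hypothetical.

```python
import pandas as pd

def second_distinct_max(row):
    """Largest value strictly below the row maximum; the maximum itself if all values are equal."""
    top = row.max()
    below = row[row < top]
    return below.max() if len(below) else top

df = pd.DataFrame({"col1": [420, 380, 390],
                   "col2": [50, 40, 45],
                   "col3": [102, 60, 700]})

# Apply the function to each row and keep the result as a new column
df["sec_max"] = df.apply(second_distinct_max, axis=1)
sec_maxV = df["sec_max"].tolist()
```

Unlike the partition and sort approaches, this variant ignores repeated maxima, so it answers "second-largest distinct value" rather than "second element in sorted order".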

Answer 4

Score: 0

You can also use this:

df['2nd largest'] = df.apply(lambda row: row.nlargest(2).values[-1], axis=1)

Example Input:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [2, 1, 10]})
print(df)

df:

   a  b   c
0  1  3   2
1  2  4   1
2  3  5  10

Code:

df['2nd largest'] = df.apply(lambda row: row.nlargest(2).values[-1], axis=1)
print(df)

Output:

   a  b   c  2nd largest
0  1  3   2            2
1  2  4   1            2
2  3  5  10            5
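One detail worth checking with nlargest is how it treats ties. The sketch below reuses the example frame and adds a hypothetical row with a repeated maximum; the cols list is an assumption that keeps the computation on the original columns only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5], "c": [2, 1, 10]})

# Assumed list of the original value columns
cols = ["a", "b", "c"]
df["2nd largest"] = df[cols].apply(lambda row: row.nlargest(2).iloc[-1], axis=1)

# Hypothetical row with a tie for the maximum: nlargest keeps both 8s
tied = pd.Series([7, 8, 8])
second_of_tied = tied.nlargest(2).iloc[-1]
```

Here second_of_tied is 8, so nlargest behaves like the sort-based answer and counts duplicates of the maximum.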

huangapple
  • Published on July 18, 2023, 03:00:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76707387.html