How to get the second max value of each row in a pandas dataframe

Question

This gets the maximum value of each row in a pandas dataframe, stores it in a new column named 'max',
and then creates a new list named maxV from that column.

    df["max"] = df1.max(axis=1)
    maxV = df['max'].tolist()

How can I get the second max value of each row in a new column named 'sec_max'?

Answer 1

Score: 1

    import pandas as pd
    import numpy as np

    # Function to get the second maximum value in each row
    def get_second_max(row):
        # np.partition puts the second-largest value at index -2 without a full sort
        second_largest = np.partition(row.to_numpy(), -2)[-2]
        return second_largest

    # Apply the function to each row and store the result in the 'sec_max' column
    df['sec_max'] = df.apply(get_second_max, axis=1)

    # Convert the 'sec_max' column to a new list named sec_maxV
    sec_maxV = df['sec_max'].tolist()
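The same np.partition idea can be applied to the whole array at once instead of row by row through apply. The following is a minimal sketch, assuming every column in value_cols is numeric; the sample data and the name value_cols are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not taken from the question)
df = pd.DataFrame({"Col1": [1, 6, 7], "Col2": [3, 4, 8], "Col3": [2, 5, 8]})

# Hypothetical list of the columns to compare
value_cols = ["Col1", "Col2", "Col3"]
values = df[value_cols].to_numpy()

# Partition every row so that position -2 holds the value it would have
# in sorted order, then pick that position for all rows at once
df["sec_max"] = np.partition(values, -2, axis=1)[:, -2]
sec_maxV = df["sec_max"].tolist()
print(sec_maxV)
```

If the maximum occurs more than once in a row, this returns the maximum itself for that row, just as the apply-based version does.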

Answer 2

Score: 0

Another solution, using np.sort:

    df[['sec_max', 'maxV']] = np.sort(df, axis=1)[:, -2:]
    print(df)

Prints:

       Col1  Col2  Col3  sec_max  maxV
    0     1     3     2        2     3
    1     6     4     5        5     6
    2     7     8     8        8     8

df used:

       Col1  Col2  Col3
    0     1     3     2
    1     6     4     5
    2     7     8     8
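One thing to watch with this approach: np.sort(df, axis=1) sorts every column currently in df, so once 'sec_max' and 'maxV' have been added, running the line again would include them in the sort. A small sketch that pins the source columns, assuming the sample df above (source_cols is a hypothetical name):

```python
import numpy as np
import pandas as pd

# The sample frame shown in this answer
df = pd.DataFrame({"Col1": [1, 6, 7], "Col2": [3, 4, 8], "Col3": [2, 5, 8]})

# Hypothetical name: fix the columns to compare up front so the result
# columns never feed back into the sort
source_cols = ["Col1", "Col2", "Col3"]

# Sort each row ascending and keep the last two values
# (second largest, largest)
df[["sec_max", "maxV"]] = np.sort(df[source_cols].to_numpy(), axis=1)[:, -2:]

# Running the same statement a second time leaves the result unchanged
df[["sec_max", "maxV"]] = np.sort(df[source_cols].to_numpy(), axis=1)[:, -2:]
print(df)
```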

Answer 3

Score: 0

Here you are:

    import pandas as pd
    import numpy as np

    data = {"col1": [420, 380, 390],
            "col2": [50, 40, 45],
            "col3": [102, 60, 700]}
    df = pd.DataFrame(data)

    for i in range(len(df.index)):
        currentRow = df.iloc[i].values
        max1 = currentRow.max()
        # Start from the row minimum and move up to the largest value below max1
        max2 = currentRow.min()
        for j in range(len(currentRow)):
            if max2 < currentRow[j] < max1:
                max2 = currentRow[j]
        print(max1, max2)
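The loop above looks for the largest value strictly below the row maximum. A vectorized sketch of that idea follows, using the same data. It is not the answer author's code, and it differs in one deliberate way: a row whose values are all equal yields NaN here, whereas the loop would report the row minimum.

```python
import numpy as np
import pandas as pd

data = {"col1": [420, 380, 390],
        "col2": [50, 40, 45],
        "col3": [102, 60, 700]}
df = pd.DataFrame(data)

# Work on floats so -inf and NaN can be used as sentinels
values = df.to_numpy(dtype=float)
row_max = values.max(axis=1, keepdims=True)

# Hide every occurrence of the row maximum, then take the max of the rest
masked = np.where(values < row_max, values, -np.inf)
second = masked.max(axis=1)

# Rows with no value strictly below the maximum are marked as NaN
second[second == -np.inf] = np.nan

df["sec_max"] = second
print(df)
```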

Answer 4

Score: 0

You can also use this:

    df['2nd largest'] = df.apply(lambda row: row.nlargest(2).values[-1], axis=1)

Example input:

    import pandas as pd
    df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [2, 1, 10]})
    print(df)

df:

       a  b   c
    0  1  3   2
    1  2  4   1
    2  3  5  10

Code:

    df['2nd largest'] = df.apply(lambda row: row.nlargest(2).values[-1], axis=1)
    print(df)

Output:

       a  b   c  2nd largest
    0  1  3   2            2
    1  2  4   1            2
    2  3  5  10            5

huangapple
  • Posted on July 18, 2023, 03:00:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76707387.html