Python Pandas: how to return specific rows based on conditions and keep a specific column's values only for those rows

Question

I have a DataFrame and, based on multiple conditions, need to pick out the matching rows and create a new column that stores a value for those rows.

Example

import pandas as pd
record = {
    'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya'],
    'Age': [21, 19, 20, 18, 17, 21],
    'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
    'Percentage': [88, 92, 95, 70, 65, 78],
    'hours': [1, 2, 3, 4, 5, 6],
}
dataframe = pd.DataFrame(record)

Condition: if Age is 19 or 21 and Stream is Math or Commerce, return hours along with the other columns, and store those hours in a newly created column. Rows that do not match get 0 in that column.

Expected output, with the new column new_column added:

Name       Age  Stream    Percentage  hours  new_column
Ankit      21   Math      88          1      1
Amit       19   Commerce  92          2      2
Aishwarya  20   Science   95          3      0
Priyanka   18   Math      70          4      0
Priya      17   Math      65          5      0
Shaurya    21   Science   78          6      0

new_column is 0 wherever the filter conditions aren't satisfied.
I tried the code below, but the results are not as expected, and it is not a simple solution.

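  • Conditions:
options1 = ['Math', 'Commerce']
options2 = [21, 19]

# keep only the rows that satisfy both conditions
dataframe1 = dataframe[(dataframe['Stream'].isin(options1)) & (dataframe['Age'].isin(options2))]
dataframe1['new_column'] = dataframe1['hours']

# merging on 'Name' duplicates the shared columns with _x/_y suffixes
# and leaves NaN (not 0) in new_column for the rows that did not match
dataframe = pd.merge(dataframe, dataframe1, on='Name', how='left')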
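
To see why the merge attempt looks wrong, here is a minimal sketch that rebuilds the example data and inspects the merged result. The names `mask`, `merged`, and `fixed` are illustrative and not from the original post, and the `.copy()` call is an addition to avoid a SettingWithCopyWarning. The second merge is one possible repair, not necessarily the simplest approach.

```python
import pandas as pd

record = {
    'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya'],
    'Age': [21, 19, 20, 18, 17, 21],
    'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
    'Percentage': [88, 92, 95, 70, 65, 78],
    'hours': [1, 2, 3, 4, 5, 6],
}
dataframe = pd.DataFrame(record)

# Same filter as the attempt; .copy() is an addition (not in the original post)
mask = dataframe['Stream'].isin(['Math', 'Commerce']) & dataframe['Age'].isin([21, 19])
dataframe1 = dataframe[mask].copy()
dataframe1['new_column'] = dataframe1['hours']

# The original merge: every shared non-key column appears twice (_x and _y),
# and new_column is NaN for the rows that were filtered out
merged = pd.merge(dataframe, dataframe1, on='Name', how='left')
print(merged.columns.tolist())

# Possible repair: merge only the key and the new column, then fill the gaps with 0
fixed = pd.merge(dataframe, dataframe1[['Name', 'new_column']], on='Name', how='left')
fixed['new_column'] = fixed['new_column'].fillna(0).astype(int)
print(fixed)
```

Even after the repair, the merge relies on Name being unique, which is an assumption about the data.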

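I also tried the code below:

dataframe['New'] = dataframe['hours']
# note: this yields 0 when the conditions match (the reverse of the goal),
# 'Maths' never matches the 'Math' values in the data,
# and the else branch returns the whole column instead of the row's value
dataframe_bkp.loc[:, ['New']] = dataframe_bkp[['Stream', 'Age', 'New']].apply(
    lambda x: 0 if (x.Stream in ['Maths', 'Commerce'] and x.Age in [19, 21]) else dataframe_bkp['New'],
    axis=1,
)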
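
For comparison, a row-wise version of this idea that does produce the expected column might look like the sketch below. The helper name `hours_if_match` is hypothetical, and the code works on `dataframe` directly because `dataframe_bkp` is not defined in the post. Row-wise `apply` is usually slower than a vectorized mask, so treat this as an illustration of the corrected logic only.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya'],
    'Age': [21, 19, 20, 18, 17, 21],
    'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
    'Percentage': [88, 92, 95, 70, 65, 78],
    'hours': [1, 2, 3, 4, 5, 6],
})


def hours_if_match(row):
    """Return the row's hours when both conditions hold, otherwise 0."""
    if row['Stream'] in ('Math', 'Commerce') and row['Age'] in (19, 21):
        return row['hours']
    return 0


# axis=1 passes one row at a time to the helper
dataframe['New'] = dataframe.apply(hours_if_match, axis=1)
print(dataframe)
```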

Answer 1

Score: 2

Use Series.where:

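# isin matches exactly 19 or 21; between(19, 21) would also match 20
mask = df['Age'].isin([19, 21]) & df['Stream'].isin(['Math', 'Commerce'])
# keep hours where the mask is True, otherwise use 0
df['new_column'] = df['hours'].where(mask, 0)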

Alternative:

import numpy as np

m1 = df['Age'].isin([19, 21])  # exactly 19 or 21
m2 = df['Stream'].isin(['Math', 'Commerce'])

# take hours where both masks are True, otherwise 0
df['new_column'] = np.where(m1 & m2, df['hours'], 0)

Output:

        Name  Age    Stream  Percentage  hours  new_column
0      Ankit   21      Math          88      1           1
1       Amit   19  Commerce          92      2           2
2  Aishwarya   20   Science          95      3           0
3   Priyanka   18      Math          70      4           0
4      Priya   17      Math          65      5           0
5    Shaurya   21   Science          78      6           0
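
As a quick self-contained check, the sketch below rebuilds the example data and confirms that the `Series.where` and `np.where` approaches give the same column. It also illustrates that `between(19, 21)` is inclusive and matches 20, while the question asks for exactly 19 or 21. The variable names and the small `ages` series are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya'],
    'Age': [21, 19, 20, 18, 17, 21],
    'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
    'Percentage': [88, 92, 95, 70, 65, 78],
    'hours': [1, 2, 3, 4, 5, 6],
})

mask = df['Age'].isin([19, 21]) & df['Stream'].isin(['Math', 'Commerce'])

# Two equivalent ways to build the column from the same mask
via_where = df['hours'].where(mask, 0)
via_numpy = np.where(mask, df['hours'], 0)
same_result = via_where.tolist() == via_numpy.tolist()
print(same_result)

# Hypothetical ages showing how the two age tests differ
ages = pd.Series([19, 20, 21])
exact = ages.isin([19, 21]).tolist()      # only 19 and 21 match
ranged = ages.between(19, 21).tolist()    # 20 matches as well (inclusive range)
print(exact, ranged)
```

On the example data both age tests give the same output, because the only 20-year-old is in Science. They would diverge for a 20-year-old in Math or Commerce.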

huangapple
  • Posted on March 3, 2023, 20:52:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75627325.html