March 3, 2023, 20:52:37

Python - Pandas - How to return specific rows based on conditions and return the specific column rows only

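Question

I have a DataFrame and, based on multiple conditions, need to return the matching rows and create a new column to store values from those rows.

Example:

record = {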
  'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya'],
  'Age': [21, 19, 20, 18, 17, 21],
  'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
  'Percentage': [88, 92, 95, 70, 65, 78],
  'hours': [1, 2, 3, 4, 5, 6]
}

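Condition: if Age is 19 or 21 and Stream is Math or Commerce, return hours along with the other records, and store these hours in a newly created column for the rows returned.

Example output, with the new column added as `new_column`: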

Name       Age  Stream    Percentage  hours  new_column
Ankit      21   Math      88          1      1
Amit       19   Commerce  92          2      2
Aishwarya  20   Science   95          3      0
Priyanka   18   Math      70          4      0
Priya      17   Math      65          5      0
Shaurya    21   Science   78          6      0

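A 0 in `new_column` means the filter conditions were not satisfied for that row.
I tried the code below, but the results are not as expected, and it is not a simple solution.

Conditions: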

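```python
options1 = ['Math', 'Commerce']
options2 = [21, 19]

# First attempt: filter the matching rows, copy hours, then merge back.
dataframe1 = dataframe[(dataframe['Stream'].isin(options1)) & (dataframe['Age'].isin(options2))]
dataframe1['new_column'] = dataframe1['hours']

# Merging the two full frames on Name duplicates the other shared columns
# (with _x/_y suffixes) and leaves NaN, not 0, in the unmatched rows.
dataframe = pd.merge(dataframe, dataframe1, on='Name', how='left')

# Also tried the following row-wise apply. Note that 'Maths' never matches
# the data's 'Math', the matching rows get 0 instead of their hours, and the
# else branch returns the whole column rather than the row's own value.
dataframe['New'] = dataframe['hours']
dataframe_bkp.loc[:, ['New']] = dataframe_bkp[['Stream', 'Age', 'New']].apply(
    lambda x: 0 if (x.Stream in ['Maths', 'Commerce'] and (x.Age in [19, 21])) else dataframe_bkp['New'],
    axis=1,
)
```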


Answer 1

Score: 2

Use Series.where:

df['new_column'] = df['hours'].where(df['Age'].between(19, 21) & df['Stream'].isin(['Math', 'Commerce']), 0)
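For reference, here is a self-contained sketch of the line above. It assumes `df` is built from the question's `record` dictionary (the answer does not show that step), and the name `mask` is introduced here only for readability.

```python
import pandas as pd

# Data copied from the question's `record` dictionary.
record = {
    'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya'],
    'Age': [21, 19, 20, 18, 17, 21],
    'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
    'Percentage': [88, 92, 95, 70, 65, 78],
    'hours': [1, 2, 3, 4, 5, 6],
}
df = pd.DataFrame(record)  # assumption: this is how `df` is created

# Boolean mask: True only for rows that satisfy both conditions.
mask = df['Age'].between(19, 21) & df['Stream'].isin(['Math', 'Commerce'])

# Series.where keeps `hours` where the mask is True and substitutes 0 elsewhere.
df['new_column'] = df['hours'].where(mask, 0)
print(df)
```

No filtering or merging is needed with this approach: every row of the original frame is kept, and only the new column differs between matching and non-matching rows.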

Alternative:

import numpy as np

m1 = df['Age'].between(19, 21)
m2 = df['Stream'].isin(['Math', 'Commerce'])

df['new_column'] = np.where(m1 & m2, df['hours'], 0)

Output:

        Name  Age    Stream  Percentage  hours  new_column
0      Ankit   21      Math          88      1           1
1       Amit   19  Commerce          92      2           2
2  Aishwarya   20   Science          95      3           0
3   Priyanka   18      Math          70      4           0
4      Priya   17      Math          65      5           0
5    Shaurya   21   Science          78      6           0
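One caveat worth checking: the question asks for Age equal to 19 or 21, while `between(19, 21)` is inclusive on both ends and therefore also accepts 20. With the question's data the result is the same, because the only 20-year-old is in Science. The sketch below uses a small hypothetical frame (these rows are not from the question) to show where the two masks diverge; the column names `with_between` and `with_isin` are made up for illustration.

```python
import pandas as pd

# Hypothetical rows (not from the question): the second one has Age 20,
# which lies between 19 and 21 but is not one of the two listed ages.
df = pd.DataFrame({
    'Name': ['A', 'B', 'C'],
    'Age': [19, 20, 21],
    'Stream': ['Math', 'Math', 'Commerce'],
    'hours': [1, 2, 3],
})

in_stream = df['Stream'].isin(['Math', 'Commerce'])

# between(19, 21) is inclusive on both ends, so it also accepts Age 20.
df['with_between'] = df['hours'].where(df['Age'].between(19, 21) & in_stream, 0)

# isin([19, 21]) accepts only the two listed ages.
df['with_isin'] = df['hours'].where(df['Age'].isin([19, 21]) & in_stream, 0)
print(df)
```

If the intent is strictly "19 or 21", the `isin` mask is probably the safer choice; if a 19 to 21 range is intended, `between` is fine.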

