2023-02-08 09:07:44

For each group add a new shifted column based on the value in another column

Question

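import pandas as pd

data = {
    'id': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB'],
    'Name': ['A', 'B', 'C', 'C', 'D', 'B'],
    'type': ['xx', 'yy', 'xx', 'xx', 'zz', 'yy'],
    'start': ['yes', 'no', 'no', 'yes', 'no', 'no']
}
df = pd.DataFrame(data)

def create_shifted_column(df):
    # Work on a copy so the caller's frame is left untouched
    out = df.copy()
    # The 'no' rows supply the values of the new 'NAG' column
    out['NAG'] = out['Name']
    # Keep 'Name' only on the 'yes' rows, then fill it down within each id group
    out['Name'] = out['Name'].where(out['start'] == 'yes')
    out['Name'] = out.groupby('id')['Name'].ffill()
    # Keep only the 'no' rows and drop the helper column
    return out[out['start'] == 'no'].drop(columns='start')

result = create_shifted_column(df)
print(result)

This code adds a new column 'NAG' holding the 'Name' values of the rows where 'start' is 'no', replaces 'Name' in those rows with the value from the 'yes' row of the same id, and then drops the 'yes' rows and the 'start' column.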

Given the dataframe below, I am trying to add a new shifted column based on the yes/no value in the column start. However, my attempts have not worked.

id     Name     type    start
AAA    A         xx      yes
AAA    B         yy      no
AAA    C         xx      no
BBB    C         xx      yes
BBB    D         zz      no
BBB    B         yy      no

In the dataframe above, for each row with the value "no" in the column "start", I would like to add a new shifted column holding that row's value from "Name", and also replace the value in the column "Name" itself with the one from the preceding "yes" row.

Example of the expected output (the column start can be deleted after the operation)

id     Name     type      NAG   
AAA    A         xx        B
AAA    A         yy        C
BBB    C         xx        D
BBB    C         zz        B

Even better (though I could also fix this afterwards using a dictionary, so it is probably not worth including unless you have a better solution):

id     Name     type      NAG   typeNAG  
AAA    A         xx        B       yy
AAA    A         xx        C       xx
BBB    C         xx        D       zz
BBB    C         xx        B       yy

My very poor attempt:

def n_issue(row):
    if row['start'] == "no":
        return row['issueLabel']
    else:
        pass
ag["nag"] = ag.apply(n_issue, axis=1)

But using the above I cannot shift the column.

Any solution is very much appreciated!

Answer 1

Score: 1

You can use the following code:

# Duplicate the Name and type columns
df['NAG'] = df['Name']
df['typeNAG'] = df['type']
# Blank out the Name and type values where start=no (they become NaN)
df['Name'] = df['Name'][df['start']=='yes']
df['type'] = df['type'][df['start']=='yes']
# Fill the empty cells with the value from the cell above
df.ffill(inplace=True)
# Delete the start=yes rows and the start column
df = df[df['start']=='no'].drop(columns='start')

Output:

id     Name     type      NAG   typeNAG  
AAA    A         xx        B       yy
AAA    A         xx        C       xx
BBB    C         xx        D       zz
BBB    C         xx        B       yy
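
The forward fill above works because every id group in the sample begins with its "yes" row. As a hedged alternative for data where that ordering is not guaranteed, the sketch below looks up each group's "yes" row by id instead, which is close to the dictionary idea mentioned in the question. It assumes exactly one "yes" row per id; the names `starts` and `out` are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Name": ["A", "B", "C", "C", "D", "B"],
    "type": ["xx", "yy", "xx", "xx", "zz", "yy"],
    "start": ["yes", "no", "no", "yes", "no", "no"],
})

# One row per id holding the Name/type of that group's "yes" row
# (assumption: each id has exactly one "yes" row, so the index is unique)
starts = df[df["start"] == "yes"].set_index("id")[["Name", "type"]]

# The "no" rows supply the NAG/typeNAG values
out = (
    df[df["start"] == "no"]
    .drop(columns="start")
    .rename(columns={"Name": "NAG", "type": "typeNAG"})
)

# Look up the group's starting Name/type by id, independent of row order
out["Name"] = out["id"].map(starts["Name"])
out["type"] = out["id"].map(starts["type"])

# Reorder the columns to match the layout asked for in the question
out = out[["id", "Name", "type", "NAG", "typeNAG"]].reset_index(drop=True)
print(out)
```

If an id had several "yes" rows, the lookup index would no longer be unique and the `map` calls would fail, so that case would need a different rule for choosing the starting row.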

Answer 2

Score: 1

You could try the following:

m = df["start"] == "yes"
res = (
    df[m].merge(df[~m], on="id", suffixes=("", "NAG"))
    .drop(columns=["start", "startNAG"])
)

Merge df with itself on the column id, using only the "yes" rows on the left and only the "no" rows on the right. The suffixes yield column names close to what you want.
Then drop the start columns.

Result for the sample:

    id Name type NameNAG typeNAG
0  AAA    A   xx       B      yy
1  AAA    A   xx       C      xx
2  BBB    C   xx       D      zz
3  BBB    C   xx       B      yy
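
If the exact column names from the question (NAG and typeNAG) are wanted, the merged result can be renamed afterwards. The following is a minimal sketch, assuming each id has exactly one "yes" row; the `validate` argument and the `rename` step are additions for illustration, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Name": ["A", "B", "C", "C", "D", "B"],
    "type": ["xx", "yy", "xx", "xx", "zz", "yy"],
    "start": ["yes", "no", "no", "yes", "no", "no"],
})

m = df["start"] == "yes"
res = (
    # validate="one_to_many" raises a MergeError if an id has more than one "yes" row
    df[m].merge(df[~m], on="id", suffixes=("", "NAG"), validate="one_to_many")
    .drop(columns=["start", "startNAG"])
    # Assumption: the shifted Name column should be called "NAG", as in the question
    .rename(columns={"NameNAG": "NAG"})
)
print(res)
```

The `validate` check is optional; it only guards the assumption that the left side has a single starting row per id, so that each "no" row appears once in the result.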
