英文:
For each group add a new shifted column based on the value in another column
问题
import pandas as pd
data = {
'id': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB'],
'Name': ['A', 'B', 'C', 'C', 'D', 'B'],
'type': ['xx', 'yy', 'xx', 'xx', 'zz', 'yy'],
'start': ['yes', 'no', 'no', 'yes', 'no', 'no']
}
df = pd.DataFrame(data)
def create_shifted_column(df):
df['NAG'] = df['Name'].shift(-1)
df.loc[df['start'] == 'yes', 'Name'] = df['NAG']
df.drop('start', axis=1, inplace=True)
create_shifted_column(df)
print(df)
This code will create a new shifted column 'NAG' based on the 'Name' column when 'start' is 'no' and update the 'Name' column accordingly. The 'start' column is then dropped to match your expected output.
英文:
Given the dataframe below, I am trying to add a new shifted column based on yes/no value in the column start. However, my attempts are not really effective.
id Name type start
AAA A xx yes
AAA B yy no
AAA C xx no
BBB C xx yes
BBB D zz no
BBB B yy no
In the dataframe above, given the value "no" in column "start", I would like to add a new shifted column with the value from "Name", as well as change the value on the column "Name" itself.
Example of the expected output (the column start can be deleted after the operation)
id Name type NAG
AAA A xx B
AAA A yy C
BBB C xx D
BBB C zz B
Even better (but this I can also fix it using a dictionary afterwards, probably not worth including it unless you have a better solution):
id Name type NAG typeNAG
AAA A xx B yy
AAA A xx C xx
BBB C xx D zz
BBB C xx B yy
My very poor attempt:
def n_issue(row):
if row['start'] == "no":
return row['issueLabel']
else:
pass
ag["nag"] = ag(n_issue, axis=1)
But using the above I cannot shift the column..
Any solution is very much appreciated!
答案1
得分: 1
# 复制 Name 和 type 列
df['NAG'] = df['Name']
df['typeNAG'] = df['type']
# 删除 start=no 的 Name 和 type 值
df['Name'] = df['Name'][df['start']=='yes']
df['type'] = df['type'][df['start']=='yes']
# 用上方单元格的值填充空单元格
df.ffill(inplace=True)
# 删除 start=yes 的行和 start 列
df = df[df['start']=='no']
df.drop(['start'], inplace=True, axis=1)
英文:
You can use the following code:
# Duplicate Name and type columns
df['NAG'] = df['Name']
df['typeNAG'] = df['type']
# Delete Name and type values where start=no
df['Name'] = df['Name'][df['start']=='yes']
df['type'] = df['type'][df['start']=='yes']
# Fill the empty cells with the cell above
df.ffill(inplace=True)
# Delete the start=yes rows and the start column
df = df[df['start']=='no']
df.drop(['start'], inplace=True, axis=1)
Output:
id Name type NAG typeNAG
AAA A xx B yy
AAA A xx C xx
BBB C xx D zz
BBB C xx B yy
答案2
得分: 1
你可以尝试以下操作:
m = df["start"] == "yes"
res = (
df[m].merge(df[~m], on="id", suffixes=("", "NAG"))
.drop(columns=["start", "startNAG"])
)
- 在列
id
上将df
与自身合并,但左侧只包括"yes"
行,右侧只包括"no"
行。使用后缀,以获得接近你想要的结果。 - 删除
start
列。
样本结果:
id Name type NameNAG typeNAG
0 AAA A xx B yy
1 AAA A xx C xx
2 BBB C xx D zz
3 BBB C xx B yy
英文:
You could try the folowing:
m = df["start"] == "yes"
res = (
df[m].merge(df[~m], on="id", suffixes=("", "NAG"))
.drop(columns=["start", "startNAG"])
)
- Merge
df
with itself on the columnid
, but on the left only with"yes"
-rows and on the right only with"no"
-rows. Use suffixes that yield something close to what you want. - Drop the
start
-columns.
Result for the sample:
id Name type NameNAG typeNAG
0 AAA A xx B yy
1 AAA A xx C xx
2 BBB C xx D zz
3 BBB C xx B yy
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论