pandas: how to replace string values in a column based on multiple conditions

Question

I need help with the following problem. In column "level_2" I want to replace "pe60" with "pe61" when the same row has "b" in column "level_1", and likewise replace "pe70" with "pe71" when the row has "b" in column "level_1".
My attempt, which does not work, is:

import pandas as pd
data = {'Name': ['Tom','nick','krish','jack','bob'],
        'level_1': ['a', 'b', 'a', 'b','a'],
        'level_2': ['pe60', 'pe70', 'pe71', 'pe60','pe60'],
        'level_3': [-2, -1, 4, 6,-4],
        }

df = pd.DataFrame(data)

print(df)

def f(row):
    if (row['level_2'] == 'pe60') & (row['level_1'] == 'b'):
        val = (row['level_2'] == 'pe61')
    elif (row['level_2'] == 'pe70') & (row['level_1'] == 'b'):
        val = (row['level_2'] == 'pe71')
    else:
        val = row['level_2']
    return val
df['level_2'] = df.apply(f, axis=1)
print(df)

The expected result is:

data_sol = {'Name': ['Tom', 'nick', 'krish', 'jack','bob'],
        'level_1': ['a', 'b', 'a', 'b','a'],
        'level_2': ['pe60', 'pe71', 'pe71', 'pe61','pe60'],
        'level_3': [-2, -1, 4, 6,-4],
        }

df_solution = pd.DataFrame(data_sol)
print(df_solution)

How can I solve this?

Answer 1

Score: 1

There are many ways to do this. Here's one way.

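def my_func(x):
    # raw=True passes each row as a NumPy array, so columns are read by
    # position: x[1] is level_1 and x[2] is level_2.
    return [
        x[0],
        x[1],
        "pe61" if (x[1]=="b" and x[2]=="pe60") else ("pe71" if (x[1]=="b" and x[2]=="pe70") else x[2]),
        x[3]
    ]

fixed_df = df.apply(my_func, axis=1, raw=True)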

The output:

    Name level_1 level_2 level_3
0    Tom       a    pe60      -2
1   nick       b    pe71      -1
2  krish       a    pe71       4
3   jack       b    pe61       6
4    bob       a    pe60      -4

Here's another way.

def my_func2(x):
    return "pe61" if (x[1]=="b" and x[2]=="pe60") else ("pe71" if (x[1]=="b" and x[2]=="pe70") else x[2])

df["level_2"] = df.apply(my_func2, axis=1, raw=True)

The new df:

    Name level_1 level_2  level_3
0    Tom       a    pe60       -2
1   nick       b    pe71       -1
2  krish       a    pe71        4
3   jack       b    pe61        6
4    bob       a    pe60       -4
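
The nested conditional above can also be expressed without a row-wise function. The following is only a sketch of one alternative, using `numpy.select` (a technique not used in this answer); the names `is_b`, `conditions`, and `choices` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'nick', 'krish', 'jack', 'bob'],
    'level_1': ['a', 'b', 'a', 'b', 'a'],
    'level_2': ['pe60', 'pe70', 'pe71', 'pe60', 'pe60'],
    'level_3': [-2, -1, 4, 6, -4],
})

# Boolean masks, one per replacement rule (illustrative names).
is_b = df['level_1'] == 'b'
conditions = [
    is_b & (df['level_2'] == 'pe60'),
    is_b & (df['level_2'] == 'pe70'),
]
choices = ['pe61', 'pe71']

# Rows matching no condition keep their current level_2 value.
df['level_2'] = np.select(conditions, choices, default=df['level_2'])
print(df)
```

Because the masks are built from column labels, this variant does not depend on the column order the way `x[1]` and `x[2]` do.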

Answer 2

Score: 1

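You were really close! In your version, val = (row['level_2'] == 'pe61') stores the result of an equality test, which is False, rather than the replacement string. Assign the value itself: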

def f(row):
    if (row['level_2'] == 'pe60') & (row['level_1'] == 'b'):
        val = 'pe61'
    elif (row['level_2'] == 'pe70') & (row['level_1'] == 'b'):
        val = 'pe71'
    else:
        val = row['level_2']
    return val
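
For completeness, here is a self-contained sketch that applies the corrected function to the sample data from the question. It uses `and` in place of `&`, since both operands are plain scalars inside a row-wise function; the behavior is otherwise the same as the function above.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'nick', 'krish', 'jack', 'bob'],
    'level_1': ['a', 'b', 'a', 'b', 'a'],
    'level_2': ['pe60', 'pe70', 'pe71', 'pe60', 'pe60'],
    'level_3': [-2, -1, 4, 6, -4],
})

def f(row):
    # Return the replacement string itself, not the result of a comparison.
    if row['level_2'] == 'pe60' and row['level_1'] == 'b':
        return 'pe61'
    if row['level_2'] == 'pe70' and row['level_1'] == 'b':
        return 'pe71'
    # Any other row keeps its current value.
    return row['level_2']

df['level_2'] = df.apply(f, axis=1)
print(df['level_2'].tolist())
# ['pe60', 'pe71', 'pe71', 'pe61', 'pe60']
```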

Answer 3

Score: 1

You can use pandas built-in functions and a dictionary if you have multiple such mappings:

mappings = {'pe60': 'pe61', 
            'pe70': 'pe71'}
df.loc[(df['level_2'].isin(mappings.keys())) & (df['level_1'] == 'b'), 'level_2'] = df['level_2'].map(mappings)

The resulting df:

    Name level_1 level_2  level_3
0    Tom       a    pe60       -2
1   nick       b    pe71       -1
2  krish       a    pe71        4
3   jack       b    pe61        6
4    bob       a    pe60       -4

Please note that this compares 'level_1' against the constant value 'b'. If the required 'level_1' value also depends on 'level_2', the solution will be slightly different.
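
One possible shape for that dependent case, offered only as a sketch: key the mapping on the pair (level_1, level_2) instead of on level_2 alone. The `rules` dictionary and its third entry are hypothetical and are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'nick', 'krish', 'jack', 'bob'],
    'level_1': ['a', 'b', 'a', 'b', 'a'],
    'level_2': ['pe60', 'pe70', 'pe71', 'pe60', 'pe60'],
    'level_3': [-2, -1, 4, 6, -4],
})

# Hypothetical rules keyed by (level_1, level_2) pairs.
rules = {
    ('b', 'pe60'): 'pe61',
    ('b', 'pe70'): 'pe71',
    ('a', 'pe60'): 'pe62',  # made-up rule showing a different level_1 value
}

# Look up each pair; pairs without a rule keep their level_2 value.
df['level_2'] = [
    rules.get((l1, l2), l2)
    for l1, l2 in zip(df['level_1'], df['level_2'])
]
print(df['level_2'].tolist())
# ['pe62', 'pe71', 'pe71', 'pe61', 'pe62']
```

A plain dictionary lookup keeps the rule table easy to extend, at the cost of a Python-level loop over the rows.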

huangapple
  • Published on June 26, 2023, 05:05:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76552403.html