How to convert dataframe elements into conditions?

Question

I have a dataframe and I want to convert one column into conditions.

For example, here is a sample dataframe:

df = pd.DataFrame({'a': ['>28', '27', '<26'], 'b': ['1', '2', '3']})

df
     a  b
0  >28  1
1   27  2
2  <26  3

I want to generate a series of if statements to get the desired b value:

if a > 28:
    b = 1
elif a == 27:
    b = 2
elif a < 26:
    b = 3

How can I do that? In my dataframe, all elements are stored as strings.

I tried using iloc to select rows, but it can't handle range conditions (>).

Is there an elegant way to do this, or do I have to type all the conditions manually?

Answer 1

Score: 2

If you slightly modify column 'a', you can use numexpr:

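import pandas as pd
from numexpr import evaluate

a = 10
# Prefix bare numbers with '==' so every row becomes a full comparison on a
expr = 'a' + df['a'].mask(df['a'].str.isdigit(), other='==' + df['a'])
# Evaluate each expression string against the current value of a
mask = expr.apply(evaluate, global_dict=globals(), local_dict=locals())

# Take b from the first matching row, or None when no condition holds
b = pd.to_numeric(df.loc[mask, 'b'], errors='coerce').head(1).squeeze() if any(mask) else None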

Output:

>>> b
3

>>> expr
0     a>28
1    a==27
2     a<26
Name: a, dtype: object

>>> mask
0    False
1    False
2     True
Name: a, dtype: bool
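
If `a` is a whole array of values rather than a single number, the same first-match lookup can be sketched with `numpy.select` instead of numexpr. This is only an illustration, not part of the answer above: the `values` array and the `OPS` table are made up for the example, and it assumes the conditions are limited to `>`, `<`, or a bare number.

```python
import operator

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['>28', '27', '<26'], 'b': ['1', '2', '3']})

# Hypothetical inputs to classify (not from the original question)
values = np.array([10, 27, 30, 26.5])

# Only the two operators that appear in the example dataframe
OPS = {'>': operator.gt, '<': operator.lt}

conditions = []
for cond in df['a']:
    if cond[0] in OPS:
        # '>28' -> values > 28.0, '<26' -> values < 26.0
        conditions.append(OPS[cond[0]](values, float(cond[1:])))
    else:
        # A bare number is treated as an equality test
        conditions.append(values == float(cond))

# np.select picks the choice of the first condition that is True per element
choices = pd.to_numeric(df['b']).tolist()
result = np.select(conditions, choices, default=np.nan)

# 10 -> 3, 27 -> 2, 30 -> 1, 26.5 -> nan (no rule matches)
print(result)
```

Because `np.select` takes the first condition that holds, the row order of the dataframe decides ties, just like an if/elif chain.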

Answer 2

Score: 1

This iterates over the rows of the DataFrame and evaluates a condition built from column a. Replace the print statement at the bottom with whatever you actually want to do; that part of the question was unclear to me.

import pandas as pd

df = pd.DataFrame({'a': ['>28', '27', '<26'], 'b': ['1', '2', '3']})
a = 10  # value to test against the conditions

# Iterate over the rows of the dataframe
for index, row in df.iterrows():
    a_value = row['a']

    # Separate the operator from the number
    if a_value[0] in ('<', '>'):
        operator = a_value[0]
        num = a_value[1:]
    else:
        # A bare number means an equality test
        operator = '=='
        num = a_value

    # Evaluate the condition (eval runs arbitrary code, so only use it on trusted data)
    condition = f'a {operator} {num}'
    if eval(condition):
        # Do something here (just printing b to show the condition is met)
        print(condition, row['b'])
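
If you would rather avoid eval altogether, a variant along the same lines can map the comparison symbols to functions from the standard `operator` module. This is a minimal sketch, not the answer's own code: `OPS`, `RULE` and `lookup` are names invented here, and it assumes every condition is an optional comparison symbol followed by a plain number. The rules are written as plain pairs so the snippet runs without pandas; with the dataframe you could pass `zip(df['a'], df['b'])` instead.

```python
import operator
import re

# Map comparison symbols to functions; a bare number is treated as equality
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "": operator.eq,
}

# Optional symbol, then an integer or decimal number
RULE = re.compile(r"\s*(>=|<=|==|>|<|)\s*(-?\d+(?:\.\d+)?)\s*$")


def lookup(rules, a):
    """Return the value of the first rule whose condition holds for a, else None."""
    for cond, value in rules:
        match = RULE.match(cond)
        if match is None:
            raise ValueError(f"unrecognised condition: {cond!r}")
        symbol, number = match.groups()
        if OPS[symbol](a, float(number)):
            return value
    return None


# Same data as the example dataframe, as (condition, b) pairs
rules = [(">28", "1"), ("27", "2"), ("<26", "3")]
print(lookup(rules, 10))  # prints 3
```

The returned value keeps whatever type column b has (strings in the example), so convert it with `int()` or `pd.to_numeric` if you need a number.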

Answer 3

Score: 1

For fun, you can try this:

In [1]: import pandas as pd # added ------v

In [2]: df = pd.DataFrame({'a': ['>28', '==27', '<26'], 'b': ['1', '2', '3']})

In [3]: print(df)
      a  b
0   >28  1
1  ==27  2
2   <26  3

In [4]: s = "\n".join([
   ...:     f"if a{a}:\n\tb = {b}" if idx == 0 else f"elif a{a}:\n\tb = {b}"
   ...:     for idx, (a, b) in enumerate(zip(df["a"], df["b"]))
   ...: ]) # s is a multi-line string

In [5]: print(s)
if a>28:
        b = 1
elif a==27:
        b = 2
elif a<26:
        b = 3

In [6]: a = 10

In [7]: exec(s)

In [8]: print(b)
3
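
As a possible refinement of the exec idea, the generated code can be run in its own namespace so that `a` and `b` do not leak into, or depend on, the interactive session. The sketch below is an assumption-laden illustration: `run_rules` is a helper name invented here, and the two lists stand in for `df["a"]` and `df["b"]` so that it runs without pandas. exec still executes arbitrary code, so this is only reasonable for data you trust.

```python
# Columns of the example dataframe as plain lists;
# with a dataframe, use zip(df["a"], df["b"]) instead.
conds = [">28", "==27", "<26"]
values = ["1", "2", "3"]

# Build the if/elif chain as source text
lines = []
for idx, (cond, value) in enumerate(zip(conds, values)):
    keyword = "if" if idx == 0 else "elif"
    lines.append(f"{keyword} a{cond}:\n    b = {value}")
source = "\n".join(lines)


def run_rules(source, a):
    """Execute the generated if/elif chain in a private namespace."""
    namespace = {"a": a}
    exec(source, {}, namespace)
    return namespace.get("b")  # None when no branch matched


print(run_rules(source, 10))  # prints 3
```

Returning `namespace.get("b")` also makes the no-match case explicit, whereas a bare `exec(s)` would leave `b` undefined or stale.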

huangapple
  • Published on May 25, 2023, 02:46:23
  • Please keep this link when reposting: https://go.coder-hub.com/76326564.html