Value error while comparing two columns

Question

I am trying to compare the values of one particular column with each of the other columns, and store the result in a new column based on the comparison: "Low" if the values differ by more than 10%, and "OK" otherwise.

df["Index"] = ""
def function(df):
    for i in range(1, len(df.columns)-2):
        if((df.columns.values[1]) == (df.columns.values[i+1])):
            if((df.iloc[:,1]) < (0.9 * df.iloc[:,i+1])):
                df["Index"] = "Low"
            else:
                df["Index"] = "OK"
function(df)

How does the following error relate to this code?

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

It would also be great if someone could suggest ways to reduce the time complexity while keeping the same code structure.

Short answer: the error occurs because the comparison operator (<) is applied to two DataFrame columns, so pandas returns a boolean Series rather than a single boolean. Python cannot reduce a Series to one True or False inside an if statement. To resolve this, call .any() or .all() on the result, depending on the logic you want; for example, use .all() to check that every element satisfies the condition.

To reduce the running time, consider vectorized operations instead of looping over the DataFrame's columns, which handle the data more efficiently. How exactly to rewrite the code depends on the comparison logic you actually need.
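The error can be reproduced in isolation. The following is a minimal sketch, assuming a small hypothetical two-column frame (the data is made up for illustration and is not from the question); it shows that the comparison yields a Series and how `.any()` and `.all()` reduce it to one boolean.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the error.
df = pd.DataFrame({"Age": [20, 21, 19], "Age1": [29, 27, 20]})

# Comparing two columns is element-wise and yields a boolean Series.
mask = df["Age"] < 0.9 * df["Age1"]
print(mask.tolist())

# Using the Series directly as an `if` condition raises the ValueError.
try:
    if mask:
        pass
except ValueError as exc:
    print("ValueError:", exc)

# Reduce the Series to a single boolean explicitly, depending on intent.
any_low = bool(mask.any())  # True if at least one row satisfies the condition
all_low = bool(mask.all())  # True only if every row satisfies the condition
print(any_low, all_low)
```

Which reduction is appropriate depends on whether one failing row or all rows should trigger the branch; if a per-row label is wanted, no reduction is needed and the mask can be used directly.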

Answer 1

Score: 0

Do you want to check Age against each of the columns after it, and label a row 'Low' when Age is less than 90% of the other column's value and 'Ok' otherwise? I am not sure this is the logic you need.

import numpy as np
import pandas as pd

df = pd.DataFrame({'char':['A', 'B', 'C', 'D'],'Age':[20, 21, 19, 18],'Age1':[29, 27, 25, 26],'Age2':[60, 48, 55, 62], 'Age3':[60, 48, 55, 62],'Age4':[60, 48, 55, 62],'Age5':[18, 19, 17, 12]})

df["Index"] = ""

def function(df):
    # Compare Age (column 1) with each later column, row by row.
    # Note: 'Index' is overwritten on every iteration, so only the last
    # compared column (Age5 here) determines the final labels.
    for i in range(1, len(df.columns)-2):
        df['Index'] = np.where(df.iloc[:, 1] < 0.9 * df.iloc[:, i + 1], 'Low', 'Ok')

function(df)

Output:

  char  Age  Age1  Age2  Age3  Age4  Age5 Index
0    A   20    29    60    60    60    18    Ok
1    B   21    27    48    48    48    19    Ok
2    C   19    25    55    55    55    17    Ok
3    D   18    26    62    62    62    12    Ok

If I change the first Age value to 1, so that it falls below 90% of the compared column, the result is:

df.loc[df.index[0], 'Age'] = 1
function(df)

Output:

  char  Age  Age1  Age2  Age3  Age4  Age5 Index
0    A    1    29    60    60    60    18   Low
1    B   21    27    48    48    48    19    Ok
2    C   19    25    55    55    55    17    Ok
3    D   18    26    62    62    62    12    Ok

Note that the first row's Index is now 'Low'. Because the loop overwrites Index on each pass, the comparison that decides this is the last one, against Age5 (1 < 0.9 * 18).
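If the intent is for every compared column to count, rather than only the last one the loop visits, the loop can be replaced by a single vectorized comparison. The sketch below assumes one possible reading of the question, namely that a row is 'Low' when Age is below 90% of at least one of the other Age columns; the names `base`, `others` and `low_any` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'char': ['A', 'B', 'C', 'D'],
                   'Age':  [20, 21, 19, 18],
                   'Age1': [29, 27, 25, 26],
                   'Age2': [60, 48, 55, 62],
                   'Age3': [60, 48, 55, 62],
                   'Age4': [60, 48, 55, 62],
                   'Age5': [18, 19, 17, 12]})

base = df['Age']
others = df[['Age1', 'Age2', 'Age3', 'Age4', 'Age5']]

# Compare 0.9 * each column with Age row by row (axis=0 aligns on the index),
# then flag a row if the condition holds for at least one column.
low_any = (0.9 * others).gt(base, axis=0).any(axis=1)

# Assumed rule: 'Low' if Age is below 90% of any other column, else 'Ok'.
df['Index'] = np.where(low_any, 'Low', 'Ok')
print(df)
```

With this data every row is labelled 'Low', because Age is below 90% of Age1 in each row. Replacing `.any(axis=1)` with `.all(axis=1)` would instead require the condition to hold for every compared column.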

Answer 2

Score: 0

I'm not quite sure about the logic behind your question, but if there are many columns you could consider using something like this:

import pandas as pd
df = pd.DataFrame({'char':['A', 'B', 'C', 'D'],
                   'Age':[20, 21, 19, 18],
                   'Age1':[29, 27, 25, 26],
                   'Age2':[60, 48, 55, 62],
                   'Age3':[60, 48, 55, 62],
                   'Age4':[60, 48, 55, 62],
                   'Age5':[18, 19, 17, 12]})

# Select every column whose name starts with "Age" (this includes "Age" itself).
cols2compare = df.columns[df.columns.str.startswith("Age")]

# Map the boolean result of the comparison to a label.
# Note: as written, Age below 90% of the row maximum maps to "Ok";
# swap the labels if the opposite meaning is intended.
diz = {True:"Ok", False:"Low"}

# For each row, compare Age with 90% of the largest value among the Age columns.
df["Index"] = df[cols2compare].apply(lambda x: x["Age"] < x.max()*.9,
                                     axis=1).map(diz)
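Since the question also asks about running time, the row-wise `apply` can be avoided. The following is a sketch of an equivalent column-wise formulation, assuming the same labels and the same comparison as the snippet above; `age_cols` and `row_max` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'char': ['A', 'B', 'C', 'D'],
                   'Age':  [20, 21, 19, 18],
                   'Age1': [29, 27, 25, 26],
                   'Age2': [60, 48, 55, 62],
                   'Age3': [60, 48, 55, 62],
                   'Age4': [60, 48, 55, 62],
                   'Age5': [18, 19, 17, 12]})

# Columns whose name starts with "Age" (includes "Age" itself, as above).
age_cols = df.columns[df.columns.str.startswith("Age")]

# Row-wise maximum computed in one vectorized call instead of apply(axis=1).
row_max = df[age_cols].max(axis=1)

# Same mapping as above: True -> "Ok", False -> "Low".
df["Index"] = np.where(df["Age"] < 0.9 * row_max, "Ok", "Low")
print(df)
```

`DataFrame.max(axis=1)` and `np.where` operate on whole columns at once, which is generally faster than calling a Python lambda once per row.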

huangapple
  • Posted on January 3, 2020 at 19:41:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59578010.html