Mapping str() over columns in a dataframe returns "TypeError: object of type 'map' has no len()"

Question

I'm writing a program that takes in an Excel file, reorders the columns and discards the scraps. Everything works fine, except that I tried to rewrite the function that fixes all of the columns so it wasn't a complete mess. The result was the following:


def fixColsT(df1):
    null_check = np.eye(len(df1.iloc[:,1]),12) #create matrix
    null_check = map(np.isfinite, df1.iloc[:,1:12]) #set matrix-values.
    null_check = map(np.invert, df1.iloc[:,1:12]) #Invert for error-checking (not_added)
    df1.iloc[:,3:4] = map(str,df1.iloc[:,2:4]) #The problem.
    df1.iloc[:,3] = map(lambda entry: re.sub(r'[^0-9]', '', entry), df1.iloc[:,3])
    df1.iloc[:,4] = map(lambda x: x[len(x)-1], df1.iloc[:,4])
    df1.iloc[:,4] = map(lambda x: re.sub(r'[^A-Ö]', '', x), df1.iloc[:,4])
    df1.iloc[:,2] = (df1.iloc[:,2] - df1.iloc[:,4]) - df1.iloc[:,3] 
    df1.iloc[:,11] = map(fixnrRooms, df1.iloc[:,11])
    df1.iloc[:,12] = map(fixKitchen, df1.iloc[:,12])
    df1.columns = COLUMN_NAMES
    errorcheck = null_check.iloc[:,2]
    for i in CHECKS:
        errorcheck = errorcheck + null_check[:,CHECKS[i]]
    errors = np.where(errorcheck)[0]
    print(df1.iloc[4,:])
    return df1, errors

When I run this, I get the following error:

Traceback (most recent call last):
  File "[...]\main.py", line 403, in <module>
    main()
  File "[...]\main.py", line 391, in main
    df, errors=fixColsT(dfInput)
  File "[...]\main.py", line 118, in fixColsT
    df1.iloc[:,3:4] = map(str,df1.iloc[:,2:4])
  File "[...]\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 818, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "[...]\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 1795, in _setitem_with_indexer
    self._setitem_with_indexer_split_path(indexer, value, name)
  File "[...]\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 1836, in _setitem_with_indexer_split_path
    elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
TypeError: object of type 'map' has no len()

I honestly have no idea how to proceed (keep in mind that I'm reasonably new to Python in general), as I need to somehow convert the entries in these columns to strings for the rest of the code to work. (The previous solution iterated over every row of the entire dataframe and cast every entry manually. That is fine for very small files, but files of around 60000 rows can be expected, so it is not reasonable.)
How would one get around this without iterating over every entry or over every column individually (unless that is absolutely necessary)?

Answer 1

Score: 1

Looking at that line:

df1.iloc[:,3:4] = map(str,df1.iloc[:,2:4])

The left-hand side is a (column) slice of the dataframe, and the right-hand side is a map that applies a function to a dataframe.
Two comments:

  • The built-in map does not work well on a dataframe: it is ambiguous whether you want to apply the function along rows, along columns, or on each entry (I think the last is what you want here). Use DataFrame.apply or DataFrame.applymap instead (applymap was renamed to DataFrame.map in pandas 2.1), and Series.map for a series.
  • The return value of map is a lazy iterator, but the = on the left-hand slice needs to know the length (even the shape) of what is being copied. That cannot be read from a map object without exhausting it and storing all of its values.
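
Both points can be seen on a tiny frame. This is only a sketch: the frame and its column names `a` and `b` are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical toy frame; the column names are illustrative only.
df = pd.DataFrame({"a": [1, 2], "b": [3.5, 4.5]})

# Iterating over a DataFrame yields its column labels, not its cells,
# so map(str, df) only stringifies the labels.
labels = list(map(str, df))

# A map object is a lazy iterator and has no length.
lazy = map(str, df["a"])
try:
    len(lazy)
    has_len = True
except TypeError:
    has_len = False

# Series.map applies the function to every entry and returns a Series.
as_text = df["a"].map(str)

# astype(str) converts every entry of several columns in one call.
both_text = df.astype(str)
```

For a plain cast to string, `astype(str)` is probably the simplest choice, since it needs no per-entry function at all.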

All in all, try using:

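    # Cast to str first: ''.join raises a TypeError on non-string values.
    df1.iloc[:,3] = df1.iloc[:,2:4].astype(str).agg(''.join, axis=1)
    # regex=True is needed because str.replace defaults to literal matching since pandas 2.0.
    df1.iloc[:,3] = df1.iloc[:,3].str.replace(r'[^0-9]', '', regex=True)
    df1.iloc[:,4] = df1.iloc[:,4].str.slice(-1)
    df1.iloc[:,4] = df1.iloc[:,4].str.replace(r'[^A-Ö]', '', regex=True)
    # Kept from the question; note that '-' is not defined between strings.
    df1.iloc[:,2] = (df1.iloc[:,2] - df1.iloc[:,4]) - df1.iloc[:,3]
    df1.iloc[:,11] = df1.iloc[:,11].map(fixnrRooms)
    df1.iloc[:,12] = df1.iloc[:,12].map(fixKitchen)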
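
To try the string steps in isolation, here is a minimal self-contained sketch. The data is an assumption (a house number with an optional letter suffix, such as "12B"), and the column names `street`, `number`, `digits` and `letter` are hypothetical; the real layout of the Excel file is not shown in the question.

```python
import pandas as pd

# Hypothetical data: a house number with an optional letter suffix.
df = pd.DataFrame({"street": ["Main St", "Oak Rd"], "number": ["12B", "7"]})

# Cast once, then use the vectorised .str accessor instead of a Python loop.
number = df["number"].astype(str)

# Keep only the digits. regex=True is passed explicitly because the
# default changed to regex=False in pandas 2.0.
df["digits"] = number.str.replace(r"[^0-9]", "", regex=True)

# Take the last character and drop it unless it falls in the range A-Ö.
df["letter"] = number.str.slice(-1).str.replace(r"[^A-Ö]", "", regex=True)
```

Each call works on the whole column at once, so no explicit iteration over rows is written, which is what the question asks for with files of around 60000 rows.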

huangapple
  • Posted on June 29, 2023, 16:25:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76579289.html