Mapping str() over columns in a dataframe returns "TypeError: object of type 'map' has no len()"
Question
I'm writing a program that takes in an Excel file, reorders the columns, and discards the scraps. Everything worked fine until I tried to rewrite the function that fixes all of the columns so that it wasn't a complete mess. The result was the following:
def fixColsT(df1):
    null_check = np.eye(len(df1.iloc[:,1]), 12)  # create matrix
    null_check = map(np.isfinite, df1.iloc[:,1:12])  # set matrix values
    null_check = map(np.invert, df1.iloc[:,1:12])  # invert for error checking (not added)
    df1.iloc[:,3:4] = map(str, df1.iloc[:,2:4])  # the problem
    df1.iloc[:,3] = map(lambda entry: re.sub(r'[^0-9]', '', entry), df1.iloc[:,3])
    df1.iloc[:,4] = map(lambda x: x[len(x)-1], df1.iloc[:,4])
    df1.iloc[:,4] = map(lambda x: re.sub(r'[^A-Ö]', '', x), df1.iloc[:,4])
    df1.iloc[:,2] = (df1.iloc[:,2] - df1.iloc[:,4]) - df1.iloc[:,3]
    df1.iloc[:,11] = map(fixnrRooms, df1.iloc[:,11])
    df1.iloc[:,12] = map(fixKitchen, df1.iloc[:,12])
    df1.columns = COLUMN_NAMES
    errorcheck = null_check.iloc[:,2]
    for i in CHECKS:
        errorcheck = errorcheck + null_check[:,CHECKS[i]]
    errors = np.where(errorcheck)[0]
    print(df1.iloc[4,:])
    return df1, errors
When I run this, I get the following error:
Traceback (most recent call last):
  File "[...]\main.py", line 403, in <module>
    main()
  File "[...]\main.py", line 391, in main
    df, errors=fixColsT(dfInput)
  File "[...]\main.py", line 118, in fixColsT
    df1.iloc[:,3:4] = map(str,df1.iloc[:,2:4])
  File "[...]\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 818, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "[...]\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 1795, in _setitem_with_indexer
    self._setitem_with_indexer_split_path(indexer, value, name)
  File "[...]\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 1836, in _setitem_with_indexer_split_path
    elif len(ilocs) == 1 and lplane_indexer == len(value) and not is_scalar(pi):
TypeError: object of type 'map' has no len()
I honestly have no idea how to proceed (keep in mind that I'm reasonably new to Python in general), as I need to somehow convert the entries in these columns to strings for the rest of the code to work. The previous solution iterated over every row of the dataframe and cast every entry manually. That is fine for very small files, but since files of around 60000 rows can be expected, it is not reasonable.
How would one get around this without iterating over every entry or over every column individually (unless that is absolutely necessary)?
Answer 1
Score: 1
Looking at that line:
df1.iloc[:,3:4] = map(str,df1.iloc[:,2:4])
The left-hand side is a (column) slice of the dataframe, and the right-hand side is a map that applies a function to a dataframe.
Two comments:
- map does not work well on a dataframe: it is ambiguous whether you want to apply the function along rows, along columns, or to each entry (I think the last is what you want here). Use DataFrame.apply or DataFrame.applymap instead; map is for series.
- The return value of map is a lazy iterator, but the = assignment to the slice on the left-hand side needs to know the length (even the shape) of what is being copied. That is not readily available from a map: you would need to exhaust it and store all its values.
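Both points can be checked on a small frame. The snippet below is only a sketch: the frame and its column names are invented for illustration, and DataFrame.astype(str) is shown as one possible element-wise conversion, not necessarily the one the rest of this answer relies on.

```python
import pandas as pd

# Hypothetical toy frame; the column names are made up for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3.5, 4.5]})

# The built-in map is lazy, and iterating a DataFrame yields its column labels.
lazy = map(str, df)
has_len = hasattr(lazy, "__len__")  # a map object cannot report its length
labels = list(lazy)                 # the column labels, not the cell values

# Element-wise conversion that keeps the frame's shape and index.
as_text = df.astype(str)
```

Because as_text has the same shape as df, it can be assigned back to a slice of matching shape without pandas having to guess a length.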
All in all, try using:
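# Cast to str first so that ''.join also works on non-string columns
df1.iloc[:,3] = df1.iloc[:,2:4].astype(str).agg(''.join, axis=1)
# regex=True is required for pattern matching in pandas 2.0 and later
df1.iloc[:,3] = df1.iloc[:,3].str.replace(r'[^0-9]', '', regex=True)
df1.iloc[:,4] = df1.iloc[:,4].str.slice(-1)
df1.iloc[:,4] = df1.iloc[:,4].str.replace(r'[^A-Ö]', '', regex=True)
# Kept as in the question; '-' is not defined for string columns, so this line may need rethinking
df1.iloc[:,2] = (df1.iloc[:,2] - df1.iloc[:,4]) - df1.iloc[:,3]
df1.iloc[:,11] = df1.iloc[:,11].map(fixnrRooms)
df1.iloc[:,12] = df1.iloc[:,12].map(fixKitchen)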
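To see what the vectorized string methods do, here is a minimal sketch on invented data. The frame, its column names, and the sample values are assumptions made for illustration; they are not taken from the question's Excel file.

```python
import pandas as pd

# Hypothetical sample data: a street name and a house number with an optional letter.
df = pd.DataFrame({"street": ["Main", "Oak"], "number": ["12B", "7"]})

# Keep only the digits of each entry.
digits = df["number"].str.replace(r"[^0-9]", "", regex=True)

# Take the last character of each entry, then drop it if it falls outside
# the A-Ö code point range (digits do, so they are removed).
last_char = df["number"].str.slice(-1)
letter = last_char.str.replace(r"[^A-Ö]", "", regex=True)

# Concatenate two columns row by row after casting every cell to str.
joined = df[["street", "number"]].astype(str).agg("".join, axis=1)
```

Each call works on the whole column at once, so no explicit loop over rows is needed, which is what matters for files of around 60000 rows.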