AttributeError: 'NoneType' object has no attribute 'lower' in a machine learning application in Python
Question
This is the code that raises an error when I run it:

    from sklearn.feature_extraction.text import CountVectorizer

    def my_tokenizer(text):
        # Return no tokens for missing values
        if text is None:
            return []
        else:
            return text.split()

    vectorizer = CountVectorizer(tokenizer=my_tokenizer)
    tag_dtm = vectorizer.fit_transform(tag_data['Tags'])

This is the error output:

    ---> 69 doc = doc.lower()
         70 if accent_function is not None:
         71 doc = accent_function(doc)
    AttributeError: 'NoneType' object has no attribute 'lower'

I took my code from a sample solution, which used:

    vectorizer = CountVectorizer(tokenizer = lambda x: x.split())
    tag_dtm = vectorizer.fit_transform(tag_data['Tags'])

I converted it to the form above, but I still get the same error and I don't know how to fix it.
Answer 1
Score: 1
By default, CountVectorizer tries to convert every input document to lowercase. Since your input contains None, lower() cannot be applied. This lowercasing step runs before your tokenizer is called, so the None check inside my_tokenizer is never reached.
To fix this particular problem, you can pass lowercase=False when initializing the CountVectorizer. However, a safer approach is to remove all occurrences of None from your input before passing it to the vectorizer.
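As a rough illustration of the "remove None first" approach, here is a minimal sketch. The helper names drop_missing_docs and build_tag_matrix are made up for this example and are not part of scikit-learn; the sketch also assumes that simply discarding missing entries is acceptable for your data.

```python
def drop_missing_docs(docs):
    """Keep only real strings; None and NaN entries are discarded."""
    return [doc for doc in docs if isinstance(doc, str)]


def build_tag_matrix(tags):
    """Fit a CountVectorizer on the cleaned tags (requires scikit-learn)."""
    # Imported here so the cleaning helper above works without scikit-learn.
    from sklearn.feature_extraction.text import CountVectorizer

    cleaned = drop_missing_docs(tags)
    # str.split reproduces the whitespace tokenizer from the question.
    vectorizer = CountVectorizer(tokenizer=str.split)
    return vectorizer, vectorizer.fit_transform(cleaned)
```

With the data from the question, this would presumably be called as build_tag_matrix(tag_data['Tags']). Note that dropping entries changes the number of rows; if the matrix has to stay aligned with the rest of the DataFrame, replacing missing values with an empty string (for example with tag_data['Tags'].fillna('')) is likely the better choice.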