AttributeError: 'NoneType' object has no attribute 'lower' in a machine learning application in Python

Question

This is the code that raises an error when I run it:

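from sklearn.feature_extraction.text import CountVectorizer

def my_tokenizer(text):
    # Return no tokens for missing values instead of calling split() on None
    if text is None:
        return []
    else:
        return text.split()

vectorizer = CountVectorizer(tokenizer=my_tokenizer)
tag_dtm = vectorizer.fit_transform(tag_data['Tags'])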

This is my error output:

---> 69     doc = doc.lower()
     70 if accent_function is not None:
     71     doc = accent_function(doc)

AttributeError: 'NoneType' object has no attribute 'lower'

I started from this code, which I found in the sample solutions:

vectorizer = CountVectorizer(tokenizer = lambda x: x.split())
tag_dtm = vectorizer.fit_transform(tag_data['Tags'])

I converted it to the version above, but I still get the same error and I don't know how to fix it.

Answer 1

Score: 1

By default, CountVectorizer lowercases every document before tokenizing it. Because your input contains None, the call to lower() fails before my_tokenizer ever runs, so the None check inside the tokenizer is never reached.
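The order of operations behind this error can be modeled in a few lines. This is a simplified sketch, not scikit-learn's actual code: `analyze` and its `lower` parameter are hypothetical names used only for illustration, and `my_tokenizer` is the tokenizer from the question.

```python
def my_tokenizer(text):
    # Same None-safe tokenizer as in the question.
    if text is None:
        return []
    return text.split()


def analyze(doc, tokenizer, lower=True):
    """Hypothetical, simplified stand-in for the per-document steps."""
    if lower:
        # Preprocessing runs first, so a None document fails here,
        # before the tokenizer gets a chance to handle it.
        doc = doc.lower()
    return tokenizer(doc)


try:
    analyze(None, my_tokenizer, lower=True)
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'lower'

# With lowercasing disabled, None reaches the tokenizer and is handled.
print(analyze(None, my_tokenizer, lower=False))  # []
print(analyze("Python Pandas", my_tokenizer))  # ['python', 'pandas']
```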

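To fix this particular error, you can pass lowercase=False when initializing CountVectorizer. Note that None then reaches the tokenizer, so this only helps with a tokenizer that handles None, such as my_tokenizer; the original lambda x: x.split() would still fail. A safer approach is to remove every None from your input before passing it to the vectorizer.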
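A minimal sketch of the safer approach, assuming the tags are an iterable of whitespace-separated strings that may contain None. The helper names `drop_missing` and `build_tag_matrix` are hypothetical, and the sample tags are made up to stand in for tag_data['Tags'].

```python
def drop_missing(docs):
    """Keep only real strings, discarding None and other non-text entries."""
    return [doc for doc in docs if isinstance(doc, str)]


def build_tag_matrix(docs):
    """Vectorize whitespace-separated tags after removing missing entries."""
    # Imported lazily so drop_missing can be used without scikit-learn installed.
    from sklearn.feature_extraction.text import CountVectorizer

    vectorizer = CountVectorizer(tokenizer=str.split)
    matrix = vectorizer.fit_transform(drop_missing(docs))
    return vectorizer, matrix


# Made-up sample data standing in for tag_data['Tags'].
sample_tags = ["python pandas", None, "python scikit-learn"]
print(drop_missing(sample_tags))  # ['python pandas', 'python scikit-learn']
```

If tag_data['Tags'] is a pandas Series, tag_data['Tags'].dropna() should have the same effect. Dropping rows changes which row of the matrix corresponds to which row of the original data, so if that alignment matters, replacing missing values with an empty string (for example with fillna('')) may be the better choice.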

Posted by huangapple on May 7, 2023, 22:00:10
Link to this article: https://go.coder-hub.com/76194378.html