Why is a text column getting a float dtype error?

Question

I have a CSV with titles and body text for analysis. I am creating a third column that combines each title with its body text so I can continue with cleaning, but I am getting the error below. Neither of these columns is numeric, so I am at a loss as to why this error is happening and how to fix it.

df['text'] = df.apply(lambda row: row['title'] + ' ' + row['body'], axis = 1)



<ipython-input-20-326b7df1d9c7> in <lambda>(row)
----> 1 df['text'] = df.apply(lambda row: row['title'] + ' ' + row['body'], axis = 1)
      2 print(df)

TypeError: can only concatenate str (not "float") to str

I thought it might be due to an encoding error, so I tried reading the file into Colab with this code:

df = pd.read_csv(io.BytesIO(uploaded['ASD.csv']), encoding = 'utf8')

That didn't work at all, but I only recently switched to using Colab, so maybe I did it wrong?

Answer 1

Score: 1

Your code works for me. I had to guess how you created your dataframe:

import pandas as pd

# Sample data
data = {
    "title": ["Article 1", "Article 2", "Article 3"],
    "body": ["This is the content of Article 1.", "This is the content of Article 2.", "This is the content of Article 3."]
}

# Create the dataframe
df = pd.DataFrame(data)

# Print the dataframe
print(df)

df['text'] = df.apply(lambda row: row['title'] + ' ' + row['body'], axis = 1)

print(df)

It prints the dataframe before and after adding the new 'text' column.
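
One possible reason the sample above works while the CSV fails (an assumption, since the question does not show the file's contents): pandas reads empty cells as NaN, which is a float, so a row with a missing title or body would raise exactly this TypeError. A minimal sketch of how to check for that and work around it, using hypothetical data in place of ASD.csv:

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the CSV: the second row has a missing
# body, which is how pandas represents an empty cell (NaN, a float).
df = pd.DataFrame({
    "title": ["Article 1", "Article 2"],
    "body": ["This is the content of Article 1.", np.nan],
})

# Count missing values per column; a nonzero count would explain the error.
missing_counts = df[["title", "body"]].isna().sum()
print(missing_counts)

# Replace missing values with empty strings, then concatenate column-wise.
df["text"] = df["title"].fillna("") + " " + df["body"].fillna("")
print(df["text"].tolist())
```

If the counts are all zero for the real file, the cause is something else, for example a cell that pandas parsed as a number. Printing df.dtypes would be a reasonable next check.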

huangapple
  • Posted on June 22, 2023 at 01:57:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76525979.html