Changing DataFrame columns selected by a string in Python

Question

I'm trying to convert the categorical columns of my data frame to numeric codes using this code:

CATEGORICAL_COLUMNS = ['sex','n_siblings_spouses', 'parch', 'class',
               'embark_town', 'alone']
for i in CATEGORICAL_COLUMNS:
  dfTrain[i] = pd.factorize(dfTrain.i)[0]
dfTrain.head()

But I get the error:

'DataFrame' object has no attribute 'i'

How would I fix this?

Answer 1

Score: 1

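i is a loop variable holding the column name, not an attribute of the DataFrame, so dfTrain.i looks for a column literally named 'i'. Use bracket indexing instead of dot notation: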

CATEGORICAL_COLUMNS = ['sex','n_siblings_spouses', 'parch', 'class',
                       'embark_town', 'alone']

for i in CATEGORICAL_COLUMNS:
    dfTrain[i] = pd.factorize(dfTrain[i])[0]  # .i -> [i]
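
To make the difference concrete, here is a minimal sketch on a small made-up frame (df and its values are hypothetical stand-ins for dfTrain, which the question does not show). It reproduces the error with dot notation and then shows what pd.factorize returns with bracket indexing.

```python
import pandas as pd

# Hypothetical stand-in for dfTrain; the values are invented for illustration.
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "class": ["Third", "First", "Third", "Second"],
})

col = "sex"

# Dot notation looks up an attribute literally named "col";
# it never reads the string stored in the variable.
try:
    df.col
except AttributeError as exc:
    dot_error = str(exc)

# Bracket indexing uses the variable's value, so this selects df["sex"].
codes, uniques = pd.factorize(df[col])

print(dot_error)
print(list(codes), list(uniques))
```

Bracket indexing is also the only option for the 'class' column: class is a reserved word in Python, so dfTrain.class would be a syntax error even outside a loop.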

Update

If you use sklearn, you can use OrdinalEncoder:

from sklearn.preprocessing import OrdinalEncoder

oe = OrdinalEncoder()
dfTrain[CATEGORICAL_COLUMNS] = oe.fit_transform(dfTrain[CATEGORICAL_COLUMNS])

Use oe.inverse_transform to decode the numeric values back to the original categories.
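
As a rough sketch of the encode/decode round trip, assuming scikit-learn is installed and using a small made-up frame (df, cols and the values are hypothetical, not the asker's data): the decoding method on OrdinalEncoder is named inverse_transform.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical stand-in for dfTrain; the values are invented for illustration.
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "embark_town": ["Southampton", "Cherbourg", "Queenstown", "Southampton"],
})
cols = ["sex", "embark_town"]

oe = OrdinalEncoder()

# fit_transform returns a 2-D float array with one column per input column.
codes = oe.fit_transform(df[cols])
encoded = pd.DataFrame(codes, columns=cols, index=df.index)

# inverse_transform maps the numeric codes back to the original labels.
decoded = oe.inverse_transform(codes)

print(encoded)
print(decoded)
```

Note that the two approaches number categories differently: pd.factorize assigns integer codes in order of first appearance, while OrdinalEncoder sorts the categories of each column and returns floats. Keeping the fitted encoder around is what makes decoding possible later.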

huangapple
  • Posted on March 12, 2023, 07:34:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75710221.html