How can I import numeric lists of lists from Excel as actual numeric values, not strings, in Python?

Question

I'm trying to work with data organized as lists of lists within each vertical cell of a column. The data type is numeric for all cells in this column, and each sublist has an int and a float.

I am importing the data using read_excel, and would like to be able to access elements of the sublists with indexing, e.g.

dataframe["ColumnName"][0][0][0] 

to get the first item in the first list.

However, the contents of the cells are being read as strings (I also checked this with type()). So for example when I try the indexing above I would get "[", the first character of the string, rather than '2'.

I tried:

  • using pd.to_numeric on the column, I got an "unable to parse string" error
  • when I iterate through the string and convert the characters with int(), I get: invalid literal for int() with base 10
  • using dtype = {"colname": float}, it cannot convert the column (it also cannot convert to int)
  • converting to a numpy array and then using arr.astype(float), I get "could not convert string to float".

When I copy and paste the data into the Python file as a bunch of lists, it's fine and I can access the elements with indexing, but obviously I don't want to be doing that. I would like to better understand why it's converting the lists to strings. Any advice is much appreciated.

Answer 1

Score: 1

Your Excel cells contain strings along the lines of "[[2, 2.3], [2, 3.4]]", so they cannot be read directly as Python lists and sublists from which the numeric values can be extracted. You first need to turn each string into a true list, for example:

from ast import literal_eval
ml = literal_eval(df['ColName'][0])

and then, to get the first value of the first sublist:

print(ml[0][0])

No type conversion is necessary as the value is an int.
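
The same idea can be applied to every cell rather than only the first one. The sketch below is a minimal illustration, not the answerer's own code: parse_cell is a hypothetical helper name, and the sample strings stand in for the values read_excel would return, assuming each cell holds text shaped like the example above.

```python
from ast import literal_eval


def parse_cell(cell):
    """Turn the text of one cell into nested Python lists (hypothetical helper)."""
    if not isinstance(cell, str):
        # Anything that is not text (for example a missing value) is passed through unchanged.
        return cell
    return literal_eval(cell)


# Assumed sample data standing in for the column values returned by read_excel.
cells = ["[[2, 2.3], [2, 3.4]]", "[[5, 0.1], [7, 8.9]]"]
parsed = [parse_cell(c) for c in cells]

first = parsed[0][0][0]
print(first, type(first))  # 2 <class 'int'>
```

With a DataFrame, df["ColumnName"] = df["ColumnName"].apply(parse_cell) should give the same result for the whole column, after which df["ColumnName"][0][0][0] indexes into real lists. read_excel also accepts a converters argument, so passing converters={"ColumnName": parse_cell} is one way to do the parsing at load time.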


huangapple
  • Posted on May 30, 2023, 02:22:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76359574.html