How can I import numeric lists of lists from Excel as actual numeric values, not strings in Python?
Question
I'm trying to work with data organized as a list of lists within each cell of a column. The data type is numeric for all cells in this column, and each sublist contains an int and a float.
I am importing the data using read_excel, and would like to be able to access elements of the sublists with indexing, e.g.
dataframe["ColumnName"][0][0][0]
to get the first item in the first list.
However, the contents of the cells are being read as strings (I also checked this with type()). So, for example, when I try the indexing above I get "[", the first character of the string, rather than '2'.
I tried:
- Using pd.to_numeric on the column: I got an "unable to parse string" error.
- Iterating through the string and converting the characters with int(): I get "invalid literal for int() with base 10".
- Using dtype = {"colname": float}: the conversion fails (converting to int fails too).
- Converting to a NumPy array and then using arr.astype(float): I get "could not convert string to float".
When I copy and paste the data into the Python file as a bunch of lists, it works and I can access the elements with indexing, but obviously I don't want to be doing that. I would like to better understand why the lists are being converted to strings. Any advice is much appreciated.
Answer 1
Score: 1
Your Excel cells contain strings along the lines of "[[2, 2.3], [2, 3.4]]", so they cannot be read directly as Python lists and sublists from which the numeric values can be extracted. You first need to turn the string into a true list, for example:
from ast import literal_eval
ml = literal_eval(df['ColName'][0])
and then, to get the first value of the first sublist:
print(ml[0][0])
No type conversion is necessary, as the value is already an int.
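To convert every cell rather than just the first one, the same idea can be wrapped in a small helper. The following is only a sketch: parse_cell is a hypothetical name, and the strings in raw_cells are made-up stand-ins for what read_excel returns for the column, so the snippet runs without an Excel file.

```python
from ast import literal_eval


def parse_cell(cell):
    """Turn a cell such as "[[2, 2.3], [2, 3.4]]" into a real nested list."""
    if isinstance(cell, str):
        # literal_eval only accepts Python literals, so it is safer than eval().
        return literal_eval(cell.strip())
    # Anything that is not a string (e.g. an empty cell) is passed through unchanged.
    return cell


# Made-up stand-ins for the string values of the column (assumption, not real data).
raw_cells = ["[[2, 2.3], [2, 3.4]]", "[[5, 0.1], [7, 8.25]]"]
parsed = [parse_cell(c) for c in raw_cells]

print(parsed[0][0][0])        # 2
print(type(parsed[0][0][0]))  # <class 'int'>
print(parsed[1][1][1])        # 8.25
```

With a real DataFrame, the helper could presumably be applied to the whole column with df['ColName'] = df['ColName'].apply(parse_cell), or passed to read_excel through its converters argument, e.g. converters={'ColName': parse_cell}. After that, indexing such as df['ColName'][0][0][0] should return the number instead of "[".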