May 30, 2023, 02:22:09

How can I import numeric lists of lists from Excel as actual numeric values, not strings in Python?

Question

I'm trying to work with data organized as a list of lists within each vertical cell of a column, like this. The data type is numeric for all cells in this column, and each sublist has an int and a float.

I am importing the data using read_excel, and would like to be able to access elements of the sublists with indexing, e.g.

dataframe["ColumnName"][0][0][0]

to get the first item in the first list.

However, the contents of the cells are being read as strings (I also checked this with type()). So, for example, when I try the indexing above I get "[", the first character of the string, rather than 2.

I tried:

Using pd.to_numeric on the column, I got an "unable to parse string" error.
When I try to iterate through the string and convert the characters with int(), I get: invalid literal for int() with base 10.
Using dtype={"colname": float}: it can't convert (it also cannot convert to int).
Converting to a NumPy array and then using arr.astype(float), I get "could not convert string to float".

When I copy and paste the data into the Python file as a bunch of lists, it's fine and I can access the elements with indexing, but obviously I don't want to be doing that. I would like to better understand why it's converting the lists to strings. Any advice is much appreciated.

Answer 1

Score: 1

Your Excel cells contain strings along the lines of "[[2, 2.3], [2, 3.4]]", so they cannot be read directly as Python lists and sublists from which the numeric values can be extracted. You first need to convert each string into a real list, for example:

from ast import literal_eval
ml = literal_eval(df['ColName'][0])

Then, to get the first value of the first sublist:

print(ml[0][0])

No type conversion is necessary, as the value is already an int.
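To handle every row rather than just the first, the same idea can be wrapped in a small helper. The sketch below is only an illustration: parse_cell is a hypothetical name, the cells list is made-up sample data standing in for the values of df["ColName"], and passing non-string cells through unchanged is an assumption about how empty cells should be treated.

```python
from ast import literal_eval


def parse_cell(cell):
    """Turn a cell such as "[[2, 2.3], [2, 3.4]]" into nested Python lists.

    Hypothetical helper: non-string cells (for example an empty cell that
    pandas reads as NaN) are returned unchanged instead of being parsed.
    """
    if isinstance(cell, str):
        return literal_eval(cell)
    return cell


# Made-up sample data standing in for the strings in df["ColName"].
cells = ["[[2, 2.3], [2, 3.4]]", "[[5, 0.1], [7, 8.25]]"]

parsed = [parse_cell(cell) for cell in cells]

print(cells[0][0])      # "[": indexing the raw string yields a character
print(parsed[0][0][0])  # 2: row 0, first sublist, first item (an int)
print(parsed[1][1][1])  # 8.25: row 1, second sublist, second item (a float)
```

With pandas, the helper could be applied to the whole column with df["ColName"] = df["ColName"].apply(parse_cell), or passed to read_excel through its converters argument, for example converters={"ColName": parse_cell}. After either step, dataframe["ColumnName"][0][0][0] should return the number rather than "[".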
