June 12, 2023 05:13:35

In Python, how do I read and write the actual word "None" (not the Keyword) between a .csv file and DataFrame?

Question

I have a .csv that has the actual word "None" as a value in a field. When I read it into a DataFrame, the df reads "None" as the keyword and inserts <NA>. Later, when I rewrite the df to the .csv, all of the places where "None" was are replaced with blanks (,,) in the .csv.

testdata.csv:

Membership number,Last name,Date of birth,Status,Color
240200,,Wilson,None,Red

import pandas as pd
filename = "testdata.csv"
data_file = "testoutput.csv"
with open(filename, 'r', newline='', ):
    # Read data into a DataFrame
    user_df = pd.read_csv(filename)

user_df.to_csv(data_file, index=False)

testoutput.csv:

Membership number,Last name,Date of birth,Status,Color
240200,,Wilson,,Red

I want testoutput.csv to be the same as the testdata.csv.

Answer 1

Score: 0

I've removed the extraneous file open, and as I expected, I cannot duplicate the issue. Here is a session demonstrating that the output equals the input.

Source:

import pandas as pd
user_df = pd.read_csv('x.csv')
print(user_df)
user_df.to_csv('x1.csv', index=False)

Output:

timr@Tims-NUC:~/src$ cat x.csv
Membership number,Last name,Date of birth,Status,Color
240200,,Wilson,None,Red
timr@Tims-NUC:~/src$ python x.py
   Membership number  Last name Date of birth Status Color
0             240200        NaN        Wilson   None   Red
timr@Tims-NUC:~/src$ cat x1.csv
Membership number,Last name,Date of birth,Status,Color
240200,,Wilson,None,Red

Followup

This appears to be version-related. read_csv has a built-in default list of strings that it interprets as NaN, which the na_values parameter can extend, and (at least in 2.0) "None" is on that default list.

So, the two solutions are: set keep_default_na=False and pass your own shorter list to na_values, or set keep_default_na=False on its own to stop all NaN interpretation.
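
A minimal sketch of both options, using the sample row from the question. The CSV is read from an in-memory string rather than from testdata.csv so the snippet runs on its own. CSV_TEXT, df_all, and df_short are names made up for this illustration, and the choice of [""] as the shorter list is an assumption about which values should still count as missing.

```python
import io

import pandas as pd

# Sample data from the question, held in memory instead of testdata.csv.
CSV_TEXT = (
    "Membership number,Last name,Date of birth,Status,Color\n"
    "240200,,Wilson,None,Red\n"
)

# Option 1: turn off the default NA strings entirely.
# Empty fields come back as empty strings and "None" stays a string.
df_all = pd.read_csv(io.StringIO(CSV_TEXT), keep_default_na=False)

# Option 2: turn off the defaults, then supply a shorter list.
# Here only an empty field is treated as NaN; "None" stays a string.
df_short = pd.read_csv(
    io.StringIO(CSV_TEXT), keep_default_na=False, na_values=[""]
)

# to_csv() with no path returns the CSV text instead of writing a file.
out_all = df_all.to_csv(index=False)
out_short = df_short.to_csv(index=False)

print(out_all)
print(out_short)
```

With either option the Status field should be written back as None rather than as an empty field. Option 2 is the one to prefer if the rest of the code relies on isna() or dropna() for empty cells.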
