How to export cleaned data from a Jupyter notebook, not the original data

Question

I have just started learning to use Jupyter Notebook. I have a data file called 'Diseases'.

Opening the data file:

import pandas as pd
df = pd.read_csv('Diseases.csv')

Choosing data from a column named 'DIABETES', i.e., selecting the subject IDs that have diabetes (yes is 1, no is 0):

df[df.DIABETES > 1]

Now I want to export this cleaned data (which has fewer rows):

df.to_csv('diabetes-filtered.csv')

This exports the original data, not the filtered df with fewer rows.
I saw in another question that the inplace argument needs to be used, but I don't know how.

Answer 1

Score: 4

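You forgot to assign the filtered DataFrame back to a variable, here df1:

import pandas as pd
df = pd.read_csv('Diseases.csv')
# Assuming DIABETES is coded 1 = yes and 0 = no, as the question states,
# the question's `> 1` would match no rows; `== 1` keeps the subjects with diabetes.
df1 = df[df.DIABETES == 1]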
df1.to_csv('diabetes-filtered.csv')
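
To see why the assignment matters, here is a minimal sketch that uses a small made-up DataFrame in place of Diseases.csv (the ID column and all values are assumptions for illustration, and the `== 1` condition assumes DIABETES is coded 0/1 as the question describes). Boolean indexing returns a new DataFrame and leaves df untouched, so only the assigned result has fewer rows.

```python
import pandas as pd

# Hypothetical stand-in for Diseases.csv; the column values are invented
df = pd.DataFrame({"ID": [101, 102, 103, 104], "DIABETES": [1, 0, 1, 0]})

# Without assignment, the filtered result is only displayed in the notebook
# and then discarded; df itself is not modified
df[df.DIABETES == 1]

# With assignment, the filtered rows are kept in a new variable
df1 = df[df.DIABETES == 1]

print(len(df))   # row count of the original, unchanged
print(len(df1))  # row count of the filtered copy
```

Calling df1.to_csv(...) on the assigned result then writes only the filtered rows.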

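Or you can chain the filtering and the export to a file:

import pandas as pd
df = pd.read_csv('Diseases.csv')
# Filter and write in one step; `== 1` assumes DIABETES is coded 1 = yes, 0 = no
# (the question's `> 1` would match no rows with that coding)
df[df.DIABETES == 1].to_csv('diabetes-filtered.csv')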
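
On the inplace question: boolean indexing itself takes no inplace argument. As a hedged sketch (the data is made up, the `== 1` condition assumes 0/1 coding, and the CSV is written to an in-memory buffer rather than a file), two ways to end up with a smaller DataFrame are reassigning the filtered result or calling DataFrame.drop with inplace=True. Passing index=False to to_csv is optional and simply leaves the row index out of the output.

```python
import io

import pandas as pd

# Hypothetical stand-in for Diseases.csv; the column values are invented
df = pd.DataFrame({"ID": [101, 102, 103, 104], "DIABETES": [1, 0, 1, 0]})

# Option 1: assign the filtered result to a name
# (use `df = df[df.DIABETES == 1]` to overwrite the original name instead)
df_reassigned = df[df.DIABETES == 1]

# Option 2: drop the unwanted rows in place; done on a copy here
# so that both options can be compared side by side
df_inplace = df.copy()
df_inplace.drop(df_inplace[df_inplace.DIABETES != 1].index, inplace=True)

# Export without the row index; a StringIO buffer stands in for a file path
buffer = io.StringIO()
df_reassigned.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Both options leave the same rows; reassignment is usually the simpler of the two to read.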

huangapple
  • Posted on January 6, 2020 at 15:11:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59608026.html