Creating a new dataframe where a field is blank in the original dataframe
Question
Using Python3 and Pandas. I am admittedly pretty new and I'm having a hard time searching for an answer to this question.
I have a dataframe that contains lots of information and I'm trying to get a dataframe that is just the items where one specific field in the original is blank.
I have queried my database to get a dataframe I am calling full_df which is all information on all items in the database. I want to now create a new dataframe that selects just the items where one field in full_df is blank.
This is what I've tried:
no_rate = full_df[(full_df['rate'] == "")]
Which is returning nothing even though I know for a fact that there are loads of items where 'rate' is blank. I expected the dataframe no_rate to be populated with all the items where 'rate' is blank.
How do I select those items for this new dataframe?
Answer 1
Score: 0
There are a few things you need to do. First of all, is the data type of your rate column a string (object)? df.dtypes will tell you. If it is not, then testing it against "" will not match anything.
Second, and more to the point, one way to do a conditional select is by using loc.
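As a quick illustration of the dtype check, here is a minimal sketch. The small frame below is made up for the example and only stands in for the asker's full_df; it assumes the blank rates came back from the database as NULLs, which pandas typically loads as missing values (None/NaN) rather than as empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for full_df: the blank rates are missing values
# (as NULLs from a database usually are), not empty strings.
full_df = pd.DataFrame({
    "item": ["a", "b", "c", "d"],
    "rate": [1.5, np.nan, 2.0, np.nan],
})

# Inspect the column types first; here 'rate' is float64, not object.
print(full_df.dtypes)

# Comparing a numeric column against "" matches nothing...
empty_string_matches = full_df[full_df["rate"] == ""]
print(len(empty_string_matches))

# ...while isna() finds the missing values.
missing_matches = full_df[full_df["rate"].isna()]
print(missing_matches)
```

If the real column turns out to be object dtype with genuine empty strings, the == "" comparison is the right test instead.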
So, if your rate column looks like this
import pandas as pd

df = pd.DataFrame({'Rate': ['good', 'good', 'bad', 'medium', '', 'bad', '', 'good']})
df
     Rate
0    good
1    good
2     bad
3  medium
4
5     bad
6
7    good
then you could write
df.loc[df['Rate']==""]
and get
Rate
4
6
which is actually showing you the contents, but since there is nothing in there, it looks like just the row numbers. Let's add another column to see the results more plainly.
df['Color'] = ['Red', 'Blue', 'Yellow', 'Red', 'Yellow', 'Red', 'Green', 'Blue']
df
     Rate   Color
0    good     Red
1    good    Blue
2     bad  Yellow
3  medium     Red
4          Yellow
5     bad     Red
6           Green
7    good    Blue
and
df.loc[df['Rate'] == ""]
shows
   Rate   Color
4        Yellow
6         Green
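Since the question asks for a new dataframe, the selection can simply be assigned to a name. A minimal sketch, reusing the example columns from this answer: the name no_rate comes from the question, while the is_blank variable and the .copy() call are additions for illustration, the latter so that later edits to no_rate do not touch df.

```python
import pandas as pd

df = pd.DataFrame({
    "Rate": ["good", "good", "bad", "medium", "", "bad", "", "good"],
    "Color": ["Red", "Blue", "Yellow", "Red", "Yellow", "Red", "Green", "Blue"],
})

# Boolean mask: True where Rate is an empty string.
is_blank = df["Rate"] == ""

# Select the matching rows and copy them into an independent dataframe.
no_rate = df.loc[is_blank].copy()

print(no_rate.index.tolist())
print(no_rate["Color"].tolist())
```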
So, what if your rate is actually a number?
import numpy as np

df['Decimal_Rate'] = [.8, .8, .3, .6, np.nan, .2, np.nan, .9]
df
     Rate   Color  Decimal_Rate
0    good     Red           0.8
1    good    Blue           0.8
2     bad  Yellow           0.3
3  medium     Red           0.6
4          Yellow           NaN
5     bad     Red           0.2
6           Green           NaN
7    good    Blue           0.9
If you want to isolate the rows where the number is missing, you can do this:
df.loc[df['Decimal_Rate'].isna()]
which results in
   Rate   Color  Decimal_Rate
4        Yellow           NaN
6         Green           NaN
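If it is not clear which kind of blank a column holds, both tests can be combined. The following is only a sketch under the assumption that a "blank" cell may be a missing value, an empty string, or a whitespace-only string; the sample data and the names is_blank and blank_rows are invented for the example.

```python
import pandas as pd

# Hypothetical column mixing real values with three kinds of blanks.
df = pd.DataFrame({
    "Rate": ["good", "", "  ", None, "bad"],
})

# Replace missing values with "", strip whitespace, then test for emptiness.
is_blank = df["Rate"].fillna("").astype(str).str.strip().eq("")

blank_rows = df.loc[is_blank]
print(blank_rows.index.tolist())
```

The fillna("") step is what lets a single comparison cover missing values as well as empty text.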