Getting a syntax error when trying to delete rows based on a column value
Question
I have a DataFrame, df_daily, and I'd like to check it for duplicates based on my date column, which corresponds to df_daily[0]. Rows that are duplicates because they share the same date should be deleted, so that only the non-duplicated rows remain. I tried the following but am getting a syntax error:
dp_check = df_daily.drop[df_daily[df_daily[0].drop_duplicates(keep = False)].index, inplace = True]
Answer 1
Score: 1
Use Series.duplicated with the mask inverted by ~ in boolean indexing:
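import pandas as pd

# Sample data: column 0 plays the role of the date column
df_daily = pd.DataFrame({0: [4, 5, 4, 6, 5, 8]})
# keep=False marks every occurrence of a repeated value, so ~ keeps only unique rows
dp_check = df_daily[~df_daily[0].duplicated(keep=False)]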
Or use DataFrame.drop_duplicates with the subset parameter:
dp_check = df_daily.drop_duplicates(subset=[0], keep=False)
print(dp_check)
   0
3  6
5  8
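For completeness, the original attempt fails to parse because drop is called with square brackets instead of parentheses, and a keyword argument such as inplace = True cannot appear inside an indexing expression. The following is a minimal sketch of how that drop-by-index approach could be written, reusing the sample data from above; the names dup_mask, df_inplace and result are illustrative only and not part of the original post.

```python
import pandas as pd

df_daily = pd.DataFrame({0: [4, 5, 4, 6, 5, 8]})

# Boolean mask: True for every row whose value in column 0 occurs more than once.
dup_mask = df_daily[0].duplicated(keep=False)

# drop() is a method, so it is called with parentheses;
# it receives the index labels of the rows to remove.
dp_check = df_daily.drop(df_daily[dup_mask].index)

# With inplace=True, drop() modifies the frame itself and returns None,
# so its return value should not be assigned and used as the result.
df_inplace = df_daily.copy()
result = df_inplace.drop(df_inplace[dup_mask].index, inplace=True)
```

Here dp_check and df_inplace both end up containing only the rows whose value in column 0 is unique, while result is None. This is why combining an assignment with inplace=True, as in the question, would not give the intended frame even once the syntax is fixed.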