Getting a syntax error when I want to delete row based on column value
Question
I have a dataframe, df_daily, and I'd like to check for duplicates based on my date column, which is df_daily[0]. Rows that share the same date should all be deleted, so that only the rows with a unique date remain. I tried the following but I am getting a syntax error.
dp_check = df_daily.drop[df_daily[df_daily[0].drop_duplicates(keep = False)].index, inplace = True]
Answer 1
Score: 1
Use Series.duplicated and invert the mask with ~ for boolean indexing:
import pandas as pd
df_daily = pd.DataFrame({0:[4,5,4,6,5,8]})
dp_check = df_daily[~df_daily[0].duplicated(keep = False)]
Or use DataFrame.drop_duplicates with the subset parameter:
dp_check = df_daily.drop_duplicates(subset=[0], keep = False)
print(dp_check)
   0
3  6
5  8
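As for the syntax error itself: the original line indexes drop with square brackets, and a keyword argument such as inplace = True is not valid inside a subscript. Note also that drop(..., inplace=True) returns None, so assigning its result to dp_check would not give a dataframe. The following is a minimal sketch of a drop-based version, reusing the sample data from above; the names dup_index and first_kept are introduced here only for illustration. It also shows how keep="first" differs from keep=False, in case one row per date should be kept instead.

```python
import pandas as pd

df_daily = pd.DataFrame({0: [4, 5, 4, 6, 5, 8]})

# drop is a method: call it with parentheses and pass the row labels to remove.
# duplicated(keep=False) flags every row whose value occurs more than once.
dup_index = df_daily[df_daily[0].duplicated(keep=False)].index
dp_check = df_daily.drop(dup_index)  # returns a new dataframe, no inplace needed

# keep="first" retains the first row of each duplicated value instead.
first_kept = df_daily.drop_duplicates(subset=[0], keep="first")

print(dp_check[0].tolist())    # [6, 8]
print(first_kept[0].tolist())  # [4, 5, 6, 8]
```

With keep=False every row that has a repeated value is removed, which matches the output shown in the answer; with keep="first" the values 4 and 5 each survive once.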