How can I find a row inside a pandas DataFrame by its row data?
Question
Let's say that I have a pandas DataFrame:
import pandas as pd
df = pd.DataFrame({'id': [0, 2, 1], 'name': ['Sheldon', 'Howards', 'Leonard'], 'points': [10, 5, 20]})
I want to search this DataFrame for a row with the values {'id': 2, 'name': 'Howards', 'points': 5}.
How can I search for it and get its index, if it exists?
Here is my problem: I have a method that receives a dict with unknown keys and a DataFrame with unknown columns. I need to search the DataFrame to find out whether it contains the row I am looking for.
I found this answer, which mentions a method named iterrows. Is this the best way to find the row? Code:
import pandas as pd
df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})
df = df.reset_index()
search = {'c1': 12, 'c2': 120}
index = -1
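for idx, row in df.iterrows():
    # check each searched column; a whole-row `row == search` test does not work with a dict
    if all(row[k] == v for k, v in search.items()):
        index = idx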
If not, what is the best way?
Answer 1
Score: 1
With a DataFrame you can select/filter the data according to your needs.
You can include all the conditions, or just some of them.
The resulting DataFrame will contain all the rows matching the conditions.
This is more efficient than using loops.
import pandas as pd
df = pd.DataFrame({'id': [0, 2, 1], 'name': ['Sheldon', 'Howards', 'Leonard'], 'points': [10, 5, 20]})
row = df[(df['id']==2) & (df['name']=='Howards') & (df['points']==5) ]
print(row)
print("index=", row.index[0])
print("id=", row.iloc[0].id)
Result:
   id     name  points
1   2  Howards       5
index= 1
id= 2
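Since the question mentions a dict with unknown keys, the same boolean filter can also be built in a loop over the dict instead of being written out by hand. A minimal sketch, assuming a hypothetical helper named find_row_index (not part of pandas) and assuming that a key which is not a column simply means "no match":

```python
import pandas as pd

def find_row_index(df, search):
    """Hypothetical helper: index labels of the rows matching every key/value pair in `search`."""
    # Assumption: a key that is not a column of df cannot match any row.
    if not set(search).issubset(df.columns):
        return df.index[:0]
    # Start with "every row matches" and narrow the mask down one column at a time.
    mask = pd.Series(True, index=df.index)
    for column, value in search.items():
        mask &= df[column] == value
    return df.index[mask.to_numpy()]

df = pd.DataFrame({'id': [0, 2, 1], 'name': ['Sheldon', 'Howards', 'Leonard'], 'points': [10, 5, 20]})
matches = find_row_index(df, {'id': 2, 'name': 'Howards', 'points': 5})
print(matches.tolist())
```

An empty result means the row is not in the DataFrame, so `len(matches) > 0` can serve as the existence check.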
Answer 2
Score: 1
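With np.logical_and.reduce
on the filter clauses (search_d is the dict of searched values; np.logical_and itself takes only two input arrays, so reduce is needed to combine more than two conditions):
import numpy as np
search_d = {'id': 2, 'name': 'Howards', 'points': 5}
df.index[np.logical_and.reduce([df[k].eq(v) for k, v in search_d.items()])]
Index([1], dtype='int64')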
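A variant of the same idea, as a rough sketch: compare the searched columns against a Series built from the dict and keep the rows where every column matches. The names cols, mask and idx are only illustrative, and this assumes every key of search_d is an existing column (otherwise df[cols] raises a KeyError).

```python
import pandas as pd

df = pd.DataFrame({'id': [0, 2, 1], 'name': ['Sheldon', 'Howards', 'Leonard'], 'points': [10, 5, 20]})
search_d = {'id': 2, 'name': 'Howards', 'points': 5}

# Only the columns named in the dict take part in the comparison.
cols = list(search_d)
# The Series is aligned on the column labels, so each column is compared with its own value.
mask = (df[cols] == pd.Series(search_d)).all(axis=1)
idx = df.index[mask.to_numpy()]
print(idx.tolist())
```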