Select rows matching multiple conditions in pandas
Question
I have a DataFrame with many rows and columns, and a list of specific conditions.
An example DataFrame is below.
| index | fruit | recipe | size | price |
|---|---|---|---|---|
| 2 | apple | burn | big | 100 |
| 3 | banana | fry | small | 100 |
| 5 | apple | slice | big | 100 |
| 7 | apple | fry | small | 100 |
| 11 | pineapple | fry | small | 100 |
| 13 | mango | fry | small | 100 |
and a list of conditions:
order = [("apple", "fry", "big"), ("apple", "fry", "big"), ...]
`isin` does not work with multiple conditions.
I want to select only the rows whose (fruit, recipe, size) combination appears in `order`, without using `iterrows`.
Answer 1
Score: 1
You can use NumPy broadcasting to compare the values:
import numpy as np
order = [("apple", "fry", "big"), ("apple", "fry", "small")]
# Compare every row with every tuple in order; a row is kept if all three
# columns match (all) for at least one tuple (any).
mask = ((df[['fruit', 'recipe', 'size']].to_numpy()[:, None] == np.array(order))
        .all(axis=-1)
        .any(axis=-1))
out = df[mask]
$ print(mask)
[False False False  True False False]
$ print(out)
   fruit recipe   size  price
3  apple    fry  small    100
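For reference, here is a self-contained sketch of the broadcasting approach on the question's sample data. The DataFrame construction (with a default RangeIndex) and the names `cols` and `eq` are assumptions added for illustration; the answer itself does not show how `df` was built.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumption: default RangeIndex).
df = pd.DataFrame({
    "fruit": ["apple", "banana", "apple", "apple", "pineapple", "mango"],
    "recipe": ["burn", "fry", "slice", "fry", "fry", "fry"],
    "size": ["big", "small", "big", "small", "small", "small"],
    "price": [100] * 6,
})
order = [("apple", "fry", "big"), ("apple", "fry", "small")]
cols = ["fruit", "recipe", "size"]  # hypothetical helper name

# (n_rows, 1, 3) compared with (n_order, 3) broadcasts to
# (n_rows, n_order, 3): one boolean per row, per tuple, per column.
eq = df[cols].to_numpy()[:, None] == np.array(order)

# A row matches a tuple only if all three columns agree; keep the row
# if it matches any tuple in `order`.
mask = eq.all(axis=-1).any(axis=-1)
out = df[mask]
print(out)
```

Note that the intermediate boolean array holds n_rows × n_order × 3 elements, so memory use grows with the product of the DataFrame length and the length of `order`.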
Answer 2
Score: 1
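Assuming each tuple in the list `order` has exactly 3 elements, you could do the following without much of a performance penalty. Passing index=df.index keeps the boolean mask aligned with df when its index is not the default 0..n-1:
df.loc[pd.Series(list(zip(df['fruit'], df['recipe'], df['size'])), index=df.index).isin(order)]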
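Here is a self-contained sketch of the tuple-and-`isin` approach on the question's sample data, assuming the index column shown in the question is the DataFrame's actual index. The DataFrame construction and the `keys` name are illustrative assumptions. The Series of tuples is built with `index=df.index` so that the boolean mask aligns with `df` under `.loc`.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: the "index"
# column in the question is the DataFrame's index, not a regular column).
df = pd.DataFrame(
    {
        "fruit": ["apple", "banana", "apple", "apple", "pineapple", "mango"],
        "recipe": ["burn", "fry", "slice", "fry", "fry", "fry"],
        "size": ["big", "small", "big", "small", "small", "small"],
        "price": [100] * 6,
    },
    index=[2, 3, 5, 7, 11, 13],
)
order = [("apple", "fry", "big"), ("apple", "fry", "small")]

# One (fruit, recipe, size) tuple per row, labelled with df's own index
# so the mask lines up with df when used in .loc.
keys = pd.Series(list(zip(df["fruit"], df["recipe"], df["size"])), index=df.index)

# isin tests each tuple as a whole, so the three columns are matched jointly.
out = df.loc[keys.isin(order)]
print(out)
```

Because each row is reduced to a single tuple, the comparison treats the three columns as one combined key rather than checking each column independently.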