Pandas Sorting with condition
Question
Suppose I have the following dataframe:
price | position | datetime |
---|---|---|
30 | 1 | 2023-01-08 |
30 | 4 | 2023-01-08 |
140 | 2 | 2023-01-09 |
140 | 3 | 2023-01-09 |
50 | 1 | 2023-01-09 |
50 | 4 | 2023-01-09 |
260 | 2 | 2023-01-09 |
260 | 3 | 2023-01-09 |
The dataframe has both 'position' and 'datetime' in ascending order.
I want specific values in 'position' (1 and 4) to be sorted in descending order within the same 'datetime' value, so that the end result is as follows:
price | position | datetime |
---|---|---|
30 | 4 | 2023-01-08 |
30 | 1 | 2023-01-08 |
140 | 2 | 2023-01-09 |
140 | 3 | 2023-01-09 |
50 | 4 | 2023-01-09 |
50 | 1 | 2023-01-09 |
260 | 2 | 2023-01-09 |
260 | 3 | 2023-01-09 |
I tried this code:
df_trade = df_trade[df_trade.position == 4 or df_trade.position == 1].sort_values(by = ["position"], ascending = False)
but it raises:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I guess there must be a better, easier way.
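A note on the error itself: Python's `or` asks for the truth value of a whole Series, which pandas refuses to give, hence the ValueError. The sketch below rebuilds the sample data (the construction of `df_trade` is an assumption based on the table above) and shows the element-wise `|` operator and `isin` as two ways to write the same mask.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed dtypes).
df_trade = pd.DataFrame({
    "price": [30, 30, 140, 140, 50, 50, 260, 260],
    "position": [1, 4, 2, 3, 1, 4, 2, 3],
    "datetime": pd.to_datetime(["2023-01-08"] * 2 + ["2023-01-09"] * 6),
})

# `a or b` calls bool() on a Series, which is ambiguous and raises ValueError.
# The element-wise operator `|` (each side parenthesised) builds a boolean mask.
mask_or = (df_trade["position"] == 4) | (df_trade["position"] == 1)

# `isin` expresses the same membership test more compactly.
mask_isin = df_trade["position"].isin([1, 4])

# The filter now runs, but it keeps only the rows with position 1 or 4.
selected = df_trade[mask_isin].sort_values(by="position", ascending=False)
print(selected)
```

Fixing the mask removes the error, but the filter also drops the rows with positions 2 and 3, so on its own it does not produce the desired table.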
Answer 1
Score: 1
Use numpy.lexsort with a sort key derived from the "position" Series, in which the values 1 and 4 are negated so that they sort in descending order:
import numpy as np
order = np.lexsort([df['position'].mul(np.where(df['position'].isin([1,4]), -1, 1)), df['datetime']])
out = df.iloc[order]
Pandas equivalent:
(df.assign(key=df['position'].mul(np.where(df['position'].isin([1,4]), -1, 1)))
.sort_values(by=['datetime', 'key'])
.drop(columns='key')
)
Output:
price position datetime
1 30 4 2023-01-08
0 30 1 2023-01-08
5 50 4 2023-01-09
4 50 1 2023-01-09
2 140 2 2023-01-09
6 260 2 2023-01-09
3 140 3 2023-01-09
7 260 3 2023-01-09
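Note that the output above is ordered by 'datetime' and then by the key, so the rows with positions 2 and 3 end up after the 1/4 rows rather than staying where they were. If the original row order must be kept and only the 1/4 rows rearranged in place, one possible sketch (not part of the answer above; the names `mask`, `sub` and `order` are illustrative) is to sort just the masked rows and write their index labels back into the slots they occupied:

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed dtypes).
df = pd.DataFrame({
    "price": [30, 30, 140, 140, 50, 50, 260, 260],
    "position": [1, 4, 2, 3, 1, 4, 2, 3],
    "datetime": pd.to_datetime(["2023-01-08"] * 2 + ["2023-01-09"] * 6),
})

# Rows whose position is 1 or 4 are the only ones allowed to move.
mask = df["position"].isin([1, 4])

# Sort only those rows: datetime ascending, position descending.
sub = df[mask].sort_values(by=["datetime", "position"], ascending=[True, False])

# Start from the current row order, then overwrite the masked slots
# with the index labels of the re-sorted subset.
order = df.index.to_numpy().copy()
order[mask.to_numpy()] = sub.index.to_numpy()

out = df.loc[order]
print(out)
```

On the sample data this yields positions 4, 1, 2, 3, 4, 1, 2, 3, matching the table in the question. It assumes the 1/4 rows for a given 'datetime' already sit in the slots where they should remain.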