Pandas groupby and filter with multiple conditions

Question

        Date    Symbol   Open   High    Low  Close
0 2023-05-31  GEDIK.IS   7.90   8.01   7.77   7.87
1 2023-06-01  GEDIK.IS   7.92   8.20   7.89   8.14
2 2023-05-31  MIPAZ.IS   7.87   7.90   7.74   7.84
3 2023-06-01  MIPAZ.IS   7.84   8.06   7.80   8.05
4 2023-05-31  SUNTK.IS  36.20  37.52  35.48  37.00
5 2023-06-01  SUNTK.IS  37.20  38.30  36.60  38.30
6 2023-05-31  VANGD.IS   7.26   7.36   6.95   7.08
7 2023-06-01  VANGD.IS   7.09   7.63   6.92   7.48

I want to filter stocks from this DataFrame where the current day (06-01) has a positive candle (Close > Open) and makes a lower low (LL) relative to the previous day.
In this example it should return only VANGD.IS.
I can filter for LLs with:

df.groupby(['Symbol']).apply(lambda x: x[x["Low"] > x["Low"].shift(-1)])

However, this only returns the first row of the matching stock. I need to get the entire group.

                 Date    Symbol  Open  High   Low  Close
Symbol
VANGD.IS 6 2023-05-31  VANGD.IS  7.26  7.36  6.95   7.08

Then I can filter for stocks with a positive candle.
Also, there are only 4 stocks in this example. Keep in mind that the original df has nearly 500 stocks to be filtered.

Many thanks.

Answer 1

Score: 1

It looks like you want to extract the matching Symbol names, then use .isin() to find all corresponding rows.

If the rows for each Symbol are adjacent and in date order, you can replace the .groupby().apply with a vectorized comparison that also checks the shifted Symbol:

LLs = df.loc[
   (df['Symbol'] == df['Symbol'].shift(-1)) & 
   (df['Low'] > df['Low'].shift(-1)), 
   'Symbol'
]

df[df['Symbol'].isin(LLs)]
        Date    Symbol  Open  High   Low  Close
6 2023-05-31  VANGD.IS  7.26  7.36  6.95   7.08
7 2023-06-01  VANGD.IS  7.09  7.63  6.92   7.48
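
The snippet above covers only the lower-low condition. As a minimal sketch of how both conditions from the question might be combined, the following uses a per-group shift so that rows for a Symbol do not have to be adjacent. It assumes that "current day" means the latest Date in the frame; the sample frame is rebuilt from the question's table, and the variable names (prev_low, lower_low, positive_candle, is_latest_day) are illustrative only.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-05-31", "2023-06-01"] * 4),
    "Symbol": ["GEDIK.IS"] * 2 + ["MIPAZ.IS"] * 2 + ["SUNTK.IS"] * 2 + ["VANGD.IS"] * 2,
    "Open": [7.90, 7.92, 7.87, 7.84, 36.20, 37.20, 7.26, 7.09],
    "High": [8.01, 8.20, 7.90, 8.06, 37.52, 38.30, 7.36, 7.63],
    "Low": [7.77, 7.89, 7.74, 7.80, 35.48, 36.60, 6.95, 6.92],
    "Close": [7.87, 8.14, 7.84, 8.05, 37.00, 38.30, 7.08, 7.48],
})

# Sort so that shift(1) within each Symbol refers to the previous trading day.
df = df.sort_values(["Symbol", "Date"]).reset_index(drop=True)

# Previous day's Low per Symbol; NaN on each Symbol's first row.
prev_low = df.groupby("Symbol")["Low"].shift(1)

lower_low = df["Low"] < prev_low            # comparisons against NaN are False
positive_candle = df["Close"] > df["Open"]
# Assumption: the "current day" is the latest Date present in the frame.
is_latest_day = df["Date"] == df["Date"].max()

# Symbols whose latest row satisfies both conditions.
matching_symbols = df.loc[lower_low & positive_candle & is_latest_day, "Symbol"].unique()

# Pull every row of the matching Symbols, not just the row that matched.
result = df[df["Symbol"].isin(matching_symbols)]
print(result)
```

With the sample data this selects both VANGD.IS rows. Because the work is done with column-wise comparisons rather than a Python-level apply, the same approach should carry over to a frame with several hundred symbols.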

huangapple
  • Published on June 1, 2023, 22:26:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76382959.html