Creating Subtables from Dataframe Based on Column Value

Question

I have a large dataframe that looks like this (continuing onward):

[Image: sample of the dataframe (not preserved)]

I want to split it into subtables whose length varies with the distance between a BUY and the next SELL. Once a BUY has been matched with a SELL and the subtable created, the search moves on to the next BUY after that SELL.

That is, if several BUYs occur inside a subtable, they are skipped over until a SELL is found.

The goal is for it to look something like this:

[Images: the desired subtables (not preserved)]

Answer 1

Score: 1

Try this:

import pandas as pd
import numpy as np

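# Build a mock dataframe with fake prices and moving averages.
# Integer steps scaled by 0.01 avoid the off-by-one length that
# np.arange can produce with a floating-point step.
df = pd.DataFrame({'Current Time': ['18:08:20']*16,
                   'Price': np.round(157.01 + 0.01*np.arange(16), 2),
                   'Moving Avg': [np.nan]*3 + list(np.round(157.01 + 0.01*np.arange(13), 2)),
                   'Action': [np.nan]*3 + ['BUY']*3 + ['SELL']*3 + ['BUY'] + ['SELL'] + ['BUY']*4 + ['SELL']})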

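# Keep only the rows that carry a BUY or SELL signal.
df_new = df[df['Action'].notna()]

# Count the SELLs seen before each row, so every run that ends in a SELL shares one key.
grp = (df_new['Action'] == 'SELL').cumsum().shift().bfill()

# Split into one dataframe per key.
dd = dict(tuple(df_new.groupby(grp)))

# Drop single-row groups (a SELL with no BUY before it).
list_of_dfs = [g for _, g in dd.items() if len(g) > 1]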

list_of_dfs

Output:

[  Current Time   Price  Moving Avg Action
 3     18:08:20  157.04      157.01    BUY
 4     18:08:20  157.05      157.02    BUY
 5     18:08:20  157.06      157.03    BUY
 6     18:08:20  157.07      157.04   SELL,
    Current Time   Price  Moving Avg Action
 9      18:08:20  157.10      157.07    BUY
 10     18:08:20  157.11      157.08   SELL,
    Current Time   Price  Moving Avg Action
 11     18:08:20  157.12      157.09    BUY
 12     18:08:20  157.13      157.10    BUY
 13     18:08:20  157.14      157.11    BUY
 14     18:08:20  157.15      157.12    BUY
 15     18:08:20  157.16      157.13   SELL]
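
The grouping key is the heart of this approach, so here is a minimal sketch that wraps it in a function and uses a stricter filter. The helper name split_buy_sell and the rule "a subtable must start with BUY and end with SELL" are assumptions for illustration, not part of the answer above. Unlike the len(g) > 1 check, this rule would also drop a trailing run of BUYs that never reaches a SELL.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: only the 'Action' column drives the grouping.
actions = ([np.nan] * 3 + ["BUY"] * 3 + ["SELL"] * 3
           + ["BUY", "SELL"] + ["BUY"] * 4 + ["SELL"])
df = pd.DataFrame({
    "Price": np.round(157.01 + 0.01 * np.arange(16), 2),
    "Action": actions,
})


def split_buy_sell(frame, col="Action"):
    """Return sub-DataFrames that run from a BUY to the next SELL.

    Hypothetical helper: the name and the BUY-first / SELL-last rule
    are assumptions made for this sketch.
    """
    # Ignore rows that carry no signal.
    signals = frame[frame[col].notna()]
    # Count the SELL rows seen strictly before each row. Every row up to
    # and including a SELL shares one key; the key increases after it.
    key = (signals[col] == "SELL").cumsum().shift(fill_value=0)
    return [
        group
        for _, group in signals.groupby(key)
        if group[col].iloc[0] == "BUY" and group[col].iloc[-1] == "SELL"
    ]


subtables = split_buy_sell(df)
for sub in subtables:
    print(sub.index.tolist(), sub["Action"].tolist())
```

On the mock data this yields three subtables covering rows 3 to 6, 9 to 10, and 11 to 15, the same split as the output above. The two lone SELL rows (7 and 8) each form their own group and are filtered out because they do not start with a BUY.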

huangapple
  • Posted on 2023-03-23 11:24:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75819002.html