Fast data.table joins with time series

Question

I have two data sets with currency data prices (High, Low, Open and Close) on two different timeframes (1 hour and 5 minute).

I have price targets for each row of the 1 Hour timeframe data, and a target direction for the trade (either 1 or -1).

For trades with a direction of '1', I'm trying to find the first point on the 5 minute chart where the "low" is <= the target level. Conversely, for trades with a direction of '-1', I want to find the first point on the 5-minute chart where the "high" is >= the target trade level.

I've put the code below to demonstrate what I'm after.

The problem I'm having in practice is that I'm looking at time periods spanning 10 - 20 years, which makes the joins very slow.
I've set the example below to be from 2016 to 2022 so it's not too slow on my computer, but if you extend it out to more than 10 years it starts to really become a pain.

This may be something that I have to live with, but I'm looking for guidance on two things:

  1. Is there a faster / more efficient way of achieving what I've outlined below?
  2. You can see that I've separated the buy and sell trades into two joins at the end. Is there a way to combine that into one single join? It's not really that important as it doesn't take up too much 'space' in my code, but I'm interested for educational purposes.

Thanks heaps in advance,
Phil

# Load required packages
library(data.table)
library(dplyr)

# Define timeframes for the data
Start <- as.POSIXct("2016-01-01 00:00:00")
End <- as.POSIXct("2022-01-01 23:55:00")

Hours <- floor(as.numeric(difftime(End, Start, units = "hours"))) + 1
Minutes <- floor(as.numeric(difftime(End, Start, units = "mins")) / 5) + 1

# Create the Hourly data
set.seed(123)
hourly_prices <- data.table(
  datetime = seq(Start, End, by = "hour"),
  open = rnorm(Hours, mean = 100, sd = 1),
  high = rnorm(Hours, mean = 101, sd = 1),
  low = rnorm(Hours, mean = 99, sd = 1),
  close = rnorm(Hours, mean = 100, sd = 1),
  Direction = sample(c(1, -1), Hours, replace = TRUE)) %>%
  .[, Target_price := ifelse(Direction == -1, rnorm(.N, mean = 104, sd = 1), rnorm(.N, mean = 97, sd = 1))]

# Create the 5-minute data
set.seed(456)
minute_prices <- data.table(
  datetime = seq(Start, End, by = "5 min"),
  open = rnorm(Minutes, mean = 100, sd = 1),
  high = rnorm(Minutes, mean = 101, sd = 1),
  low = rnorm(Minutes, mean = 99, sd = 1),
  close = rnorm(Minutes, mean = 100, sd = 1),
  Position = seq_len(Minutes))

# Join the two data.tables to find the first 5-minute bar at which price passes each target level
# (mult = "first" keeps the earliest match; x.Position returns that bar's row index from minute_prices)
hourly_prices[(Direction == 1), Location := minute_prices[.SD,
  on = .(datetime > datetime, low <= Target_price), mult = "first", x.Position]]
hourly_prices[(Direction == -1), Location := minute_prices[.SD,
  on = .(datetime > datetime, high >= Target_price), mult = "first", x.Position]]

Answer 1

Score: 2

One simple ~3x speed-up would be to subset the minute_prices data.table to include only those rows that are the cumulative min/max for the current hour (minus 1 second, since the join is datetime > datetime):

dtM <- copy(minute_prices)
dtH <- copy(hourly_prices)

system.time({
  # Same joins as before, but against only the 5-minute bars that set a new running low/high within their hour
  dtH[(Direction == 1), Location := dtM[dtM[, low == cummin(low), as.integer(datetime - 1) %/% 3600L][[2]]][.SD,
    on = .(datetime > datetime, low <= Target_price), mult = "first", x.Position]]
  dtH[(Direction == -1), Location := dtM[dtM[, high == cummax(high), as.integer(datetime - 1) %/% 3600L][[2]]][.SD,
    on = .(datetime > datetime, high >= Target_price), mult = "first", x.Position]]
})
#>    user  system elapsed 
#>    6.77    0.01    6.81

system.time({
  # Original approach, for comparison
  hourly_prices[(Direction == 1), Location := minute_prices[.SD,
    on = .(datetime > datetime, low <= Target_price), mult = "first", x.Position]]
  hourly_prices[(Direction == -1), Location := minute_prices[.SD,
    on = .(datetime > datetime, high >= Target_price), mult = "first", x.Position]]
})
#>    user  system elapsed 
#>   22.97    0.00   23.04

identical(dtH, hourly_prices)
#> [1] TRUE
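
Why this subset is safe: a 5-minute bar that is not a new running low within its hour can never be the first bar to reach a downside (Direction == 1) target, because an earlier bar in the same hour, which still falls after the hourly entry, already traded at least as low; the mirror argument applies to running highs and Direction == -1 targets. For anyone parsing the nested one-liner, here is the buy-side filter spelled out step by step (the intermediate names keep_buy, hour_grp and dtM_buy are made up for illustration only):

# Spelled-out equivalent of dtM[dtM[, low == cummin(low), as.integer(datetime - 1) %/% 3600L][[2]]]
keep_buy <- dtM[, low == cummin(low),                                  # TRUE when a bar sets (or ties) a new running low...
                by = .(hour_grp = as.integer(datetime - 1) %/% 3600L)  # ...within its hour, shifted by 1 second to match the strict > join
               ][[2]]
dtM_buy <- dtM[keep_buy]  # only these bars can ever be the first to hit a downside target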

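On the second question, combining the buy and sell joins into one: one possible approach, offered here only as a rough sketch, is to stack the 5-minute bars once per direction with a single signed comparison column and then add Direction as an equality key to the non-equi join. The names minute_long, thr and Location2 are invented for illustration, and it is worth confirming that the result matches the two-join Location column before relying on it (in particular, that mult = "first" still returns the chronologically first bar once the extra key is added):

# Sketch: fold both directional joins into a single non-equi join.
# Direction ==  1 needs  low  <= Target_price
# Direction == -1 needs  high >= Target_price, i.e. -high <= -Target_price,
# so one "cmp <= thr" comparison covers both once the sign is folded in.
minute_long <- rbindlist(list(
  minute_prices[, .(Direction =  1, datetime, cmp =  low,  Position)],
  minute_prices[, .(Direction = -1, datetime, cmp = -high, Position)]
))

hourly_prices[, thr := fifelse(Direction == 1, Target_price, -Target_price)]
hourly_prices[, Location2 := minute_long[.SD,
  on = .(Direction, datetime > datetime, cmp <= thr),
  mult = "first", x.Position]]

identical(hourly_prices$Location, hourly_prices$Location2)  # should be TRUE if the sketch is correct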