Algorithm to calculate x(y) given x(t) and y(t), where x and y are sampled at different frequencies

Question

I'm trying to plot a graph of x(y), where x(t) and y(t) are different measurements coming from different devices. t is the Unix timestamp at which the measurement was taken from the device, so example data might look like this:

x = [(2.2, 1689178112), (2.3, 1689178113), (2.4, 1689178115), ...]
y = [(1.0, 1689178100), (2.0, 1689178200), (3.0, 1689178400), ...]

Answer 1

Score: 2

Unless there is a nicer built-in, you can do this yourself by interpolating.

The idea is:
"At time t1, I have a y value of Y1. I want to find my x value X1 at time t1 so that I can plot [Y1, X1]"

The issue, however, is that you are not guaranteed to have an X1 at time t1 in your array.

To solve this, you can interpolate your X1 value, i.e. estimate where x would fall at time t1 (assuming it changes linearly between samples), based on the surrounding points in the time-sorted array of x points.
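
As a concrete check, here is that step worked through by hand on two neighbouring samples from the x_points array used below (the variable names are just for this example); it is the same arithmetic the interpolate function performs:

# Bracketing samples (taken from x_points below): x = 2.3 at t = 1689178113
# and x = 2.4 at t = 1689178115; we want x at t = 1689178114.
ta, xa = 1689178113, 2.3
tb, xb = 1689178115, 2.4
t = 1689178114

fraction = (t - ta) / (tb - ta)     # 0.5 -> halfway between the two samples
x_at_t = xa + fraction * (xb - xa)  # 2.3 + 0.5 * (2.4 - 2.3)

print(round(x_at_t, 2))             # 2.35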

If t1 is earlier than the smallest time value in your context (the x array), no interpolation can be done, because interpolation needs a sample on each side. To get around this, the value at the earliest sample is used. The same applies when t1 is later than the latest sample in your context (the value at the latest sample is taken). In other words, the result is clamped to the endpoints.

Because of this, in your case x(y) is uninteresting (given the small data set): every point in the array of y points has a time value that is either smaller than the smallest x time value or larger than the largest x time value, so every y point maps to a clamped endpoint.

However, taking y(x) is more interesting, and you can see the interpolation happening!

x_points = [(2.2, 1689178112), (2.3, 1689178113), (2.4, 1689178115), (2.9, 1689178120)]
y_points = [(1.0, 1689178100), (2.0, 1689178200), (3.0, 1689178400)]

def interpolate(t, points):
    # Clamp to the endpoints when t falls outside the sampled time range.
    if t <= points[0][1]:
        return points[0][0]

    if t >= points[-1][1]:
        return points[-1][0]

    for i in range(1, len(points)):

        if t == points[i][1]:
            return points[i][0]

        # less than the current point, must be between this point and the last one
        if t < points[i][1]:
            p1 = points[i - 1]
            p2 = points[i]

            pct_interval = (t - p1[1]) / (p2[1] - p1[1])

            return p1[0] + pct_interval * (p2[0] - p1[0])

def make_x_of_y(x_points, y_points):
    points = []

    for y in y_points:
        points.append((y[0], interpolate(y[1], x_points)))

    return points

def make_y_of_x(x_points, y_points):
    points = []

    for x in x_points:
        points.append((x[0], interpolate(x[1], y_points)))

    return points

print(make_x_of_y(x_points, y_points))  # [(1.0, 2.2), (2.0, 2.9), (3.0, 2.9)], boring
print(make_y_of_x(x_points, y_points))  # [(2.2, 1.12), (2.3, 1.13), (2.4, 1.15), (2.9, 1.2)], not boring, linear interpolation!


def test_interpolate():
    res = True

    res &= round(interpolate(1689178111, x_points), 2) == 2.2
    res &= round(interpolate(1689178112, x_points), 2) == 2.2
    res &= round(interpolate(1689178113, x_points), 2) == 2.3
    res &= round(interpolate(1689178114, x_points), 2) == 2.35
    res &= round(interpolate(1689178115, x_points), 2) == 2.4
    res &= round(interpolate(1689178116, x_points), 2) == 2.5
    res &= round(interpolate(1689178117, x_points), 2) == 2.6
    res &= round(interpolate(1689178118, x_points), 2) == 2.7
    res &= round(interpolate(1689178119, x_points), 2) == 2.8
    res &= round(interpolate(1689178120, x_points), 2) == 2.9
    res &= round(interpolate(1689178121, x_points), 2) == 2.9

    return res

print(test_interpolate())  # True

A note: this can be optimized with a faster search for the two bracketing points (rather than iterating through every point). Each call to interpolate is a linear scan, so building the full mapping is O(n*m) over the two arrays; a binary search for the pair of points surrounding a given time t drops each lookup to O(log n).
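
For instance, here is a minimal sketch of that idea using the standard-library bisect module; it is a variant of the interpolate function above, assuming (as above) that the points are already sorted by timestamp:

from bisect import bisect_left

def interpolate_bisect(t, times, values):
    # times and values are the timestamps and measurements split out of the
    # (value, timestamp) pairs once, so each lookup is O(log n).
    if t <= times[0]:
        return values[0]

    if t >= times[-1]:
        return values[-1]

    i = bisect_left(times, t)  # first index with times[i] >= t

    if times[i] == t:
        return values[i]

    frac = (t - times[i - 1]) / (times[i] - times[i - 1])

    return values[i - 1] + frac * (values[i] - values[i - 1])

x_times = [p[1] for p in x_points]
x_values = [p[0] for p in x_points]

print(round(interpolate_bisect(1689178114, x_times, x_values), 2))  # 2.35, same as interpolate above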

Also, this assumes linear data between points in time. If this is not the case for your data, you will need a different interpolation function.
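
On the "nicer built-in" point: if pulling in NumPy is an option, numpy.interp performs the same linear interpolation (including the clamping to the first and last sample), and SciPy's interpolators (for example scipy.interpolate.CubicSpline) cover the non-linear case. A sketch of the NumPy route, assuming that dependency is acceptable:

import numpy as np

# Assumes NumPy is available; split the (value, timestamp) pairs from above.
x_vals, x_times = zip(*x_points)
y_vals, y_times = zip(*y_points)

# y interpolated at each x timestamp, paired back with the corresponding x value.
y_at_x_times = np.interp(x_times, y_times, y_vals)

print([(xv, round(float(yv), 2)) for xv, yv in zip(x_vals, y_at_x_times)])
# [(2.2, 1.12), (2.3, 1.13), (2.4, 1.15), (2.9, 1.2)] -- matches make_y_of_x above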
