April 11, 2023 02:22:43

Flux aggregate function to calculate the difference from last to first

Question

I'm trying to learn the Flux query language for InfluxDB. I'm using InfluxDB OSS 2.7.

I have a time series with power usage from my power meter. It reports an ever-increasing number in kWh, and I want to show how many Wh I have used per day by using a custom function with aggregateWindow. Here is what I have tried:

myFunc = (tables=<-, column) => {
  a = tables
    |> first(column: column)
    |> findRecord(fn: (key) => true, idx: 0)

  b = tables
    |> last(column: column)
    |> findRecord(fn: (key) => true, idx: 0)

  d = b._value - a._value

  return tables
    |> first()
    |> map(fn: (r) => ({ r with _value: d}))
}

from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  |> aggregateWindow(every: 1d, fn: myFunc, createEmpty: false)
  |> yield(name: "Wh")

But this returns a new table in which every _value is the same number (in my case 134).

I was hoping that the variables a and b would have the first and the last value of each window, and that d would represent the usage in each window - but this does not seem to be the case.

Answer 1

Score: 2

If you want to find the difference between the first and last values in each window, you can use the spread function.
To be precise, spread calculates the difference between the minimum and maximum values (not the first and last), but in an always-increasing series the two are the same.

However, this is not exact. Consider the following data:

timestamp	value
2023-04-12T00:00:00Z	100
2023-04-12T01:00:00Z	101
2023-04-12T02:00:00Z	102
2023-04-12T03:00:00Z	103
2023-04-12T04:00:00Z	104
2023-04-12T05:00:00Z	105
2023-04-12T06:00:00Z	106
2023-04-12T07:00:00Z	107
2023-04-12T08:00:00Z	108
2023-04-12T09:00:00Z	109
2023-04-12T10:00:00Z	110
2023-04-12T11:00:00Z	111
2023-04-12T12:00:00Z	112
2023-04-12T13:00:00Z	113
2023-04-12T14:00:00Z	114
2023-04-12T15:00:00Z	115
2023-04-12T16:00:00Z	116
2023-04-12T17:00:00Z	117
2023-04-12T18:00:00Z	118
2023-04-12T19:00:00Z	119
2023-04-12T20:00:00Z	120
2023-04-12T21:00:00Z	121
2023-04-12T22:00:00Z	122
2023-04-12T23:00:00Z	123
2023-04-13T00:00:00Z	124
2023-04-13T01:00:00Z	125
2023-04-13T02:00:00Z	126
2023-04-13T03:00:00Z	127
2023-04-13T04:00:00Z	128
2023-04-13T05:00:00Z	129
2023-04-13T06:00:00Z	130
2023-04-13T07:00:00Z	131
2023-04-13T08:00:00Z	132
2023-04-13T09:00:00Z	133
2023-04-13T10:00:00Z	134
2023-04-13T11:00:00Z	135
2023-04-13T12:00:00Z	136
2023-04-13T13:00:00Z	137
2023-04-13T14:00:00Z	138
2023-04-13T15:00:00Z	139
2023-04-13T16:00:00Z	140
2023-04-13T17:00:00Z	141
2023-04-13T18:00:00Z	142
2023-04-13T19:00:00Z	143
2023-04-13T20:00:00Z	144
2023-04-13T21:00:00Z	145
2023-04-13T22:00:00Z	146
2023-04-13T23:00:00Z	147

Then if you take the spread in each day you will get:

from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  |> aggregateWindow(every: 1d, fn: spread, createEmpty: false)
  |> yield(name: "Wh")

_time (window stop)	first	last	spread
2023-04-13T00:00:00Z	100	123	23
2023-04-14T00:00:00Z	124	147	23

The actual consumption for each day, however, is 24. By using spread you miss one interval per day (the consumption from 123 to 124 is never counted). Of course, if your data has higher granularity (e.g. every minute or every second), the missing value will be much less significant.

To solve this, I would suggest taking a single value for each day (the last) and then using the difference function. This computes a "rolling difference", subtracting the previous value from each value, and gives a better result (all hours are accounted for).

from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  |> aggregateWindow(every: 1d, fn: last, createEmpty: false)
  |> difference()
  |> yield(name: "Wh")

Aggregating with last will give:

_time (window stop)	last
2023-04-13T00:00:00Z	123
2023-04-14T00:00:00Z	147

Then applying difference will result in:

_time (window stop)	difference(last)
2023-04-14T00:00:00Z	24

> NOTE: the first day has no result because there is no earlier value to subtract from it. By default difference() drops that row; with keepFirst: true it is kept with a null value.
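
The question asks for Wh while the meter reports kWh, so a unit conversion can be appended to the last-plus-difference query. This is only a sketch: it assumes the field holds kWh, as stated in the question, and it wraps the value in float() so the multiplication works whether the field is stored as an integer or a float.

```flux
from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  // One reading per day: the last meter value (kWh) seen in each window.
  |> aggregateWindow(every: 1d, fn: last, createEmpty: false)
  // Daily consumption in kWh: each day's last reading minus the previous day's.
  |> difference()
  // Assumption: the meter value is in kWh, so multiply by 1000 to get Wh.
  |> map(fn: (r) => ({r with _value: float(v: r._value) * 1000.0}))
  |> yield(name: "Wh")
```

With the sample data above, the day that consumed 24 kWh would be reported as 24000 Wh.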
