How can I create a deviations query?

Question

I want to write a query that calculates deviations in the number of created orders.

Task: the query should look back 7 days and, based on that data, build a minimum allowable threshold (MAT). If the number of orders in the minimum time period (5 minutes) is less than the MAT, an alert should be generated.

Features: the number of orders depends directly on the time of day and on seasonality.

After searching the Internet, I found information about the Poisson distribution and tried to apply it to the problem, but it didn't work.

Prometheus has functions such as day_of_week(), avg_over_time() and stddev_over_time().

Here is what I have managed so far:

  1. The change in the number of orders over the last 5 minutes:
    sum(delta(my_search_counter{service_name="car.book.v1"}[5m]))
  2. The average of that five-minute change over the last week, at a resolution of 5 minutes:
    avg_over_time(sum(delta(my_search_counter{service_name="car.book.v1"}[5m]))[1w:5m])
  3. The standard deviation of the same values over the same window:
    stddev_over_time(sum(delta(my_search_counter{service_name="car.book.v1"}[5m]))[1w:5m])

This is where I'm stuck: I can't figure out how to build a proper query. Maybe there is a simpler way, but I haven't found it.

I tried combining these queries using addition, subtraction and division.

Answer 1

Score: 1

I'm not sure what kind of statistic this is, or how adequate it is as a threshold, but here is the query you described.

sum(increase(my_search_counter{service_name="car.book.v1"}[5m]))
< sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))
  - stddev_over_time(sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))[1d:5m])

It returns a value if the number of orders over the last 5 minutes is less than the number of orders over the same 5 minutes one week ago, minus the standard deviation of the 5-minute order counts over the 24 hours preceding that moment one week ago.

You might need to experiment a little with a multiplier on the stddev part to get a reasonable percentage of alerts.
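
As a rough sketch of that multiplier tweak, the stddev term can simply be scaled by a constant. The factor 2 below is an assumption chosen for illustration, not a value from the question or the answer; a larger factor lowers the threshold and should produce fewer alerts, a smaller one raises it.

```promql
# Fires when the last 5 minutes fall below
# (same 5 minutes one week ago) - 2 * (stddev of 5-minute counts over the preceding 24h, one week ago).
# The multiplier 2 is an illustrative assumption; tune it against your own data.
sum(increase(my_search_counter{service_name="car.book.v1"}[5m]))
< sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))
  - 2 * stddev_over_time(sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))[1d:5m])
```

Because multiplication binds more tightly than subtraction, and subtraction more tightly than comparison, no extra parentheses are needed: the right-hand side is evaluated as the week-old count minus the scaled standard deviation before it is compared with the current count.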

huangapple
  • Posted on June 1, 2023, 21:34:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76382456.html