InfluxDB: How to deal with missing data?

Question

Question Description

We run many time series queries, usually through a Python API. They sometimes cause problems and occasionally fail completely because data is missing.

Because of this, we are not sure where to learn about, and find an answer to, this specific question: how should we deal with missing data in our time series (InfluxDB) database?

Example

To describe the problem with an example:

We have some time series data. Say we measure room temperature across many rooms. Sometimes a sensor dies or stops working for a week or two before we replace it, and the data for that period is missing.

When we then run certain calculations, they fail. For example, computing the average temperature per day fails because some days have no measurements from the sensors.

One approach we thought of is to interpolate the data for those days: take the last value available before the gap and the first value available after it, and use them to fill in the days that have no data.

This has many downsides. The major one is that the filled-in data is fake and cannot be trusted, and for our more serious processes we would prefer not to store fake (or interpolated) data.

We would like to know what the possible alternatives are and where we can find resources to educate ourselves on this topic.

Answer 1

Score: 0

Answer

The idea is to represent the missing values (the gaps) as null, or None in Python. That way we can use InfluxDB's built-in fill() function:
https://docs.influxdata.com/influxdb/cloud/query-data/flux/fill/
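To make the idea concrete, here is a minimal pure-Python sketch of the same pattern, not InfluxDB's own implementation. It assumes readings arrive as (day, value) pairs; the helper names daily_means, fill_previous and fill_constant are hypothetical. Days without readings are emitted as None, roughly what Flux's aggregateWindow() produces with createEmpty: true, and are then filled in the spirit of fill(usePrevious: true) or fill(value: ...).

```python
from datetime import date, timedelta
from statistics import mean


def daily_means(readings, start, end):
    """Average readings per calendar day; days with no readings map to None."""
    buckets = {}
    for day, value in readings:
        buckets.setdefault(day, []).append(value)

    result = []
    day = start
    while day <= end:
        values = buckets.get(day)
        # An empty day becomes an explicit null instead of raising an error.
        result.append((day, mean(values) if values else None))
        day += timedelta(days=1)
    return result


def fill_previous(series):
    """Replace None with the last non-null value; leading gaps stay None."""
    filled = []
    last = None
    for day, value in series:
        if value is None:
            value = last
        else:
            last = value
        filled.append((day, value))
    return filled


def fill_constant(series, constant):
    """Replace None with a fixed value."""
    return [(day, constant if value is None else value) for day, value in series]


# Hypothetical readings: the sensor was down on 2 and 3 February.
readings = [
    (date(2023, 2, 1), 20.0),
    (date(2023, 2, 1), 22.0),
    (date(2023, 2, 4), 19.0),
]
series = daily_means(readings, date(2023, 2, 1), date(2023, 2, 4))
print(series)
print(fill_previous(series))
```

Because the gaps are explicit None values, the daily average no longer fails on empty days, and whether to fill them at all remains a separate decision.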

Once the null values are filled, we can run any additional queries and analysis on the data.

The documentation linked above describes the methods we can use to fill in the missing values.
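Since the question is wary of storing fake data, one possible way to apply filling is to do it only at read time and to mark every filled point. The sketch below is an assumption about how that could look in Python, not something taken from the InfluxDB documentation; Point, fill_with_flag and mean_measured are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    day: str
    value: Optional[float]
    filled: bool = False  # True only for values produced by filling


def fill_with_flag(points):
    """Forward-fill gaps in a new list, marking every filled point."""
    out = []
    last = None
    for p in points:
        if p.value is None and last is not None:
            out.append(Point(p.day, last, filled=True))
        else:
            if p.value is not None:
                last = p.value
            out.append(p)
    return out


def mean_measured(points):
    """Mean over measured values only; None if nothing was measured."""
    measured = [p.value for p in points if p.value is not None and not p.filled]
    return sum(measured) / len(measured) if measured else None


# Hypothetical stored series with a one-day gap.
stored = [
    Point("2023-02-01", 21.0),
    Point("2023-02-02", None),
    Point("2023-02-03", 19.0),
]
view = fill_with_flag(stored)  # read-time view; stored is left unchanged
print(view)
print(mean_measured(view))
```

The stored series keeps its gaps, charts and joins can use the filled view, and stricter calculations can ignore anything flagged as filled.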

huangapple
  • Posted on 2023-02-06 19:44:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75360882.html