Question

I want a graph that shows the total request count for each IP address that recently made requests to my web server. Is something like this doable? Can I add a query and remove it afterwards via Prometheus?

Answer 1

Score: 3

Technically, yes. You will need to:

1. Expose a metric (probably a counter) from your server, say requests_count, with a label, say ip.
2. Whenever you receive a request, increment the metric with the ip label set to the requester's IP address.
3. In Grafana, graph the metric, likely summing it by IP address to handle the case where several horizontally scaled servers handle requests: sum(your_prometheus_namespace_requests_count) by (ip)
4. Set the Legend of the graph in Grafana to {{ ip }} to name each line after the IP address it represents.
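
As a rough illustration of the first two steps, here is a minimal sketch of a counter with an ip label, written with only the Python standard library. The class name PerIpRequestCounter and its methods are hypothetical and made up for this example; a real service would normally use an official Prometheus client library instead, and would serve the rendered text from a /metrics endpoint for Prometheus to scrape.

```python
import threading
from collections import defaultdict


class PerIpRequestCounter:
    """Hypothetical stand-in for a client-library counter with an `ip` label."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)  # one entry (one time series) per IP
        self._lock = threading.Lock()

    def inc(self, ip, amount=1.0):
        # Counters only ever go up, so reject negative increments.
        if amount < 0:
            raise ValueError("counters can only increase")
        with self._lock:
            self._values[ip] += amount

    def render(self):
        """Return the metric in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for ip, value in sorted(self._values.items()):
                lines.append(f'{self.name}{{ip="{ip}"}} {value}')
        return "\n".join(lines) + "\n"


requests_count = PerIpRequestCounter(
    "requests_count", "Requests received, by client IP."
)

# Simulate three incoming requests from two clients.
for client_ip in ["192.168.0.1", "192.168.0.2", "192.168.0.1"]:
    requests_count.inc(client_ip)

metrics_text = requests_count.render()
print(metrics_text)
```

Every new IP adds another line to the rendered output, and each of those lines becomes its own time series once Prometheus scrapes it.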

However, every distinct label value creates a whole new time series in the Prometheus time-series database; in terms of memory consumption, you can think of requests_count{ip="192.168.0.1"}=1 as roughly equivalent to a separate metric named requests_count_ip_192_168_0_1{}=1. Each series currently held in the Prometheus TSDB head takes something on the order of 3kB. That means that if you're handling requests from millions of distinct IP addresses, this one metric alone will swamp Prometheus' memory with gigabytes of data. A more detailed explanation of this issue is in this other answer: https://stackoverflow.com/a/69167162/511258
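
To put rough numbers on that, here is a back-of-the-envelope sketch. It takes the ~3kB figure above at face value, and the function name and the servers parameter are assumptions made for this example. Multiplying by the number of servers is an upper bound: it applies when every scraped server sees every IP, since each server then contributes its own series per IP.

```python
BYTES_PER_SERIES = 3_000  # rough ~3 kB per in-memory series, as quoted above


def estimated_memory_gb(distinct_ips, servers=1, bytes_per_series=BYTES_PER_SERIES):
    """Estimate memory used by a per-IP metric, in decimal gigabytes."""
    series = distinct_ips * servers  # one series per (IP, server) pair
    return series * bytes_per_series / 1e9


for ips in (100, 10_000, 1_000_000):
    print(f"{ips:>9} IPs -> {estimated_memory_gb(ips):.4f} GB")
```

Under these assumptions a hundred known clients cost next to nothing, while a million distinct IPs land at around 3 GB for this one metric.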

With that in mind, this approach would make sense if you know for a fact you expect a small volume of IP addresses to connect (maybe on an internal intranet, or a client you distribute to a small number of known clients), but if you are planning to deploy to the web this would allow a very easy way for people to (unknowingly, most likely) crash your monitoring systems.

You may want to investigate an alternative -- for example, Grafana is capable of ingesting data from some common log aggregation platforms, so perhaps you can do some structured (e.g. JSON) logging, hold that in e.g. Elasticsearch, and then create a graph from the data held within that.
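
As a sketch of what the structured-logging side could look like, the following emits one JSON object per request using only the Python standard library. The field names (ip, path), the logger name, and the in-memory stream are assumptions for illustration; shipping the lines to Elasticsearch and querying them from Grafana is not shown. The final count by IP only mimics, locally, the kind of aggregation the log store would run at query time.

```python
import io
import json
import logging
from collections import Counter
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "message": record.getMessage(),
            "ip": getattr(record, "ip", None),
            "path": getattr(record, "path", None),
        })


stream = io.StringIO()  # stands in for a file or stdout that a log shipper reads
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("access")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output out of the root logger

# Simulate three incoming requests from two clients.
for client_ip in ["192.168.0.1", "192.168.0.2", "192.168.0.1"]:
    logger.info("request", extra={"ip": client_ip, "path": "/"})

# Counting by IP is what the log store's aggregation would do at query time.
records = [json.loads(line) for line in stream.getvalue().splitlines()]
requests_per_ip = Counter(r["ip"] for r in records)
print(requests_per_ip.most_common())
```

Because each request is just a log line rather than a long-lived time series, the number of distinct IPs affects storage in the log platform, not the memory of the monitoring system.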
