2016-11-03 01:53:10 · go

What's the best way to push kafka messages from my edge nodes?

Question

I have a worker in the primary region (US-East) that computes data on traffic at our edge locations. I want to push the data from an edge region to our primary Kafka region.

Examples are Poland, Australia, and US-West. I want to push all these stats to US-East. I don't want to incur additional latency during the writes from the edge regions to the primary.

Another option is to create another kafka cluster and worker that acts as a relay. That would require us to maintain individual clusters in each region and would add a lot more complexity to our deployments.

I've seen MirrorMaker, but I don't really want to mirror anything; I guess I'm looking more for a relay system. If this isn't the designed way to do this, how can I aggregate all of our application metrics to the primary region to be computed and sorted?

Thank you for your time.

Answer 1

Score: 1

As far as I know, here are your options:

1. Set up a local Kafka cluster in each region and have your edge nodes write to their local Kafka cluster for low-latency writes. From there, set up MirrorMaker to pull data from your local Kafka to your remote Kafka for aggregation.
2. If you're concerned about interrupting your application's request path with high-latency blocking requests, then you may want to configure your producers to write asynchronously (non-blocking) to your remote Kafka cluster. Depending on your programming language, this could be a simple or a complex exercise.
3. Run a per-host relay (or data buffer) service that could be as simple as a log file and a daemon that pushes to your remote Kafka cluster (as mentioned above). Alternatively, run a single-instance Kafka / Zookeeper container (there are Docker images that bundle both together) that buffers the data for downstream pulling.

Option 1 is definitely the most standard solution to this problem, albeit a bit heavy-handed. I suspect more tooling will come out of the Confluent / Kafka folks to support option 3 in the future.
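
To make the non-blocking idea from option 2 more concrete, here is a minimal Go sketch using only the standard library. It is not tied to any particular Kafka client: `Sender`, `memorySink`, `Relay`, `NewRelay`, and `Enqueue` are all hypothetical names invented for illustration, and `memorySink` merely stands in for whatever producer you actually use. The request path only does a channel send that never blocks; a background goroutine pays the cross-region latency.

```go
package main

import (
	"fmt"
	"sync"
)

// Sender is a hypothetical stand-in for a real Kafka producer client.
type Sender interface {
	Send(batch []string) error
}

// memorySink is a fake Sender that exists only to make this sketch runnable.
type memorySink struct {
	mu   sync.Mutex
	msgs []string
}

func (s *memorySink) Send(batch []string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.msgs = append(s.msgs, batch...) // copies the elements out of batch
	return nil
}

func (s *memorySink) Count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.msgs)
}

// Relay buffers messages in memory so callers never wait on the remote write.
type Relay struct {
	queue chan string
	done  chan struct{}
}

// NewRelay starts a goroutine that forwards queued messages to out,
// flushing when a batch is full or the queue is momentarily empty.
func NewRelay(out Sender, queueSize, batchSize int) *Relay {
	r := &Relay{queue: make(chan string, queueSize), done: make(chan struct{})}
	go func() {
		defer close(r.done)
		batch := make([]string, 0, batchSize)
		flush := func() {
			if len(batch) == 0 {
				return
			}
			if err := out.Send(batch); err != nil {
				// Assumption: metrics are lossy-tolerant, so a failed batch is dropped.
				fmt.Println("send failed, dropping batch:", err)
			}
			batch = batch[:0]
		}
		for msg := range r.queue {
			batch = append(batch, msg)
			if len(batch) >= batchSize || len(r.queue) == 0 {
				flush()
			}
		}
		flush() // deliver anything left once the queue is closed
	}()
	return r
}

// Enqueue never blocks. It returns false when the buffer is full, leaving
// the caller to drop or count the message. Do not call it after Close.
func (r *Relay) Enqueue(msg string) bool {
	select {
	case r.queue <- msg:
		return true
	default:
		return false
	}
}

// Close stops accepting messages and waits for the buffer to drain.
func (r *Relay) Close() {
	close(r.queue)
	<-r.done
}

func main() {
	sink := &memorySink{}
	relay := NewRelay(sink, 1024, 100)
	for i := 0; i < 10; i++ {
		relay.Enqueue(fmt.Sprintf("edge-stat-%d", i))
	}
	relay.Close()
	fmt.Println("delivered:", sink.Count())
}
```

The trade-off in this sketch is that a full buffer or a failed send loses data. That may be acceptable for traffic statistics, but it is an assumption you would want to check against your own requirements; many Kafka client libraries also offer their own asynchronous producer mode that could replace this hand-rolled queue.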

Answer 2

Score: 0

Write the messages to a local logfile on disk. Write a small daemon which reads the logfile and pushes the events to the main Kafka cluster.
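
A rough Go sketch of that logfile-plus-daemon idea, using only the standard library, might look like the following. The names `appendEvent` and `forwardNew`, the file name, and the `send` callback are all made up for illustration; in a real daemon `send` would wrap your Kafka producer, and the returned offset would need to be persisted so a restart does not resend everything.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// appendEvent is what the application calls: a cheap local append.
func appendEvent(path, event string) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintln(f, event)
	return err
}

// forwardNew reads complete lines written after offset, passes each one to
// send, and returns the offset to resume from on the next pass.
func forwardNew(path string, offset int64, send func(string) error) (int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return offset, err
	}
	defer f.Close()
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return offset, err
	}
	r := bufio.NewReader(f)
	for {
		line, err := r.ReadString('\n')
		if err == io.EOF {
			return offset, nil // a partial trailing line is retried next pass
		}
		if err != nil {
			return offset, err
		}
		if err := send(line[:len(line)-1]); err != nil {
			return offset, err // offset not advanced, so this line is retried
		}
		offset += int64(len(line))
	}
}

func main() {
	path := filepath.Join(os.TempDir(), "edge-events.log") // hypothetical location
	os.Remove(path)                                        // start the demo from an empty file

	appendEvent(path, "poland requests=120")
	appendEvent(path, "australia requests=75")

	// The daemon would call forwardNew in a loop, sleeping between passes.
	offset, err := forwardNew(path, 0, func(event string) error {
		fmt.Println("would push to Kafka:", event)
		return nil
	})
	fmt.Println("next offset:", offset, "err:", err)
}
```

Because the offset only advances after `send` succeeds, a remote outage just delays delivery instead of losing events, at the cost of possible duplicates if the daemon crashes between sending and recording the offset.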

To increase throughput and limit the effect of latency, you could also rotate the logfile every minute. Then rsync the logfile to your main Kafka region every minute with a cron job, and let the import daemon run there.
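
One simple way to get per-minute rotation without a separate rotation tool is to derive the file name from the current minute, so that any file from an earlier minute is closed and safe to copy. The Go sketch below assumes a naming scheme of my own choosing (`prefix-YYYYMMDDHHMM.log` in UTC); `rotatedName` is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"time"
)

// rotatedName returns the log file name for the minute containing unixSec.
// All writes within the same UTC minute map to the same file.
func rotatedName(prefix string, unixSec int64) string {
	stamp := time.Unix(unixSec, 0).UTC().Format("200601021504")
	return prefix + "-" + stamp + ".log"
}

func main() {
	now := time.Now().Unix()
	fmt.Println("write to:     ", rotatedName("edge-stats", now))
	fmt.Println("safe to rsync:", rotatedName("edge-stats", now-60))
}
```

The cron job would then only ship files whose name is older than the current minute, which avoids copying a file that is still being written.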
