How do I find the event-time difference between two non-consecutive events in Flink?
Question
I want to calculate the time difference between two non-consecutive events in Apache Flink. An event consists of a name and a timestamp. For example:
E1("A", timestamp) -> E2("B", timestamp) -> E3("C", timestamp)
In this case, I want to calculate the timestamp difference between E3 and E1. Any ideas on how to do this in Flink?
Answer 1
Score: 1
Since E1 and E3 have different keys, you'd need to use a non-keyed window (.windowAll()); see the Flink documentation on windows. Since you typically can't rely on the original order of events being preserved, your custom window function (a ProcessAllWindowFunction in the non-keyed case) would have to sort the events by timestamp to be able to calculate a delta reliably.
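The sort-then-diff step can be sketched in plain Java, independent of Flink. This is only an illustration under assumptions: the Event class, the EventDelta helper, and the idea of matching events by name ("A" as the start, "C" as the end) are hypothetical and not taken from the question's actual code, and timestamps are assumed to be epoch milliseconds.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.OptionalLong;

public class EventDelta {

    /** Hypothetical event type: a name plus an event-time timestamp (assumed epoch millis). */
    public static final class Event {
        public final String name;
        public final long timestamp;

        public Event(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }
    }

    /**
     * Sorts the window contents by timestamp, then returns the time between the
     * first event named startName and the first later event named endName.
     * Returns an empty result if no such pair exists in the window.
     */
    public static OptionalLong delta(Iterable<Event> windowContents,
                                     String startName,
                                     String endName) {
        // Copy into a list, because arrival order inside the window is not guaranteed.
        List<Event> sorted = new ArrayList<>();
        windowContents.forEach(sorted::add);
        sorted.sort(Comparator.comparingLong((Event e) -> e.timestamp));

        Long start = null;
        for (Event e : sorted) {
            if (start == null && e.name.equals(startName)) {
                start = e.timestamp;                 // remember the start event's time
            } else if (start != null && e.name.equals(endName)) {
                return OptionalLong.of(e.timestamp - start);
            }
        }
        return OptionalLong.empty();
    }
}
```

In a Flink job, a helper like this would presumably be called from the process(context, elements, out) method of a ProcessAllWindowFunction, passing the elements iterable and emitting the result through the collector when one is present.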