英文:
Flink pipeline with firing results on event
问题
-
使用Apache Flink是否有一种方法来执行以下操作:
- 合并具有相同地址的对象的组织列表
- 在发生某个事件时将所有结果发送到Sink。例如,当用户发送控制消息到Kafka主题或另一个数据源时
- 保留所有对象以供将来累积
-
我尝试使用全局窗口和自定义触发器,但似乎只返回控制元素作为结果,其他元素被忽略。
英文:
I have stream of objects with address and list of organizations:
@Data
class TaggedObject {
String address;
List<String> organizations;
}
Is there a way to do the following using apache flink:
- Merge organization lists for objects with same address
- Send all results to Sink when some event occurs. E.g. when user sends control message to a kafka topic or another DataSource
- Keep all objects for future accumulations
I tried using global window and custom trigger:
public class MyTrigger extends Trigger<TaggedObject, GlobalWindow> {
@Override
public TriggerResult onElement(TaggedObject element, long timestamp, GlobalWindow window, TriggerContext ctx) throws Exception {
if (element instanceof Control) return TriggerResult.FIRE;
else return TriggerResult.CONTINUE;
}
But it seems to give only Control element as a result. Other elements were ignored.
答案1
得分: 1
如果您想要一个通用的控制信号,以触发所有地址的输出,那么您需要使用广播流。您将地址流与控制流相结合,然后在自定义的KeyedBroadcastProcessFunction
实现中执行相应的逻辑(合并地址的组织或触发输出)。
英文:
If you want a generic control signal that triggers output for ALL addresses, then you'll need to use a broadcast stream. You combine your stream of addresses with your control stream and then perform the appropriate logic (merging organizations for an address, or triggering output) inside of your custom implementation of a KeyedBroadcastProcessFunction
.
答案2
得分: 0
看起来你应该只是按地址键入流,然后使用KeyedProcessFunction(带有List或MapState)来存储不同的组织。然后,一旦有事件进来,你可以直接输出状态中的条目。
Kind Regards
Dominik
英文:
It seems like you should just key the stream by address and then use a KeyedProcessFunction (with a List- or MapState) to store the different organizations. Then as soon as an event comes in, you can just output the entries of the State.
Kind Regards
Dominik
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论