Apache Beam: how to GroupBy in a List of Objects
Question
I have a list of Car objects in a PCollection.
PCollection<List<Car>>
Each car has a color.
I want to group this list by color, so that the color is the key and the cars that have that color are the values, and end up with a KV<String, List<Car>>:
{"red":[car1,car2],"green":[car3,car4]}
Car car1 = new Car();
Car car2 = new Car();
Car car3 = new Car();
Car car4 = new Car();
car1.setColor("red");
car2.setColor("red");
car3.setColor("green");
car4.setColor("green");
final List<Car> cars = Arrays.asList(car1, car2, car3, car4);
PCollection<Car> carsCollection = pipeline.apply(Create.of(cars));
PCollection<KV<String, List<Car>>> sortedCars = carsCollection.apply(...)
Maybe something like this would work:
// WithKeys only attaches a key to each element, so the result is KV<String, Car>, not yet grouped.
PCollection<KV<String, Car>> keyedCars =
    carsCollection.apply(WithKeys.of(new SimpleFunction<Car, String>() {
      @Override
      public String apply(Car car) {
        return car.getColor();
      }
    }));
Answer 1
Score: 2
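You can use the core GroupByKey transform.
For your WithKeys you can also use a lambda. With a lambda, the key type has to be supplied explicitly via withKeyType:
carsCollection.apply(WithKeys.of((Car x) -> x.getColor()).withKeyType(TypeDescriptors.strings())).apply(GroupByKey.create())
This will produce a PCollection<KV<String, Iterable<Car>>>. Note that the grouped values are an Iterable<Car>, not a List<Car>.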
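For reference, here is a minimal end-to-end sketch of the approach above. It assumes the Beam Java SDK and a runner such as the direct runner are on the classpath. The Car class, its name field, the class name GroupCarsByColor, and the helper groupByColor are hypothetical stand-ins, not part of the question. The final MapElements step, which copies each Iterable<Car> into a List<Car>, is an addition of this sketch for the case where the KV<String, List<Car>> shape from the question is really needed.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.SerializableCoder;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;

public class GroupCarsByColor {

  // Hypothetical stand-in for the question's Car class; Serializable so Beam can encode it.
  public static class Car implements Serializable {
    private final String name;
    private String color;

    public Car(String name) { this.name = name; }
    public void setColor(String color) { this.color = color; }
    public String getColor() { return color; }
    public String getName() { return name; }
  }

  // Keys each car by its color, groups by that key, then copies each group into a List.
  public static PCollection<KV<String, List<Car>>> groupByColor(PCollection<Car> cars) {
    return cars
        // A lambda loses its type information, so the key type is given explicitly.
        .apply(WithKeys.of((Car car) -> car.getColor())
            .withKeyType(TypeDescriptors.strings()))
        // Produces KV<String, Iterable<Car>>, one element per distinct color.
        .apply(GroupByKey.create())
        // Optional step (assumption of this sketch): materialize the Iterable as a List.
        .apply(MapElements.via(
            new SimpleFunction<KV<String, Iterable<Car>>, KV<String, List<Car>>>() {
              @Override
              public KV<String, List<Car>> apply(KV<String, Iterable<Car>> group) {
                List<Car> list = new ArrayList<>();
                group.getValue().forEach(list::add);
                return KV.of(group.getKey(), list);
              }
            }));
  }

  public static void main(String[] args) {
    Car car1 = new Car("car1");
    Car car2 = new Car("car2");
    Car car3 = new Car("car3");
    Car car4 = new Car("car4");
    car1.setColor("red");
    car2.setColor("red");
    car3.setColor("green");
    car4.setColor("green");
    List<Car> cars = Arrays.asList(car1, car2, car3, car4);

    Pipeline pipeline = Pipeline.create();
    PCollection<Car> carsCollection =
        pipeline.apply(Create.of(cars).withCoder(SerializableCoder.of(Car.class)));

    groupByColor(carsCollection)
        // Print each group; the order of groups and of cars within a group is not guaranteed.
        .apply(MapElements.into(TypeDescriptors.strings())
            .via((KV<String, List<Car>> kv) -> {
              String line = kv.getKey() + ": " + kv.getValue().size() + " car(s)";
              System.out.println(line);
              return line;
            }));

    pipeline.run().waitUntilFinish();
  }
}
```

Copying a group into a List holds the whole group in memory, so for very large groups it may be preferable to keep working with the Iterable<Car> that GroupByKey returns.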