Simplify a Java stream to find duplicate properties
Question
I have a users list and I want to find all users having duplicate names:
var allNames = users
    .stream()
    .map(u -> u.getName())
    .collect(Collectors.toList());
var duplicateNames = allNames
    .stream()
    .filter(i -> Collections.frequency(allNames, i) > 1)
    .collect(Collectors.toSet());
Can I improve or simplify the above solution?
For example, I currently create a list of all names and then filter it. How can I traverse the list to find its duplicate names without creating the additional list allNames?
Answer 1
Score: 7
One solution is:
var duplicate = users.stream()
    // value starts as false and flips to true when a name occurs again
    .collect(Collectors.toMap(User::getName, u -> false, (x, y) -> true))
    .entrySet().stream()
    .filter(Map.Entry::getValue)
    .map(Map.Entry::getKey)
    .collect(Collectors.toSet());
This creates an intermediate Map<String, Boolean> that records which names occur more than once. You could use the keySet() of that map instead of collecting to a new Set:
var duplicate = users.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toMap(User::getName, u -> false, (x, y) -> true, HashMap::new),
        m -> {
            // drop the names that occurred only once, then expose the remaining keys
            m.values().removeIf(dup -> !dup);
            return m.keySet();
        }));
A loop solution can be much simpler:
HashSet<String> seen = new HashSet<>(), duplicate = new HashSet<>();
for (User u : users) {
    // add() returns false when the name is already in the set
    if (!seen.add(u.getName())) duplicate.add(u.getName());
}
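To compare the stream and loop variants side by side, here is a minimal self-contained sketch. The nested User class is a hypothetical stand-in for the question's type (only getName() is assumed), and the method names viaToMap and viaLoop are made up for this illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class DuplicateNamesDemo {
    // Hypothetical stand-in for the question's User class.
    static final class User {
        private final String name;
        User(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Stream variant: the map value flips to true on the first key collision.
    static Set<String> viaToMap(List<User> users) {
        return users.stream()
            .collect(Collectors.toMap(User::getName, u -> false, (x, y) -> true))
            .entrySet().stream()
            .filter(Map.Entry::getValue)
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    // Loop variant: Set.add returns false when the name was already seen.
    static Set<String> viaLoop(List<User> users) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicate = new HashSet<>();
        for (User u : users) {
            if (!seen.add(u.getName())) {
                duplicate.add(u.getName());
            }
        }
        return duplicate;
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("ann"), new User("bob"), new User("ann"),
            new User("cy"), new User("bob"), new User("ann"));
        // Both variants should agree on the set of duplicated names.
        System.out.println(viaToMap(users).equals(viaLoop(users)));
    }
}
```

Both variants make a single pass over users; the loop simply avoids the second pass over the intermediate map's entries.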
Answer 2
Score: 2
Group by name, then keep the groups that contain more than one user:
// name -> all users with that name
Map<String, List<User>> grouped = users.stream()
    .collect(Collectors.groupingBy(User::getName));
// flatten the groups that have at least two members
List<User> duplicated =
    grouped.values().stream()
        .filter(v -> v.size() > 1)
        .flatMap(List::stream)
        .collect(Collectors.toList());
(You can do this in a single expression if you want. I only separated the steps to make it a little clearer what is happening.)
Note that this does not preserve the order of the users from the original list, because groupingBy makes no guarantee about the iteration order of the map it returns.
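If the original order matters, one possible workaround is to count names first and then filter the original list in a second pass. The following is only a sketch under assumptions: the nested User class with an id field is hypothetical, added so the order of the result is visible, and duplicatedInOrder is a made-up name.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateUsersInOrder {
    // Hypothetical stand-in for the question's User class; id is only for the demo.
    static final class User {
        private final String name;
        private final int id;
        User(String name, int id) { this.name = name; this.id = id; }
        String getName() { return name; }
        int getId() { return id; }
    }

    static List<User> duplicatedInOrder(List<User> users) {
        // First pass: count how often each name occurs.
        Map<String, Long> counts = users.stream()
            .collect(Collectors.groupingBy(User::getName, Collectors.counting()));
        // Second pass: stream the original list, so its order is kept.
        return users.stream()
            .filter(u -> counts.get(u.getName()) > 1)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("ann", 1), new User("bob", 2),
            new User("ann", 3), new User("cy", 4));
        List<Integer> ids = duplicatedInOrder(users).stream()
            .map(User::getId)
            .collect(Collectors.toList());
        System.out.println(ids); // prints [1, 3]
    }
}
```

The trade-off is a second traversal of users in exchange for a result that follows the input order.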
Answer 3
Score: 1
I found a solution with the help of @holger:
// collect all duplicate names with O(n)
var duplicateNames = all.stream()
    // name -> number of occurrences
    .collect(Collectors.groupingBy(Strategy::getName, Collectors.counting()))
    .entrySet()
    .stream()
    .filter(m -> m.getValue() > 1)
    .map(Map.Entry::getKey)
    .collect(Collectors.toList());
Is the performance of this solution O(n^2) or O(n)?
If someone can find improvements, please share.
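On the complexity question, a rough way to reason about it is to write the counting step as a plain loop. Assuming the names hash reasonably well, each map update is expected constant time, so the whole thing is expected O(n); by contrast, calling Collections.frequency once per element rescans the list each time, which is O(n^2). The sketch below works on a plain list of names, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CountingDuplicates {
    static List<String> duplicateNames(List<String> names) {
        // One pass over the input: one expected-constant-time map update per element.
        Map<String, Long> counts = new HashMap<>();
        for (String name : names) {
            counts.merge(name, 1L, Long::sum);
        }
        // One pass over the distinct names, which is at most n entries.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The order of the result follows the HashMap and is not specified.
        System.out.println(duplicateNames(List.of("ann", "bob", "ann", "cy")));
    }
}
```

This is roughly what groupingBy with counting() does internally, although the exact implementation is a detail of the JDK and not something to rely on.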