Java Map.containsValue doesn't work after the 15000th item in the list
Question
I have two HashMaps, each holding the same 30000 words in a different order. I check each value of the second map (`search`) against the first map (`list`), but the comparison stops working after the 15000th item. I know a HashMap gives no ordering guarantee, but I do not need order: I just look up each word from `search` in `list` and remove the words that are found. If `list` contains all the words in `search`, I want to return true. Is there something I missed?

    // Sample values (the same 30000 words in a different order):
    // list:   hooef dalwm vuitg enewb xcbfy ...
    // search: dalwm xcbfy hooef enewb dalwm ...
    Map<Integer, String> list = new HashMap<Integer, String>();
    Map<Integer, String> search = new HashMap<Integer, String>();
    boolean check = true;
    for (int i = 0; i < search.size(); i++) {
        if (list.containsValue(search.get(i)))
            list.remove(i);
        else
            check = false; // when i = 15000 the code reaches here
    }
    return check; // returns false
Answer 1
Score: 2
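If the keys of the maps represent the order, then `list.remove(i)` removes whatever entry is stored under key `i`, which is effectively a random value from `list` rather than the value that was just found. That doesn't seem correct.
Here's a possible solution:

    Collection<String> values = list.values();
    for (int i = 0; i < search.size(); i++) {
        // remove(Object) on the values view deletes one matching entry from the map
        if (!values.remove(search.get(i))) {
            check = false;
        }
    }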
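The fragment above relies on `list`, `search`, and `check` from the question. A self-contained sketch of the same idea follows; the class name `ValueRemovalCheck` and the method `containsAllValues` are hypothetical, and working on a copy of `list` is an assumption added here so the caller's map is not modified.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class ValueRemovalCheck {
    // Returns true if every value of search can be matched to a distinct entry of list.
    // Assumption: the check runs on a copy, so the caller's list is left untouched.
    static boolean containsAllValues(Map<Integer, String> list, Map<Integer, String> search) {
        Collection<String> remaining = new HashMap<>(list).values();
        boolean check = true;
        for (String word : search.values()) {
            // Removing by value deletes one matching entry, whatever its key is.
            if (!remaining.remove(word)) {
                check = false;
            }
        }
        return check;
    }

    public static void main(String[] args) {
        Map<Integer, String> list = Map.of(1, "C", 2, "A", 3, "B");
        Map<Integer, String> search = Map.of(1, "A", 2, "B", 3, "C");
        System.out.println(containsAllValues(list, search));                    // true
        System.out.println(containsAllValues(list, Map.of(1, "A", 2, "A")));    // false: list has only one "A"
    }
}
```

Each `remove` on the values view is a linear scan, so this stays simple but is quadratic in the number of words.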
Answer 2
Score: 1
<details>
<summary>English:</summary>
The reason that `containsValue` appeared not to work is that you had already inadvertently deleted the value it was looking for.
Assume `search` -> 1:A, 2:B, 3:C
and `list` -> 3:B, 1:C, 2:A
The code checks whether `list` contains the value stored at key `1` in `search`, which is `A`. It does: `A` is at key `2` in `list`. But you then delete key `1` from `list`, thinking it held value `A`, when it actually held value `C`. Later, when the code checks `list` for value `C`, the lookup fails.
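The walk-through above can be reproduced with a minimal sketch. The class name `RemoveByKeyDemo` and the method `buggyCheck` are hypothetical names chosen for illustration; the loop mirrors the question's code, except that keys start at 1 to match the example data.

```java
import java.util.HashMap;
import java.util.Map;

class RemoveByKeyDemo {
    // Mirrors the loop from the question: looks a value up, but removes by key.
    static boolean buggyCheck(Map<Integer, String> list, Map<Integer, String> search) {
        boolean check = true;
        for (int i = 1; i <= search.size(); i++) {
            if (list.containsValue(search.get(i))) {
                list.remove(i); // removes whatever is stored under key i, not the value just found
            } else {
                check = false;
            }
        }
        return check;
    }

    public static void main(String[] args) {
        Map<Integer, String> search = new HashMap<>(Map.of(1, "A", 2, "B", 3, "C"));
        Map<Integer, String> list = new HashMap<>(Map.of(3, "B", 1, "C", 2, "A"));
        // Step 1 finds "A" but deletes "C"; step 2 finds "B" but deletes "A";
        // step 3 then looks for "C", which is already gone.
        System.out.println(buggyCheck(list, search)); // false
        System.out.println(list);                     // {3=B}
    }
}
```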
As I see it, you have three issues to contend with:
- You are trying to relate two maps that have different key-to-value mappings (even though they have the same set of keys and values).
- You want to search and delete based on values, not keys.
- You can have duplicate values.
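One way to handle all three points at once is to ignore the keys and compare occurrence counts of the values. This is a different technique from the code in this answer: a sketch that assumes only the values matter, with hypothetical names `WordCountCheck` and `containsAllValues`.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCountCheck {
    // Returns true if listValues holds every word of searchValues at least as often.
    static boolean containsAllValues(Collection<String> listValues, Collection<String> searchValues) {
        Map<String, Integer> available = new HashMap<>();
        for (String word : listValues) {
            available.merge(word, 1, Integer::sum); // count each occurrence
        }
        for (String word : searchValues) {
            Integer left = available.get(word);
            if (left == null || left == 0) {
                return false; // word is missing or all its copies are already used
            }
            available.put(word, left - 1);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsAllValues(List.of("a", "b", "a"), List.of("a", "a", "b"))); // true
        System.out.println(containsAllValues(List.of("a", "b"), List.of("a", "a")));           // false
    }
}
```

With the maps from the question, this could be called as `containsAllValues(list.values(), search.values())`. It runs in time linear in the number of words, since each lookup is a hash access.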
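Here is my approach. The first part builds the data structures for testing.

    // imports used by the snippets in this answer
    import java.nio.file.*;
    import java.util.*;
    import java.util.Map.Entry;
    import java.util.stream.*;

    Stream<String> stream = null;
    try {
        stream = Files.lines(Path.of("f:/linux.words"));
    } catch (Exception e) {
        e.printStackTrace();
    }
    // limit to 100_000 words
    int count = 100_000;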
Now ensure the words are in a different order.

    String[] words1 = stream.limit(count).toArray(String[]::new);
    String[] words2 = words1.clone();
    Collections.shuffle(Arrays.asList(words2)); // shuffles words2 in place through the list view
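The shuffle works because `Arrays.asList` returns a fixed-size list view backed by the array, so reordering the view reorders the array itself. A small sketch, with a hypothetical class name `ShuffleViewDemo` and a fixed seed added here only to make runs repeatable:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Random;

class ShuffleViewDemo {
    // Returns a shuffled copy of words; the input array is left unchanged.
    static String[] shuffledCopy(String[] words, long seed) {
        String[] copy = words.clone();
        // The list is a view over copy, so shuffling it writes through to the array.
        Collections.shuffle(Arrays.asList(copy), new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        String[] words1 = {"hooef", "dalwm", "vuitg", "enewb", "xcbfy"};
        String[] words2 = shuffledCopy(words1, 42L);
        System.out.println(Arrays.toString(words1)); // original order
        System.out.println(Arrays.toString(words2)); // same words, reordered
    }
}
```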
Now build two different maps, `search` and `list`.

    Map<Integer, String> list = new HashMap<>();
    Map<Integer, String> search = new HashMap<>();
    for (int i = 0; i < words1.length; i++) {
        list.put(i + 1, words1[i]);
        search.put(i + 1, words2[i]);
    }
Now create a `valueToKeyMap` that maps each value to its keys. Since values can be duplicated, the keys are collected in a `List`.

    Map<String, List<Integer>> valueToKeyMap = list.entrySet().stream()
            .collect(Collectors.groupingBy(Entry::getValue,
                    Collectors.mapping(Entry::getKey,
                            Collectors.toList())));
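To make the shape of `valueToKeyMap` concrete, here is a small sketch with a duplicated value. The class name `ValueToKeyDemo` and the method `invert` are hypothetical; the collector chain is the one used above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

class ValueToKeyDemo {
    // Groups the keys of map by the value they point to.
    static Map<String, List<Integer>> invert(Map<Integer, String> map) {
        return map.entrySet().stream()
                .collect(Collectors.groupingBy(Entry::getValue,
                        Collectors.mapping(Entry::getKey, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, String> list = new HashMap<>();
        list.put(1, "cow");
        list.put(2, "dog");
        list.put(3, "cow");
        Map<String, List<Integer>> valueToKeyMap = invert(list);
        System.out.println(valueToKeyMap.get("cow").size()); // 2, keys 1 and 3
        System.out.println(valueToKeyMap.get("dog"));        // [2]
    }
}
```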
Now iterate through the maps, removing the duplicates (the values present in both maps). The lists in `valueToKeyMap` need to be iterated, but it is expected (perhaps incorrectly) that the number of duplicates of any given string will be small (e.g. the word `cow` will occur maybe 10 times).
This appears to work fairly fast. The entire run, including file reading, takes about 1 second. Part of that speed is due to the fact that there were no duplicates, so each `List<Integer>` of keys had length 1.

    count = 0; // reused here as the number of removals performed
    int size = list.size();
    for (int i = 1; i <= size; i++) {
        String value = search.get(i);
        if (valueToKeyMap.containsKey(value)) {
            // no need to verify that list contains value; valueToKeyMap was
            // created from it.
            for (int vkm : valueToKeyMap.get(value)) {
                list.remove(vkm);
                count++;
            }
        }
    }
    System.out.println(list);
    System.out.println(list.size());
    System.out.println(count);
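For a quick end-to-end check without a word file, the same steps can be run on the three-entry example from the start of this answer. This is a sketch with hypothetical names (`RemoveByValueDemo`, `removeMatches`); unlike the loop above, it iterates over `search.values()` and counts only removals that actually deleted an entry.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

class RemoveByValueDemo {
    // Removes from list every entry whose value occurs in search; returns how many were removed.
    static int removeMatches(Map<Integer, String> list, Map<Integer, String> search) {
        Map<String, List<Integer>> valueToKeyMap = list.entrySet().stream()
                .collect(Collectors.groupingBy(Entry::getValue,
                        Collectors.mapping(Entry::getKey, Collectors.toList())));
        int removed = 0;
        for (String value : search.values()) {
            List<Integer> keys = valueToKeyMap.get(value);
            if (keys == null) {
                continue; // value does not occur in list
            }
            for (int key : keys) {
                // remove returns the old value, or null if the key was already removed
                if (list.remove(key) != null) {
                    removed++;
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<Integer, String> search = new HashMap<>(Map.of(1, "A", 2, "B", 3, "C"));
        Map<Integer, String> list = new HashMap<>(Map.of(3, "B", 1, "C", 2, "A"));
        System.out.println(removeMatches(list, search)); // 3
        System.out.println(list.isEmpty());              // true
    }
}
```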
</details>