英文:
What happens when I add duplicate element into HashSet? Is old element overwritten or not?
问题
当我想要在HashSet
中添加重复元素时,我们知道这是不允许的。所以我遇到了两种,让我们称之为理论。
在这个链接的'属性,编号2'下面,它说:
HashSet不允许重复元素。如果尝试插入重复元素,将覆盖旧元素。
但当我检查IDE中提供给我的文档,在add()
方法中,它说:
- 如果集合已经包含该元素,调用将保持集合不变并返回
false
。
所以它是覆盖(替换)旧元素还是保持集合不变并返回false
? 我是疯了还是他们完全说相反的话?
英文:
Say I want to add duplicate element in HashSet
. We know that is not allowed. So I came across 2, let's say theories.
Under 'PROPERTIES, number 2' of this link, it says:
> HashSet does not allow duplicate elements. If you try to insert a
> duplicate element, older element will be overwritten
But when I check docs that are provided to me in IDE, in method add()
, it states:
> * If this set already contains the element, the call leaves the set
> * unchanged and returns {@code false}.
So does it overwrite(replace) old element or it leaves the set unchanged and returns false? Am I crazy of they say exactly the opposite?
答案1
得分: 4
以下是翻译好的内容:
代码部分不需要翻译。
IDE显示了`Set.add`方法的官方规范摘录([javadocs][1])。
IDE是正确的。相信官方文档,而不是一些“漂亮”的网站。
-------------------------
> 但是那个网站的人怎么会犯这么大的错误呢?
永远记住,像你找到的那种网站的主要动机是从你的页面浏览中赚钱。他们往往更注重“搜索引擎优化”,而不是他们材料的质量。
------------------------
> 但问题是,这些“随机”的网站有时会对某些概念进行“更美观”的解释。作为一个初学者,官方文档有时太难理解了。
那么哪个更可取?容易阅读但错误的东西?还是准确无误的东西?
在这种情况下,你不能说官方文档难以理解。你自己能够看出从官方文档中提取的文本与第三方网站之间的矛盾。
我的建议是首先尝试阅读官方javadocs,始终相信它,胜过任何其他来源... *包括StackOverflow的回答!!* 唯一更明确的事情<sup>1</sup>是OpenJDK的源代码。
------------------
<sup>1 - 即使这也有争议。一方面,代码决定了实际发生的事情。另一方面,代码可能会在不同版本之间发生变化。因此,依赖于在javadocs中没有*指定*的行为可能会导致可移植性问题。</sup>
[1]: https://docs.oracle.com/javase/8/docs/api/java/util/Set.html#add-E-
代码部分不需要翻译。
英文:
The IDE is displaying an excerpt from the official specification (the javadocs) for the Set.add
method.
The IDE is correct. Trust the official documentation, not some "pretty" website.
> But how could people that made that site make such a big mistake then?
Always remember that the primary motivation for websites like the one you found is to earn money from your page views. They often pay more attention to "search engine optimization" than they do the quality of their material.
> But the problem is those "random" sites sometimes make 'prettier' explanation of some concepts. Official docs are sometimes too hard to follow for me as a beginner.
So which is preferable? Something that easy to read, but wrong? Or something that is accurate?
In this case, you cannot argue that the official document is hard to understand. You yourself were able to see that the contradiction between the text taken from the official documentation and the 3rd-party website.
My advice us to always try to read the Official javadocs first, and always believe it ahead of any other sources ... including StackOverflow answers!! The only thing that is more definitive<sup>1</sup> is the OpenJDK source code.
<sup>1 - And even that is debatable. On the one hand, the code determines what actually happens. On the other hand, the code may change from one version to another. So relying on behavior that is not specified in the javadocs may result in portability issues.</sup>
答案2
得分: 1
任何理论都可以通过查看源代码来进行验证。
例如,对于 JDK 8,HashSet::add
的实现如下所示:
/**
* 如果不存在,则将指定的元素添加到此集合中。
* 更正式地说,如果此集合不包含任何元素<tt>e2</tt>,使得
* <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>,
* 则将指定的元素<tt>e</tt>添加到此集合中。
* 如果此集合已包含该元素,则保持集合不变,并返回<tt>false</tt>。
*
* @param e 要添加到此集合的元素
* @return 如果此集合先前未包含指定的元素,则为<tt>true</tt>
*/
public boolean add(E e) {
return map.put(e, PRESENT) == null;
}
在这里,map
是 HashMap
的一个实例,因此我们应该查看 HashMap::put
:
/**
* 将指定的值与该映射中的指定键关联。
* 如果映射先前包含键的映射,则替换旧值。
*
* @param key 要与指定值关联的键
* @param value 要与指定键关联的值
* @return 与<tt>key</tt>关联的先前值;如果没有映射<tt>key</tt>,则为
* <tt>null</tt>。
* (返回<tt>null</tt>还可以表示先前将<tt>null</tt>与<tt>key</tt>关联。)
*/
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
而 putVal
的实现包含以下代码:
if (e != null) { // 存在键的映射
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
英文:
Any the theories can be verified by looking at the source code.
For example, for JDK 8, HashSet::add
implemented as follows:
/**
* Adds the specified element to this set if it is not already present.
* More formally, adds the specified element <tt>e</tt> to this set if
* this set contains no element <tt>e2</tt> such that
* <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
* If this set already contains the element, the call leaves the set
* unchanged and returns <tt>false</tt>.
*
* @param e element to be added to this set
* @return <tt>true</tt> if this set did not already contain the specified
* element
*/
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Here map
is an instance of HashMap
, so we should look at HashMap::put
:
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
and implementation of putVal
contains this code:
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
答案3
得分: 1
执行下面的程序,你会看到不允许重复元素:
import java.util.HashSet;
public class Test {
public static class Dummy {
String s1;
String s2;
public Dummy(String s1, String s2) {
super();
this.s1 = s1;
this.s2 = s2;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((s1 == null) ? 0 : s1.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Dummy other = (Dummy) obj;
if (s1 == null) {
if (other.s1 != null)
return false;
} else if (!s1.equals(other.s1))
return false;
return true;
}
@Override
public String toString() {
return "Dummy [s1=" + s1 + ", s2=" + s2 + "]";
}
}
public static void main(String[] args) {
HashSet<Object> hashSet = new HashSet<>();
Object o1 = new Dummy("a", "b");
Object o2 = new Dummy("a", "c");
System.out.println(o1.equals(o2));
hashSet.add(o1);
System.out.println(hashSet);
hashSet.add(o2);
System.out.println(hashSet);
}
}
英文:
Execute the program below and you'll see the duplicate elements are not allowed:
import java.util.HashSet;
public class Test {
public static class Dummy{
String s1;
String s2;
public Dummy(String s1, String s2) {
super();
this.s1 = s1;
this.s2 = s2;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((s1 == null) ? 0 : s1.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Dummy other = (Dummy) obj;
if (s1 == null) {
if (other.s1 != null)
return false;
} else if (!s1.equals(other.s1))
return false;
return true;
}
@Override
public String toString() {
return "Dummy [s1=" + s1 + ", s2=" + s2 + "]";
}
}
public static void main(String[] args) {
HashSet<Object>hashSet =new HashSet<>();
Object o1 = new Dummy("a","b");
Object o2 = new Dummy("a","c");
System.out.println(o1.equals(o2));
hashSet.add(o1);
System.out.println(hashSet);
hashSet.add(o2);
System.out.println(hashSet);
}
}
答案4
得分: 1
IDE显示“未更改”,这包括两种情况。
1. 覆盖旧元素,因此保持“未更改”。
2. 不进行修改,因此保持“未更改”。
要确定哪种情况是正确的,我们需要在`HashSet`的`add`方法的源代码中寻找答案。
首先,`add`方法调用`HashMap`中的`put`方法:
```java
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
然后,put
方法调用putVal
方法:
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1)
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) {
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
return null;
}
关键部分是:
if (e != null) {
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
return oldValue;
}
首先,代码将检查找到的匹配节点e
是否为null。如果不为null
,则意味着哈希表中已经存在一个键与新插入节点的键相同的节点,需要更新该节点的值。
然后,代码将同一节点的旧值存储在oldValue
变量中。
**接下来,代码检查是否需要更新节点的值。如果onlyIfAbsent
为false或旧值为null,则意味着需要更新节点的值。**由于put
方法将onlyIfAbsent
参数传递为false,因此新值将在此处覆盖旧值,这证明了第一种情况是正确的(覆盖旧元素,因此保持“未更改”)。
<details>
<summary>英文:</summary>
The IDE says "unchanged", which includes two cases.
1. Overwrite the old element, so keep "unchanged".
2. Do not modify, so keep "unchanged".
To determine which case is correct, we need to look for the answer in the source code of the `add` method in `HashSet`.
First, the `add` method calls the `put` method in `HashMap`:
```java
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Then, the put
method calls the putVal
method:
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1)
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) {
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
return null;
}
The key part is:
if (e != null) {
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
return oldValue;
}
Firstly, the code will check whether the found matching node e
is null. If it is not null
, it means that there is already a node in the hash table whose key is the same as the key of the newly inserted node, and the value of this node needs to be updated.
Then, the code stores the old value of the same node in the oldValue
variable.
Next, the code checks whether it needs to update the value of the node. If onlyIfAbsent
is false or the old value is null, it means that the value of the node needs to be updated. Since the put
method passes the onlyIfAbsent
parameter as false, it is here that a new value overwrites the old value, which proves that the first case is correct (overwrite the old element, so keep "unchanged").
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论