Will my object creation fail with improper hashcode-equals implementation?
Question
I want to create a Customer class whose instances are uniquely identified by a customer number.
I wrote the code below:

    import java.util.HashSet;
    import java.util.Set;

    public class Customer {
        private Integer customerNo;
        private String customerName;

        public Customer(Integer customerNo, String customerName) {
            this.customerNo = customerNo;
            this.customerName = customerName;
        }

        // The hash is based on the customer number only.
        @Override
        public int hashCode() {
            return this.customerNo;
        }

        public Integer getCustomerNo() {
            return this.customerNo;
        }

        public String getCustomerName() {
            return this.customerName;
        }

        // Equality is based on both the customer number and the customer name.
        @Override
        public boolean equals(Object o) {
            Customer cus = (Customer) o;
            return (this.customerNo == cus.getCustomerNo() && this.customerName != null && this.customerName.equals(cus.getCustomerName()));
        }

        @Override
        public String toString() {
            StringBuffer strb = new StringBuffer();
            strb.append("Customer No ")
                .append(this.customerNo)
                .append(", Customer Name ")
                .append(this.customerName)
                .append("\n");
            return strb.toString();
        }

        public static void main(String[] args) {
            Set<Customer> set = null;
            try {
                set = new HashSet<Customer>();
                set.add(new Customer(1, "Jack"));
                set.add(new Customer(3, "Will"));
                set.add(new Customer(1, "Tom"));
                set.add(new Customer(3, "Fill"));
                System.out.println("Size " + set.size());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

As the code shows, hashCode returns the customer number, while equals compares both the customer number and the customer name.
If I run the code, the output is:

    D:\Java_Projects>java Customer
    Size 4
    D:\Java_Projects>

So the set ends up holding 4 objects even though only two distinct customer numbers are used. The customer numbers repeat, but the names differ, and my equals implementation compares both customerNo and customerName. There are 4 different customerNo/customerName combinations, hence 4 objects in the set.
My questions are:
- Is my above hashCode implementation a bad practice?
- What failures can I run into?
- What happens if I create 500,000 Customer objects with the same customer number? Will all 500,000 objects be placed in the same bucket?
Answer 1
Score: 0
There is an implicit contract between equals(...) and hashCode():
> The general contract of hashCode is:
>
> - Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
>
> - If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
>
> - It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

Your implementation satisfies all three constraints. However, best practice is that every attribute compared in equals(...) should also influence hashCode(), and vice versa. Otherwise, data structures that rely on hashCode() (e.g. HashMap and HashSet) can perform sub-optimally. One reason, as you mentioned, is that all objects with the same hashCode are placed in the same bucket, so access may no longer have constant time complexity.
This will not, however, result in exceptions being thrown.
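To make the bucket argument concrete, here is a minimal sketch that compares the two hashing strategies on the four customers from the question. The names HashSpreadDemo, hashByNumberOnly, hashByAllFields and countDistinct are hypothetical and exist only for illustration; Objects.hash is one common way to combine fields, not the only one.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashSpreadDemo {
    // Mirrors the question's hashCode: only the customer number contributes.
    static int hashByNumberOnly(Integer customerNo, String customerName) {
        return Objects.hashCode(customerNo);
    }

    // Mixes in every field that equals compares.
    static int hashByAllFields(Integer customerNo, String customerName) {
        return Objects.hash(customerNo, customerName);
    }

    // Counts how many distinct hash values the four sample customers produce.
    static int countDistinct(boolean allFields) {
        Integer[] numbers = {1, 3, 1, 3};
        String[] names = {"Jack", "Will", "Tom", "Fill"};
        Set<Integer> hashes = new HashSet<>();
        for (int i = 0; i < numbers.length; i++) {
            hashes.add(allFields
                    ? hashByAllFields(numbers[i], names[i])
                    : hashByNumberOnly(numbers[i], names[i]));
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        System.out.println("number only: " + countDistinct(false)); // 2 distinct hashes
        System.out.println("all fields:  " + countDistinct(true));  // 4 distinct hashes
    }
}
```

Distinct hash values do not guarantee distinct buckets, but identical hash values always end up in the same one, so the second strategy gives the hash table more room to spread unequal objects.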
Answer 2
Score: 0
> Is my above hashCode implementation a bad practice?

Assuming different customers have different customerNo values most of the time, this is a good implementation. In a real-world application, customerNo would most likely be a unique identifier, with uniqueness guaranteed by a database constraint.

> What failures can I run into?

You haven't handled the case where customerNo is null: hashCode() unboxes the Integer and would throw a NullPointerException. Here's one way to handle that:

    public int hashCode() {
        return Objects.hashCode(customerNo);
    }

This returns 0 when customerNo is null. (Objects.hash(customerNo) is also null-safe, but it returns 31 rather than 0 for a null value, because it hashes a one-element array.)
You have another bug in the equals method: Integer objects should not be compared with ==, because that compares object references rather than values and gives unexpected results (only boxed values from -128 to 127 are guaranteed to share one instance). Also, two customers with customerName set to null are never equal. The Objects.equals method solves both problems:

    return Objects.equals(this.customerNo, cus.customerNo)
        && Objects.equals(this.customerName, cus.customerName);

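Putting both fixes together, a null-safe equals/hashCode pair might look like the sketch below. The class name SafeCustomer is hypothetical, and the instanceof check is an addition that is not discussed above: it makes equals return false for null or for objects of another type instead of throwing. Hashing only customerNo is kept from the question; it stays consistent with equals because equal objects always share a customerNo.

```java
import java.util.Objects;

final class SafeCustomer {
    private final Integer customerNo;
    private final String customerName;

    SafeCustomer(Integer customerNo, String customerName) {
        this.customerNo = customerNo;
        this.customerName = customerName;
    }

    @Override
    public int hashCode() {
        // Null-safe: yields 0 for a null customerNo instead of throwing.
        return Objects.hashCode(customerNo);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Also rejects null, since "null instanceof X" is always false.
        if (!(o instanceof SafeCustomer)) return false;
        SafeCustomer other = (SafeCustomer) o;
        // Value comparison that tolerates nulls on either side.
        return Objects.equals(customerNo, other.customerNo)
            && Objects.equals(customerName, other.customerName);
    }

    public static void main(String[] args) {
        // 1000 lies outside the guaranteed Integer cache range,
        // so == on the boxed values would be unreliable here.
        SafeCustomer a = new SafeCustomer(1000, "Jack");
        SafeCustomer b = new SafeCustomer(1000, "Jack");
        System.out.println(a.equals(b));                        // true
        System.out.println(new SafeCustomer(null, null)
                .equals(new SafeCustomer(null, null)));         // true
        System.out.println(a.equals("Jack"));                   // false
    }
}
```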
> What happens if I create 500,000 Customer objects with the same customer number?
> Will all 500,000 objects be placed in the same bucket?

In this scenario, all objects will indeed be placed in the same bucket. Your HashSet effectively degrades to a linked list and performs poorly: to locate a customer object, the data structure has to compare the given object with every stored object in the worst case.
If Customer implemented Comparable, the hash table bucket could use a binary search tree instead of a linked list (HashMap, which backs HashSet, does this for large buckets since Java 8), and the performance would not be impacted as badly.
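As a rough sketch of the Comparable idea, the class below deliberately gives every instance the same hash code and orders instances by number and then by name. The class name CollidingCustomer, the fill helper and the element count are assumptions chosen for illustration, and names are assumed to be non-null. The HashMap documentation only says that it may use comparison order to break ties between keys with the same hash code, so the exact speed-up depends on the JDK in use.

```java
import java.util.HashSet;
import java.util.Set;

final class CollidingCustomer implements Comparable<CollidingCustomer> {
    private final int customerNo;
    private final String customerName; // assumed non-null in this sketch

    CollidingCustomer(int customerNo, String customerName) {
        this.customerNo = customerNo;
        this.customerName = customerName;
    }

    @Override
    public int hashCode() {
        return customerNo; // identical for every instance built by fill()
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CollidingCustomer)) return false;
        CollidingCustomer other = (CollidingCustomer) o;
        return customerNo == other.customerNo
            && customerName.equals(other.customerName);
    }

    // Consistent with equals: returns 0 exactly when number and name match.
    @Override
    public int compareTo(CollidingCustomer other) {
        int byNo = Integer.compare(customerNo, other.customerNo);
        return byNo != 0 ? byNo : customerName.compareTo(other.customerName);
    }

    // Builds a set of `count` customers that all share customer number 1.
    static Set<CollidingCustomer> fill(int count) {
        Set<CollidingCustomer> set = new HashSet<>();
        for (int i = 0; i < count; i++) {
            set.add(new CollidingCustomer(1, "name-" + i));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<CollidingCustomer> set = fill(50_000);
        System.out.println("Size " + set.size());
        System.out.println(set.contains(new CollidingCustomer(1, "name-49999")));
    }
}
```

Every element still lands in one bucket, but the ordering from compareTo lets a tree-based bucket narrow down a lookup instead of scanning all colliding entries.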