Why does Object provide equals and hash code methods?

Question

Many object-oriented programming languages where something like Object is the root of the type hierarchy provide some default equality and hash code methods; for example:

  • C# System.Object: Equals and GetHashCode
  • Java java.lang.Object: equals and hashCode

It is valuable to override these methods for objects that exhibit value semantics (for example, Boolean, Integer and Guid), but it is uncommon to override them for objects that exhibit behavior semantics (for example, services and controllers). These methods also serve as the mechanism for detecting duplicates on insertion into Set<T> and Map<K, V>.
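
To make the duplicate-detection point concrete, here is a minimal Java sketch. The Point, Service and DuplicateDemo classes are hypothetical names introduced only for illustration; the sketch relies on the documented behavior that HashSet uses hashCode to locate a bucket and equals to confirm a match.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value type: two points with the same coordinates count as equal.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal points produce equal hash codes.
        return Objects.hash(x, y);
    }
}

// Hypothetical behavior type: inherits identity-based equals/hashCode from Object.
final class Service { }

final class DuplicateDemo {
    static int distinctPoints() {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        points.add(new Point(1, 2)); // equal by value, so rejected as a duplicate
        return points.size();
    }

    static int distinctServices() {
        Set<Service> services = new HashSet<>();
        services.add(new Service());
        services.add(new Service()); // a different instance, so both are kept
        return services.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPoints());   // prints 1
        System.out.println(distinctServices()); // prints 2
    }
}
```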

Let's assume that Object does not contain these methods, and instead they are provided by interfaces; for example:

interface Equatable<T> {
    Boolean equals(T other)
}

interface Hashable {
    Integer getHashCode()
}

Now, objects that exhibit behavior semantics don't implement the methods because they're not required, but objects that exhibit value semantics can implement them; for example:

class Guid : Equatable<Guid>, Hashable {
    public Boolean equals(Guid other) {
        ...
    }

    public Integer getHashCode() {
        ...
    }
}

Also Set<T> and Map<K, V> could declare constraints on T and K, such that objects should implement Equatable<T> and Hashable; for example:

interface Set<T> where T extends Equatable<T>, Hashable {
    ...
}

interface Map<K, V> where K extends Equatable<K>, Hashable {
    ...
}
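
For comparison, here is a rough Java rendering of this proposal, offered as a sketch rather than a real library design. Java already gives every class equals(Object) and hashCode(), so the method names isEqualTo and hash, the ValueSet class, and the two-long Guid are all hypothetical stand-ins chosen to keep the example independent of Object. The intersection bound `T extends Equatable<T> & Hashable` plays the role of the `where` constraint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical opt-in interfaces, mirroring the pseudocode above.
interface Equatable<T> {
    boolean isEqualTo(T other);
}

interface Hashable {
    int hash();
}

// Hypothetical value type that opts in to both capabilities.
final class Guid implements Equatable<Guid>, Hashable {
    private final long high;
    private final long low;

    Guid(long high, long low) {
        this.high = high;
        this.low = low;
    }

    @Override
    public boolean isEqualTo(Guid other) {
        return other != null && high == other.high && low == other.low;
    }

    @Override
    public int hash() {
        return 31 * Long.hashCode(high) + Long.hashCode(low);
    }
}

// A deliberately simple fixed-size bucketed set that relies only on the
// two interfaces, never on Object.equals or Object.hashCode.
final class ValueSet<T extends Equatable<T> & Hashable> {
    private final List<List<T>> buckets = new ArrayList<>();
    private int size;

    ValueSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Returns false when an equal element is already present.
    boolean add(T item) {
        List<T> bucket = buckets.get(Math.floorMod(item.hash(), buckets.size()));
        for (T existing : bucket) {
            if (existing.isEqualTo(item)) {
                return false;
            }
        }
        bucket.add(item);
        size++;
        return true;
    }

    int size() {
        return size;
    }
}

final class ValueSetDemo {
    public static void main(String[] args) {
        ValueSet<Guid> ids = new ValueSet<>(16);
        System.out.println(ids.add(new Guid(1L, 2L))); // prints true
        System.out.println(ids.add(new Guid(1L, 2L))); // prints false
        System.out.println(ids.size());                // prints 1
    }
}
```

With this bound, ValueSet<Guid> compiles, while a type that implements neither interface is rejected at compile time, which is the behavior the proposal is after.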

Not all objects require these methods to be overridden (namely, those that exhibit behavior semantics), so it seems plausible that equals and hash code methods did not need to be declared at the root of the type hierarchy. Why, then, are they defined at the root, rather than being something that types implement as required?

Is there some other underlying or fundamental reason that these methods are implemented at the root of the type hierarchy, and are there any similar languages where this isn't the case?

Answer 1

Score: 5

There is no fundamental reason they have to be implemented at the root, for all the reasons you mention. This answer is specifically about C#, but I would guess many of the arguments apply to Java as well.

We can only speculate why the language designers used this model, but it was likely influenced by the lack of generics in the first version of C#. Influence from Java, which also lacked generics at the time, may have played a part as well. Note that in C# your proposed interface does exist, in the form of IEquatable<T>.

However, if I understand your argument correctly, HashSet would be constrained to IEquatable<T>, which would only be implemented by types using value equality. HashSet therefore could not be used with types using reference equality, and this seems like an obvious problem: dictionaries and hash sets are still very useful for objects using reference equality.
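
The same point can be sketched in Java, where HashSet falls back to the identity-based equals and hashCode inherited from Object. The Connection and ConnectionPool classes below are hypothetical, chosen only to show a hash set tracking objects by reference.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical behavior type: no equals/hashCode override, so two
// connections to the same host are still two different objects.
final class Connection {
    private final String host;

    Connection(String host) {
        this.host = host;
    }

    String host() {
        return host;
    }
}

// Tracks which connection instances are currently checked out.
final class ConnectionPool {
    private final Set<Connection> inUse = new HashSet<>();

    // Returns false if this exact instance is already checked out.
    boolean acquire(Connection connection) {
        return inUse.add(connection);
    }

    // Returns false if this exact instance was not checked out.
    boolean release(Connection connection) {
        return inUse.remove(connection);
    }

    int inUseCount() {
        return inUse.size();
    }
}

final class ConnectionPoolDemo {
    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        Connection first = new Connection("db.example");
        Connection second = new Connection("db.example");

        System.out.println(pool.acquire(first));  // prints true
        System.out.println(pool.acquire(second)); // prints true (different instance)
        System.out.println(pool.acquire(first));  // prints false (same instance)
        System.out.println(pool.inUseCount());    // prints 2
    }
}
```

Under the constrained design from the question, a set like this would need either a separate identity-based collection or some default implementation of the interfaces for every reference type.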

That is not to say I think the equality system, as it is now, is well designed. To compare two objects you may use

  1. .Equals(object obj)
  2. IEquatable<T>.Equals(T obj)
  3. Equality operators, == / != (each of which requires explicit implementation)
  4. IEqualityComparer<T>.Equals(T a, T b)

This is confusing for new developers and tends to lead to a significant amount of boilerplate code. This, like many other things, is partly due to the language's evolution: new features have to be carefully designed to work with existing code. If C# were rewritten from scratch you could almost certainly make improvements, and that would likely include removing or redesigning .Equals(object obj). But that is not really feasible, since it would break decades of existing code.

If you are designing a new language, please do consider ways to handle equality better. But I would expect this to require some major design work to make it as easy to understand and use as possible while still allowing for the necessary flexibility. Designing languages is not easy! Looking around at other languages, several have problems in the equality area; C# is at least better than JavaScript.

As for similar languages, both Java and C# were in some sense intended as an easier-to-use C++, and C++ simply avoids the problem by not having a common root type at all. But there are some usability downsides to that approach.

huangapple
  • Posted on April 19, 2023 at 16:10:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76052122.html