Why do some comparable classes in the JDK limit the comparison function to {−1, 0, 1} and some don't?

Question

The only return value the java.lang.Comparable<T> interface explicitly requires is 0 for when T a and T b are equal. If a is less than b, then a.compareTo(b) must be negative, not necessarily −1, and b.compareTo(a) must be positive, not necessarily 1.

And yet some comparable classes in the JDK limit the output of the comparison function precisely in that manner, e.g.,

scala> (65 to 90).map(n => java.lang.Integer.compare(77, n))
res19: IndexedSeq[Int] = Vector(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)

and some don't, e.g.,

scala> ('A' to 'Z').map(ch => java.lang.Character.compare('M', ch))
res10: IndexedSeq[Int] = Vector(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 
-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -13)

I was well aware of the somewhat obscure case of RDNs (from javax.naming.ldap), but I wasn't expecting to come across this with anything from java.lang. I first noticed unrestricted output in Character.compare() in the context of a Caesar cypher program in Java, but I find it easier to run "experiments" like these in the local Scala REPL.

When I wrote my implementation of Fraction, I followed the example of Integer rather than Character.

scala> val fractA = new fractions.Fraction(65, 128)
fractA: fractions.Fraction = 65/128

scala> val fractB = new fractions.Fraction(90, 128)
fractB: fractions.Fraction = 45/64

scala> fractA to fractB
res20: fractions.FractionRange = 65/128 to 45/64

scala> val fractC = new fractions.Fraction(77, 128)
fractC: fractions.Fraction = 77/128

scala> res20.map(_.compare(fractC))
res21: IndexedSeq[Int] = Vector(-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

scala> res20.map(fractC.compare)
res22: IndexedSeq[Int] = Vector(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)

scala> res19 == res22
res23: Boolean = true

In the case of Fraction, it's no problem to implement it like this. Actually, it would be more trouble to not do it like this. The numerator and denominator are of type long, so there could be problems if the numerator of the difference is just a little outside the range of int. By using Long.signum(), I reduce the possibility of wrong results to a small set of edge cases.

Since char is only 16 bits wide, half the width of int, the difference of two char values always fits in an int, so I suppose it's easier for String to not restrict the result of compare() to {−1, 0, 1}.

scala> "Hello, World!" compare "Hello, world?"
res30: Int = -32

Here I'm guessing that if no surrogates are involved, or maybe even if surrogates are involved, it's easier to just run Character.compare() on each character of the String until either the first nonzero result or reaching the end.
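
To make that guess concrete, here is a minimal Java sketch of such a character-by-character loop. The class and method names (LexicographicSketch, compareCharByChar) are hypothetical and this is not the JDK source. It only follows the documented contract of String.compareTo(): the difference of the first two differing char values, or else the difference in length. It works on UTF-16 code units, so surrogates get no special treatment.

```java
final class LexicographicSketch {
    // Hypothetical helper, not the JDK source: compares two strings one UTF-16
    // code unit at a time, following the documented contract of String.compareTo().
    static int compareCharByChar(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            int diff = Character.compare(a.charAt(i), b.charAt(i));
            if (diff != 0) {
                return diff; // the first differing position decides
            }
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        System.out.println(compareCharByChar("Hello, World!", "Hello, world?")); // -32
        System.out.println("Hello, World!".compareTo("Hello, world?"));          // -32
    }
}
```

The -32 is simply 'W' (87) minus 'w' (119), the first position where the two strings differ.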

Is this the explanation, that one should do whatever is easiest and gives the correct results? Or is there a deeper reason to restrict to signum of the difference in some comparable classes and not others?

Answer 1

Score: 3

Main answer

Because implementations are free to choose whatever positive or negative value suits them best. The spec for Comparable.compareTo() says this about the return value:

> a negative integer, zero, or a positive integer as this object is less
> than, equal to, or greater than the specified object.

One might argue whether this spec was a wise decision or not, but that's the way it has been defined. So never rely on the result of a comparison being only -1, 0, or 1, even if experimenting with one specific Java version shows that behaviour: it might change with the next release.
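
In code, that advice comes down to testing the sign of the result rather than its exact value. A minimal sketch follows; the class and method names (SignOnly, isLess, normalized) are made up for illustration. If a fixed set of values really is needed, Integer.signum() reduces any result to -1, 0 or 1.

```java
final class SignOnly {
    // Test the sign of the result, never its exact value.
    static <T extends Comparable<? super T>> boolean isLess(T a, T b) {
        return a.compareTo(b) < 0; // not: == -1
    }

    // If a fixed set of values really is needed, reduce the result to -1, 0 or 1.
    static <T extends Comparable<? super T>> int normalized(T a, T b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        System.out.println(isLess("Hello, World!", "Hello, world?"));     // true
        System.out.println(normalized("Hello, World!", "Hello, world?")); // -1
    }
}
```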

Implementation answer

Most of this answer is already in your question.

There are two typical ways of implementing comparison:

  • Integer subtraction: this gives results not limited to -1, 0, 1. The code is simple, elegant and fast, but it can overflow. For example, 2_000_000_000 - (-2_000_000_000) is mathematically 4_000_000_000, but in 32-bit int arithmetic the result wraps to -294_967_296, falsely implying that 2_000_000_000 is less than -2_000_000_000. Subtraction is only safe when the difference is guaranteed to fit in an int, e.g. when both values lie within roughly +/- 1_000_000_000.
  • Decision: this typically needs an if ... else if ... else construct where the return values for the three cases are explicitly given. Then it's a natural choice to use -1, 0, and 1, and I don't know of an implementation using other fixed values.
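
The two styles can be put side by side in a short Java sketch. The class and method names (CompareStyles, bySubtraction, byDecision) are hypothetical; the point is only to reproduce the overflow case described in the first bullet.

```java
final class CompareStyles {
    // Style 1: subtraction. Short and fast, but wraps around on overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Style 2: decision. Never overflows and naturally yields -1, 0 or 1.
    static int byDecision(int a, int b) {
        if (a < b) {
            return -1;
        } else if (a > b) {
            return 1;
        } else {
            return 0;
        }
    }

    public static void main(String[] args) {
        int big = 2_000_000_000;
        int small = -2_000_000_000;
        System.out.println(bySubtraction(big, small)); // -294967296, wrong sign
        System.out.println(byDecision(big, small));    // 1
        System.out.println(bySubtraction(77, 65));     // 12, fine for small values
    }
}
```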

So subtraction is a valid solution for byte and char, where an int-based subtraction has enough reserve bits that overflow can't happen. That makes it more likely for those data types and their derivatives to show values outside of -1, 0, and 1.
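
A small Java sketch of why the narrow types are safe, with hypothetical names (NarrowTypes, compareChars, compareBytes): both operands are promoted to int before the subtraction, so even the extreme differences stay far inside the int range.

```java
final class NarrowTypes {
    // char is unsigned 16-bit (0..65535): differences lie in -65535..65535.
    static int compareChars(char a, char b) {
        return a - b; // both operands are promoted to int first
    }

    // byte is signed 8-bit (-128..127): differences lie in -255..255.
    static int compareBytes(byte a, byte b) {
        return a - b;
    }

    public static void main(String[] args) {
        System.out.println(compareChars('M', 'A'));                                  // 12
        System.out.println(compareChars(Character.MAX_VALUE, Character.MIN_VALUE)); // 65535
        System.out.println(compareBytes(Byte.MIN_VALUE, Byte.MAX_VALUE));           // -255
    }
}
```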

Fraction class

You're writing about a Fraction class you're implementing. If two Fraction instances can be created without throwing an exception, I'd require the compareTo() method to always give correct results. As comparison of fractions is a tricky thing, overflow of intermediate results is to be expected. So I'd recommend creating some test cases with numerators and/or denominators close to the validity limits (whatever you define them to be).

Another approach to avoid overflow would be switching to the unlimited-range BigInteger type, but that might have a performance impact.
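
As a rough illustration of both points, here is a Java sketch that compares two fractions by cross-multiplication, once with plain long arithmetic and once with BigInteger. Everything in it is an assumption for the sake of the example: your Fraction class is not shown, so the sketch takes raw numerators and denominators, assumes positive denominators, and uses made-up names (FractionCompareSketch, naiveCompare, safeCompare).

```java
import java.math.BigInteger;

final class FractionCompareSketch {
    // Naive version: the long products can overflow and flip the result.
    // Assumes d1 > 0 and d2 > 0.
    static int naiveCompare(long n1, long d1, long n2, long d2) {
        return Long.compare(n1 * d2, n2 * d1);
    }

    // BigInteger version: the products cannot overflow, at some cost in speed.
    // Assumes d1 > 0 and d2 > 0.
    static int safeCompare(long n1, long d1, long n2, long d2) {
        BigInteger left = BigInteger.valueOf(n1).multiply(BigInteger.valueOf(d2));
        BigInteger right = BigInteger.valueOf(n2).multiply(BigInteger.valueOf(d1));
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        // Edge case near the limits: Long.MAX_VALUE/1 is clearly greater than 1/2.
        long max = Long.MAX_VALUE;
        System.out.println(naiveCompare(max, 1, 1, 2) < 0); // true, which is wrong
        System.out.println(safeCompare(max, 1, 1, 2) > 0);  // true, which is right
    }
}
```

A test suite along these lines would pin down exactly which inputs the long-based version still gets wrong.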

huangapple
  • Posted on September 22, 2020, 13:29:28
  • Please keep this link when reposting: https://go.coder-hub.com/64003572.html