Python在准确处理大数时是否失败?

huangapple go评论71阅读模式
英文:

Does python fail in handling large numbers accurately?

问题

我已经编写了Python代码来计算以下表达式:

Python在准确处理大数时是否失败?

其中C_n是第n个卡塔兰数,定义为C_n = (2nCn) / (n+1)

from math import comb
import sys
from decimal import Decimal

def factorial(n):
    fact = 1
    for i in range(1, n+1):
        fact = fact * i
    return fact

sum = 0

n = int(sys.stdin.readline())

for i in range(n+1):
    sum += int((((factorial(2 * i))/(factorial(i) * factorial(i)))/(i+1)) * (((factorial(2 * (n - i)))/(factorial(n - i) * factorial(n - i)))/(n - i + 1)))

print(sum)

输入59返回的结果是:1583850964596120056618874138787840,而期望的输出是:1583850964596120042686772779038896。为什么会这样?

英文:

I have made python code to calculate the following expression:

Python在准确处理大数时是否失败?

With C_n beeing the n:th catalan number defined by C_n = (2nCn) / (n+1)

from math import comb
import sys
from decimal import Decimal

def factorial(n):
    fact = 1
    for i in range(1, n+1):
        fact = fact * i
    return fact

sum = 0

n = int(sys.stdin.readline())

for i in range(n+1):
    sum += int((((factorial(2 * i))/(factorial(i) * factorial(i)))/(i+1)) * (((factorial(2 * (n - i)))/(factorial(n - i) * factorial(n - i)))/(n - i + 1)))

print(sum)

The input 59 returns: 1583850964596120056618874138787840 when the expected output is: 1583850964596120042686772779038896

Why is this?

答案1

得分: 2

因为你使用了 /,这导致了一个 float,它具有固定的精度,不足以准确表示该数字。只需使用 // 代替,这样你将保持在 int 值中,它们具有任意精度。

正如Mike指出的,你的Sn实际上是Cn+1,所以你可以更高效地这样做:

n += 1
print(comb(2*n, n) // (n+1))
英文:

It's because you use /, which results in a float, which has fixed precision, not enough to represent the number accurately. Simply use // instead, then you stay in int values, which have arbitrary precision.

As Mike points out, your S<sub>n</sub> is actually C<sub>n+1</sub>, so you can instead much more efficiently just do this:

n += 1
print(comb(2*n, n) // (n+1))

答案2

得分: 1

请注意,如果可能的话,不要计算阶乘,如果你正在实现数学计算,而存在直接公式可以使用的情况下,不要计算递归关系(虽然不总是可能的,但大多数情况下是可以的)

卡特兰数有直接的公式:Cn = 1/(n+1) * 2n 选择 n,尽管在“纸上”你可以使用阶乘除以阶乘来表示2n选择n,但在计算机中,你不希望进行这种替代,因为阶乘是非常巨大的数字,如果我们想要一个二项式系数,实际上并不需要它们:我们可以使用帕斯卡三角形或者更简单地使用预先构建的二项式函数,如scipy.special.binom函数。然而,在非常大的数字上,scipy(具有讽刺意味)变得不准确,所以让我们自己写一个:

binomials = [[1]]

def binom(n, k):
  while n >= len(binomials):
      s = len(binomials)
      next = [0] * (s+1) 
      next[0] = 1
      for i in range(1, s):
          next[i] = binomials
展开收缩
[i-1] + binomials
展开收缩
[i]
next
展开收缩
= 1;
binomials.append(next) return binomials[n][k]

这样,所有的都是整数数学运算,Python具有无限精度整数,只要我们有足够的内存,这个binom函数将始终给出正确的答案。

有了这个,让我们计算一些东西:

def Cn(n):
    return binom(2*n, n) / (n+1)

def calcSum(n):
    sum = 0
    for k in range(0, n+1):
      sum += Cn(k) * Cn(n-k)
    return "{:f}".format(sum)

然后测试 calcSum(59)

>>> calcSum(59)
'1583850964596120221000260537810944.000000'
>>>

现在,这当然不是正确的答案(正确答案应该是1583850964596120042686772779038896,所以我们偏差了10^18倍),但它是“为固定精度IEEE浮点数计算正确答案”的答案,没有任何阶乘问题。

当然,我们可以添加Decimal

from decimal import Decimal

def Cn(n):
    return Decimal(binom(2*n, n)) / Decimal((n+1))

def calcSum(n):
    sum = Decimal(0)
    for k in range(0, n+1):
      sum += Cn(k) * Cn(n-k)
    return sum

这给我们 Decimal('1.583850964596120042686772779E+33'),这显然是一个错误的答案(如果我们不使用科学计数法打印它,它会变成 1583850964596120042686772779000000),而E+33告诉我们,我们忘记了将小数精度设置为至少与答案所需的位数相同。所以,让我们只使用100位精度:

from decimal import Decimal, getcontext
getcontext().prec = 100

现在答案是 Decimal('1583850964596120042686772779038896')

完成:这是正确的答案。

所以你的代码有两个问题。首先:在编程时不要使用阶乘,使用你实际需要的东西。在这种情况下,是二项式系数。其次:使用Python提供的工具来获取所需的精度。你导入了Decimal,但没有使用它,但即使你使用了它,你的代码也缺少将小数精度设置为获得正确答案所需的数字的指令。

这里最重要的教训是不要盲目地实现数学公式,因为我们手工计算出来的符号数学经过几个世纪的优化,以便易于“我知道很多替代规则,经过多年的研究,我可以理解”的人类大脑。因此,除非你在一个可以执行相同符号替代工作的编程语言中工作(例如Mathematica),否则你将不得不像计算机一样思考。如果你有一个递归关系,找到非递归等效的函数并实现它。如果你看到阶乘,问自己它是用来做什么的,然后看看是否有一种干净和高效的方法来计算那个

英文:

Note that you should never compute factorials if you can help it, and if you're implementing maths, don't compute a recurrence relation if there is a direct formula that you can use. (That's not always possible, but the majority of the time it is)

Catalan numbers have the direct formula Cn = 1/(n+1) * 2n choose n, and while "on paper" you can write 2n choose n using factorials divided by factorials, in computer land you don't want to make that substitution because factorials are famously huge numbers, and if we want a binomial coefficient we don't actually need them: we can near-trivially compute any binomial coefficient with Pascal's triangle, or even more trivially by using a prebuilt binomial function such as the scipy.special.binom function. However, at very large numbers scipy (ironically) becomes imprecise, so let's just write our own:

binomials = [[1]]

def binom(n,k):
  while n &gt;= len(binomials):
      s = len(binomials)
      next = [0] * (s+1) 
      next[0] = 1
      for i in range(1,s):
          next[i] = binomials
展开收缩
[i-1] + binomials
展开收缩
[i]
next
展开收缩
= 1;
binomials.append(next) return binomials[n][k]

There, this is all pure integer math, and python has infinite precision ints, so as long as we have enough memory this binom function will always give us the correct answer.

with that, let's compute things:

def Cn(n):
    return binom(2*n, n) / (n+1)

def calcSum(n):
    sum = 0
    for k in range(0, n+1):
      sum += Cn(k) * Cn(n-k)
    return &quot;{:f}&quot;.format(sum)

And let's test calcSum(59):

&gt;&gt;&gt; calcSum(59)
&#39;1583850964596120221000260537810944.000000&#39;
&gt;&gt;&gt;

Now, that's not the right answer of course (that would be 1583850964596120042686772779038896, so we're off by a factor 10^18), but it is "the right answer for fixed precision IEEE floating point numbers to yield" without any sort of factorial problems.

Of course, we could add in Decimal:

from decimal import Decimal

def Cn(n):
    return Decimal(binom(2*n, n)) / Decimal((n+1))

def calcSum(n):
    sum = Decimal(0)
    for k in range(0, n+1):
      sum += Cn(k) * Cn(n-k)
    return sum

This gives us Decimal(&#39;1.583850964596120042686772779E+33&#39;) which is clearly a bad answer (if we print it without scientific notation it turns into 1583850964596120042686772779000000) and that E+33 tells us that we forgot to set the decimal precision to something that has at least as many digits as the answer needs. So, let's just use 100 digits of precision:

from decimal import Decimal, getcontext
getcontext().prec = 100

And now the answer is Decimal(&#39;1583850964596120042686772779038896&#39;).

Done: that is the correct answer.

So your code has two problems. First: don't use factorials when you're programming, use the thing you actually need. In this case, binomial coefficients. Second: use the things Python has for getting the precision you need. You imported Decimal but didn't use it, but even if you had, your code was missing the instruction to set the decimal precision to a number high enough to get the right answer.

The most important lesson here is don't blindly implement math formulae because the symbolic maths that we work out by hand have been optimized over the centuries to be easy for "I know a whole bunch of susbtitution rules from years of study" human brains. So unless you're working in a programming language that can do that same symbolic substitution work (e.g. Mathematica), you're going to have to think like a computer. If you have a recurrence relation, find the non-recurrence equivalent function and implement that. And if you ever see a factorial, ask yourself what it's used for, then see if there's a clean and efficient way to compute that instead.

huangapple
  • 本文由 发表于 2023年6月5日 23:17:07
  • 转载请务必保留本文链接:https://go.coder-hub.com/76407840.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定