June 5, 2023 21:18:00

java - How does BigInteger convert Strings to its internal representation?

Question

I am doing a little side project where I am delving into the implementation of some popular classes that we use in the Java Libraries.

The first on my list is BigInteger. I am interested in understanding the algorithm by which it converts a String into a number for its internal representation.

This is the piece of code I am interested in:

// Pre-allocate array of expected size. May be too large but can
// never be too small. Typically exact.
long numBits = ((numDigits * bitsPerDigit[radix]) >>> 10) + 1;
if (numBits + 31 >= (1L << 32)) {
    reportOverflow();
}
int numWords = (int) (numBits + 31) >>> 5;
int[] magnitude = new int[numWords];

// Process first (potentially short) digit group
int firstGroupLen = numDigits % digitsPerInt[radix];
if (firstGroupLen == 0)
    firstGroupLen = digitsPerInt[radix];
String group = val.substring(cursor, cursor += firstGroupLen);
magnitude[numWords - 1] = Integer.parseInt(group, radix);
if (magnitude[numWords - 1] < 0)
    throw new NumberFormatException("Illegal digit");

// Process remaining digit groups
int superRadix = intRadix[radix];
int groupVal = 0;
while (cursor < len) {
    group = val.substring(cursor, cursor += digitsPerInt[radix]);
    groupVal = Integer.parseInt(group, radix);
    if (groupVal < 0)
        throw new NumberFormatException("Illegal digit");
    destructiveMulAdd(magnitude, superRadix, groupVal);
}
// Required for cases where the array was overallocated.
mag = trustedStripLeadingZeroInts(magnitude);
if (mag.length >= MAX_MAG_LENGTH) {
    checkRange();
}

I roughly understand the lines that precede this snippet, but this is where it stops making sense to me, because the math used is unfamiliar.

My questions are as follows:

1. What does bitsPerDigit signify? In a debugger, bitsPerDigit[radix] comes out as 3402. Is that the number of bits a digit occupies in a Java int?
2. Why divide the result of (numDigits * bitsPerDigit[radix]) by 2^10 and add 1? (A logical right shift by n divides by 2^n.)
3. Why add 31 to numBits before dividing by 2^5 = 32? (I kind of understand the 32, as that is the size of an int in Java.)
4. What is destructiveMulAdd? How does it work?

Are these algorithms documented somewhere?

Can anyone please point me in the right direction?

Answer 1

Score: 2

bitsPerDigit[x] is a fixed-point approximation of log2(x), scaled by 1024 and rounded up. For x = 10 this gives an estimate of 3402 / 1024 = 3.322265625 bits per digit (the actual log2(10) is closer to 3.3219). To be clear, 3402 represents 3.322265625, but the code never actually computes the latter; it works with the scaled value implicitly.
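To make the fixed-point idea concrete, here is a minimal sketch that recomputes the scaled value in floating point. The class and method names are made up for illustration and are not part of the JDK, which stores these values in a precomputed table. The floating-point logarithm may also be off by one for radixes that are exact powers of two.

```java
class BitsPerDigitSketch {
    // Fixed-point scale: 10 fractional bits, so values are stored multiplied by 1024.
    static final int SCALE_BITS = 10;

    // Hypothetical helper: ceil(log2(radix) * 1024), computed in floating point.
    static long scaledBitsPerDigit(int radix) {
        double exact = Math.log(radix) / Math.log(2) * (1 << SCALE_BITS);
        return (long) Math.ceil(exact);
    }

    public static void main(String[] args) {
        long scaled = scaledBitsPerDigit(10);
        System.out.println(scaled);           // 3402
        System.out.println(scaled / 1024.0);  // 3.322265625
    }
}
```

Rounding up matters here: log2(10) * 1024 is roughly 3401.65, and truncating it to 3401 would underestimate the bits per digit.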

The right shift by 10 accounts for the scale of 1024: it converts the result of the multiplication from a fixed-point format with 10 bits after the radix point to a plain integer. The right shift rounds down; adding 1 compensates for that in a way that ensures the result is never too low and never zero.
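As a rough check of the "never too low" claim, the sketch below applies the expression from the question for radix 10 and compares it with the bit length of the largest number that has that many decimal digits. The constant 3402 is the value reported in the question; the class and method names are assumptions for illustration only.

```java
import java.math.BigInteger;

class NumBitsSketch {
    // Value the question observed for bitsPerDigit[radix] with radix 10.
    static final long BITS_PER_DIGIT_RADIX_10 = 3402;

    // Same shape as the expression in the question, specialised to radix 10.
    static long estimateNumBits(int numDigits) {
        return ((numDigits * BITS_PER_DIGIT_RADIX_10) >>> 10) + 1;
    }

    public static void main(String[] args) {
        for (int numDigits = 1; numDigits <= 40; numDigits++) {
            // Largest value with this many decimal digits: 10^numDigits - 1.
            BigInteger largest = BigInteger.TEN.pow(numDigits).subtract(BigInteger.ONE);
            long estimate = estimateNumBits(numDigits);
            if (estimate < largest.bitLength()) {
                throw new AssertionError("estimate too small for " + numDigits + " digits");
            }
            System.out.println(numDigits + " digits: estimate " + estimate
                    + " bits, largest value needs " + largest.bitLength());
        }
    }
}
```

For small inputs the estimate is exact: one digit gives 4 bits (9 is 1001 in binary) and three digits give 10 bits (999 is below 1024).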

The conversion from the number of bits to the number of words, (numBits + 31) >>> 5, is just a division by 32 rounded up.
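The add-31-then-shift idiom can be tried in isolation. This is a small sketch with a hypothetical helper name. It casts to int after the shift, whereas the line in the question casts first and then shifts. Both give the same result for values that pass the preceding overflow check, because >>> treats the int as unsigned.

```java
class WordsFromBitsSketch {
    // Ceiling division by 32: number of 32-bit words needed to hold numBits bits.
    static int wordsFor(long numBits) {
        return (int) ((numBits + 31) >>> 5);
    }

    public static void main(String[] args) {
        long[] samples = {1, 32, 33, 64, 65};
        for (long bits : samples) {
            System.out.println(bits + " bits -> " + wordsFor(bits) + " words");
        }
    }
}
```

Without the + 31, a value such as 33 bits would be truncated to a single word, which is one word too few.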

At every step we must round up to avoid allocating an array that is too small, but the size estimate does not need to be extremely accurate: estimating it a little too high only wastes a little space in the internal array.
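To see what a slight overestimate looks like in practice, here is a simplified, decimal-only sketch of the group loop from the question. It is not the JDK source: mulAdd is a hypothetical stand-in for destructiveMulAdd, written on the assumption that the real method computes magnitude * superRadix + groupVal in place on a big-endian int array. The constants 9 and 10^9 are assumed stand-ins for digitsPerInt[10] and intRadix[10]. The input is assumed to be a non-empty string of decimal digits with no sign.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Arrays;

class ParseSketch {
    static final int DIGITS_PER_INT = 9;           // 9 decimal digits always fit in an int
    static final int SUPER_RADIX = 1_000_000_000;  // 10^9
    static final long MASK = 0xFFFFFFFFL;          // reads an int as an unsigned 32-bit value

    // Hypothetical stand-in for destructiveMulAdd: mag = mag * factor + addend, in place.
    // mag is big-endian, so mag[0] is the most significant word.
    static void mulAdd(int[] mag, int factor, int addend) {
        long f = factor & MASK;
        long carry = addend & MASK;
        for (int i = mag.length - 1; i >= 0; i--) {
            long product = f * (mag[i] & MASK) + carry;
            mag[i] = (int) product;   // low 32 bits stay in this word
            carry = product >>> 32;   // high 32 bits move to the next word
        }
    }

    static int[] parseDecimal(String digits) {
        int numDigits = digits.length();
        long numBits = ((numDigits * 3402L) >>> 10) + 1;
        int numWords = (int) ((numBits + 31) >>> 5);
        int[] mag = new int[numWords];

        // First (possibly short) group goes straight into the lowest word.
        int firstLen = numDigits % DIGITS_PER_INT;
        if (firstLen == 0) firstLen = DIGITS_PER_INT;
        mag[numWords - 1] = Integer.parseInt(digits.substring(0, firstLen));

        // Each remaining group shifts the value by 9 decimal digits and adds the group.
        int cursor = firstLen;
        while (cursor < numDigits) {
            int group = Integer.parseInt(digits.substring(cursor, cursor + DIGITS_PER_INT));
            cursor += DIGITS_PER_INT;
            mulAdd(mag, SUPER_RADIX, group);
        }
        return mag;
    }

    // Test helper: interpret a big-endian int array as a non-negative BigInteger.
    static BigInteger toBigInteger(int[] mag) {
        ByteBuffer buf = ByteBuffer.allocate(4 * mag.length);
        buf.asIntBuffer().put(mag);
        return new BigInteger(1, buf.array());
    }

    public static void main(String[] args) {
        String s = "123456789012345678901234567890";
        System.out.println(toBigInteger(parseDecimal(s)).equals(new BigInteger(s))); // true

        // 10 digits are estimated at 34 bits (2 words), but 10^9 fits in one word.
        System.out.println(Arrays.toString(parseDecimal("1000000000"))); // [0, 1000000000]
    }
}
```

The second output shows the overallocation case: the leading zero word is the kind of slack that the trustedStripLeadingZeroInts call in the question's snippet removes afterwards.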
