java - How does BigInteger convert Strings to its internal representation?
Question
I am doing a little side project where I am delving into the implementation of some popular classes in the Java libraries.
The first on my list is BigInteger. I am interested in understanding the algorithm by which it converts a String into a number for its internal representation.
This is the piece of code I am interested in:
// Pre-allocate array of expected size. May be too large but can
// never be too small. Typically exact.
long numBits = ((numDigits * bitsPerDigit[radix]) >>> 10) + 1;
if (numBits + 31 >= (1L << 32)) {
    reportOverflow();
}
int numWords = (int) (numBits + 31) >>> 5;
int[] magnitude = new int[numWords];

// Process first (potentially short) digit group
int firstGroupLen = numDigits % digitsPerInt[radix];
if (firstGroupLen == 0)
    firstGroupLen = digitsPerInt[radix];
String group = val.substring(cursor, cursor += firstGroupLen);
magnitude[numWords - 1] = Integer.parseInt(group, radix);
if (magnitude[numWords - 1] < 0)
    throw new NumberFormatException("Illegal digit");

// Process remaining digit groups
int superRadix = intRadix[radix];
int groupVal = 0;
while (cursor < len) {
    group = val.substring(cursor, cursor += digitsPerInt[radix]);
    groupVal = Integer.parseInt(group, radix);
    if (groupVal < 0)
        throw new NumberFormatException("Illegal digit");
    destructiveMulAdd(magnitude, superRadix, groupVal);
}
// Required for cases where the array was overallocated.
mag = trustedStripLeadingZeroInts(magnitude);
if (mag.length >= MAX_MAG_LENGTH) {
    checkRange();
}
I kind of understand the previous lines, but this is where it stops making sense to me, as the math used is unfamiliar.
My questions are as follows:
- What does bitsPerDigit signify? If you run a debugger, it comes to 3402. Is that the number of bits used per digit in an int in Java?
- Why divide the result of (numDigits * bitsPerDigit[radix]) by 2^10 and add 1? (A logical right shift divides by a power of 2.)
- Why add 31 to numBits before dividing by 2^5 = 32? (I kind of understand the 32 bits, as that is the size of an int in Java.)
- What is destructiveMulAdd? How does it work?
Are these algorithms documented somewhere?
Can anyone please point me in the right direction?
Answer 1
Score: 2
bitsPerDigit[x] is a fixed-point approximation of log2(x), scaled by 1024 and rounded up. For x = 10, we get an estimate of 3402 / 1024 = 3.322265625 bits per digit (the actual log2(10) is closer to 3.3219). To be clear, 3402 represents 3.322265625, but we never actually calculate the latter; it is worked with implicitly.
The right shift by 10 accounts for the scale of 1024, converting the result of the multiplication from a fixed-point format with 10 bits after the radix point to a plain integer. The right shift rounds down; adding 1 compensates for that in a way that ensures the result is never too low and never zero.
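To make the fixed-point estimate concrete, here is a minimal sketch that compares it with the real bit length of the largest number having a given count of decimal digits. The class name BitEstimate and the helper estimateBits are hypothetical; the constant 3402 is the radix-10 value reported in the question.

```java
import java.math.BigInteger;

public class BitEstimate {
    // Assumption: 3402 is the bitsPerDigit entry for radix 10, as seen in the question.
    static final long BITS_PER_DIGIT_RADIX_10 = 3402;

    // Upper bound on the bits needed to hold a decimal string of numDigits digits.
    static long estimateBits(int numDigits) {
        // Multiply in fixed point, drop the 10 fractional bits, then add 1
        // to compensate for the shift rounding down.
        return ((numDigits * BITS_PER_DIGIT_RADIX_10) >>> 10) + 1;
    }

    public static void main(String[] args) {
        for (int numDigits : new int[] {1, 9, 10, 19, 100}) {
            // Largest value with that many decimal digits: 99...9
            BigInteger max = BigInteger.TEN.pow(numDigits).subtract(BigInteger.ONE);
            System.out.println(numDigits + " digits: estimate=" + estimateBits(numDigits)
                    + " bits, actual=" + max.bitLength() + " bits");
        }
    }
}
```

The estimate should never come out below the actual bit length, which is the property the allocation relies on.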
The conversion from the number of bits to the number of words, (numBits + 31) >>> 5, is just a division by 32, rounded up.
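The add-31-then-shift idiom can be checked on a few values. This is a small sketch with hypothetical names (WordCount, wordsFor); it shifts in long arithmetic and casts afterwards, which gives the same result as the original expression for bit counts that pass the overflow check.

```java
public class WordCount {
    // Number of 32-bit words needed to hold numBits bits: ceil(numBits / 32).
    static int wordsFor(long numBits) {
        // Adding 31 pushes any partial word over the next multiple of 32
        // before the shift discards the remainder.
        return (int) ((numBits + 31) >>> 5);
    }

    public static void main(String[] args) {
        for (long bits : new long[] {1, 32, 33, 64, 65}) {
            System.out.println(bits + " bits -> " + wordsFor(bits) + " word(s)");
        }
    }
}
```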
At every step we must round up to avoid allocating an array that is too small, but the size estimate does not need to be extremely accurate: estimating a little too high only wastes a little space in the internal array.
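As for destructiveMulAdd from the question: judging from its call site, destructiveMulAdd(magnitude, superRadix, groupVal), it updates the array in place to magnitude * superRadix + groupVal, which is how each new digit group is appended. The following is a minimal sketch under that assumption, not the JDK source. The names MulAddSketch, mulAdd, parseDecimal and toBigInteger are hypothetical, and 9 and 10^9 are assumed values for digitsPerInt[10] and intRadix[10].

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class MulAddSketch {
    // In place: x = x * y + z, where x holds unsigned 32-bit words, most significant first.
    // Assumes x is large enough that the final carry is zero.
    static void mulAdd(int[] x, int y, int z) {
        long ylong = y & 0xFFFFFFFFL;
        long carry = z & 0xFFFFFFFFL; // seed the carry with the addend
        for (int i = x.length - 1; i >= 0; i--) {
            long product = ylong * (x[i] & 0xFFFFFFFFL) + carry;
            x[i] = (int) product;     // low 32 bits stay in this word
            carry = product >>> 32;   // high 32 bits move to the next word
        }
    }

    // Decimal-only version of the loop in the question (no sign, no validation).
    static int[] parseDecimal(String val) {
        int digitsPerGroup = 9;         // assumed digitsPerInt[10]
        int superRadix = 1_000_000_000; // assumed intRadix[10], i.e. 10^9
        long numBits = ((val.length() * 3402L) >>> 10) + 1;
        int[] magnitude = new int[(int) ((numBits + 31) >>> 5)];

        int cursor = 0;
        int firstGroupLen = val.length() % digitsPerGroup;
        if (firstGroupLen == 0)
            firstGroupLen = digitsPerGroup;
        magnitude[magnitude.length - 1] =
                Integer.parseInt(val.substring(cursor, cursor += firstGroupLen));
        while (cursor < val.length()) {
            int groupVal = Integer.parseInt(val.substring(cursor, cursor += digitsPerGroup));
            mulAdd(magnitude, superRadix, groupVal);
        }
        return magnitude;
    }

    // Helper for checking the result against BigInteger itself.
    static BigInteger toBigInteger(int[] mag) {
        ByteBuffer buf = ByteBuffer.allocate(mag.length * 4); // big-endian by default
        for (int word : mag)
            buf.putInt(word);
        return new BigInteger(1, buf.array());
    }

    public static void main(String[] args) {
        String val = "123456789012345678901234567890";
        BigInteger parsed = toBigInteger(parseDecimal(val));
        System.out.println(parsed.equals(new BigInteger(val)));
    }
}
```

Processing nine decimal digits per step keeps every group value within an int, so each step costs one pass over the array instead of nine.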