Adding two byte arrays that represent unsigned numbers fails to add the last carry in C

Question

I am writing a big-number library in C for practice, and the addition operation is giving me trouble. It mostly works, but it ignores the carry at the end, so the result is incorrect.

Here is the code for the addition operation so far.

The BIG_INT type is a typedef for an unsigned char array of 8 bytes.
I am using 8 bytes for testing; it will be something like 512 bytes later.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define BIG_INT_SIZE 8
typedef unsigned char BIG_INT[BIG_INT_SIZE];

int add_big_int(BIG_INT *left, size_t right)
{
    if (!right)
        return 0;

    char buffer[sizeof(size_t)];
    memcpy(buffer, &right, sizeof(size_t));

    unsigned int carry = 0;
    for (int64_t i = BIG_INT_SIZE - 1; i >= 0; --i) {
        unsigned int sum = (*left[i] & 0xff) + (buffer[i] & 0xff) + carry;
        *left[i] = sum;
        carry = sum >> 8;
    }

    printf("carry: %u\n", carry);
    return 0;
}

int main() 
{
    BIG_INT big = { 0, 0, 0, 0, 0, 0, 0, 1 };
    add_big_int(&big, 255);
    return 0;
}

When adding numbers like 256 + 1 = 257, the byte array is [0, 0, 0, 0, 0, 0, 1, 1] (reverse order). However, when adding 255 + 1 = 256, the byte array is [0, 0, 0, 0, 0, 0, 0, 0] instead of the expected [0, 0, 0, 0, 0, 0, 1, 0], and the carry variable is set to 1.

Answer 1

Score: 0

For starters, the first parameter of the function is a pointer:

int add_big_int(BIG_INT* left, size_t right)
                ^^^^^^^^

If BIG_INT is defined like

typedef unsigned char BIG_INT[8];

then the parameter has the type unsigned char ( * )[8].

There is little point in passing the array through a pointer to it. It is enough to declare the function like

int add_big_int(BIG_INT left, size_t right)

and call it like

add_big_int(big, 255);

The function always returns 0, which tells the caller nothing. Either the function should return a meaningful value or its return type should be void.
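
As a rough sketch of the two suggestions above (pass the array directly and return something meaningful), the function could take BIG_INT left and return the final carry. The unsigned int return type and the choice to consume right one byte at a time are assumptions made for illustration, not part of the original code. The sketch also assumes 8-bit bytes and the big-endian layout used in the question.

```c
#include <assert.h>
#include <stddef.h>

#define BIG_INT_SIZE 8
typedef unsigned char BIG_INT[BIG_INT_SIZE];

/* Assumption: every byte of right must fit into a BIG_INT. */
static_assert(sizeof(size_t) <= BIG_INT_SIZE, "size_t is too big for BIG_INT");

/* Adds right to the big-endian number stored in left
   (left[BIG_INT_SIZE - 1] is the least significant byte).
   Returns the carry out of the most significant byte:
   0 if the sum fit, 1 if it overflowed.
   Hypothetical signature: the original returned int and always 0. */
unsigned int add_big_int(BIG_INT left, size_t right)
{
    assert(left != NULL); /* as a parameter, BIG_INT left is unsigned char *left */

    unsigned int carry = 0;
    for (size_t i = BIG_INT_SIZE; i-- > 0; ) {
        /* Add the current low byte of right, then move on to its next byte. */
        unsigned int sum = left[i] + (unsigned int)(right & 0xff) + carry;
        left[i] = (unsigned char)(sum & 0xff);
        carry = sum >> 8;
        right >>= 8;
    }
    return carry;
}
```

With this version, adding 255 to { 0, 0, 0, 0, 0, 0, 0, 1 } leaves { 0, 0, 0, 0, 0, 0, 1, 0 } in the array and returns 0. A return value of 1 would signal that the result did not fit in BIG_INT_SIZE bytes.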

Using the type size_t to size the buffer, as in

char buffer[sizeof(size_t)];

only confuses readers of the code.

Instead, you could use the expression sizeof( BIG_INT ), provided that the array BIG_INT is large enough to store values of the type size_t.

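The expression *left[i] invokes undefined behavior for any non-zero i, because it is parsed as *( left[i] ). Instead, you have to use the expression ( *left )[i]. In any case, masking with the bitwise AND operator, as in

(buffer[i] & 0xff)

has no effect on an operand of type unsigned char (with 8-bit bytes). Note that buffer as declared has the element type char, which may be signed, so the mask does matter there; declaring buffer as unsigned char makes it unnecessary.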
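
To make the difference between *left[i] and ( *left )[i] concrete, here is a small sketch. The two helper functions are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char BIG_INT[8];

/* *p[i] parses as *(p[i]): byte 0 of the i-th BIG_INT that p points into.
   This is only valid if p really points into an array of BIG_INTs. */
unsigned char subscript_then_deref(BIG_INT *p, size_t i)
{
    assert(p != NULL);
    return *p[i];
}

/* (*p)[i]: byte i of the single BIG_INT that p points to. */
unsigned char deref_then_subscript(BIG_INT *p, size_t i)
{
    assert(p != NULL);
    return (*p)[i];
}
```

Given BIG_INT pair[2] = { { 10, 11, 12 }, { 20, 21, 22 } }, subscript_then_deref(pair, 1) yields 20 while deref_then_subscript(pair, 1) yields 11. In the question only one BIG_INT is passed, so the first form reads and writes past it for every i other than 0.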

In general, because the addition can overflow and leave a final carry, you should use a dynamically allocated array enclosed in a structure, so that the number can grow to hold the carry.
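
One possible reading of this advice is sketched below. The big_int structure and the big_int_add_byte function are hypothetical names, and adding a single byte (rather than a size_t) is a simplification chosen to keep the example short. The sketch assumes 8-bit bytes and keeps the big-endian layout from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable big integer: digits[len - 1] is the least
   significant byte, digits[0] the most significant one. */
typedef struct {
    unsigned char *digits; /* heap-allocated, owned by the structure */
    size_t len;            /* number of bytes in digits */
} big_int;

/* Adds value to n, growing the array by one byte if a carry is left over.
   Returns 0 on success and -1 if the allocation fails (in that case the
   existing bytes have already wrapped around). */
int big_int_add_byte(big_int *n, unsigned char value)
{
    assert(n != NULL && n->digits != NULL);

    unsigned int carry = value;
    for (size_t i = n->len; i-- > 0 && carry != 0; ) {
        unsigned int sum = n->digits[i] + carry;
        n->digits[i] = (unsigned char)(sum & 0xff);
        carry = sum >> 8;
    }

    if (carry != 0) {
        /* Make room for one more byte at the most significant end. */
        unsigned char *grown = realloc(n->digits, n->len + 1);
        if (grown == NULL)
            return -1;
        memmove(grown + 1, grown, n->len);
        grown[0] = (unsigned char)carry;
        n->digits = grown;
        n->len += 1;
    }
    return 0;
}
```

Starting from the one-byte number { 255 }, adding 1 grows the array to { 1, 0 } instead of silently wrapping to { 0 }.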

Answer 2

Score: 0

The code is disorganized, and some organization will help fix it.

Look at this loop: for (int64_t i = BIG_INT_SIZE - 1; i >= 0; --i). That iterates through BIG_INT_SIZE elements, using left[i] and buffer[i]. How many elements are in buffer?

buffer is declared char buffer[sizeof(size_t)];, so, if sizeof(size_t) is different from BIG_INT_SIZE, the loop is massively broken. You either need buffer and left to have the same number of elements or you need code in the loop to handle disparate sizes.

Let’s make them the same size, using char buffer[BIG_INT_SIZE];.

Next, compare these two declarations:

typedef unsigned char BIG_INT[BIG_INT_SIZE];
   char buffer[BIG_INT_SIZE];

What is different? One uses unsigned char, and the other uses char. The C standard does not say whether char is signed or unsigned. When you put 255 in signed char, it is likely to take on the value −1, although this is implementation-defined. You want unsigned char:

   unsigned char buffer[BIG_INT_SIZE];
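
The effect of the element type can be seen in a tiny sketch like the following. The two function names are made up for this illustration, and the sketch assumes 8-bit bytes.

```c
#include <assert.h>
#include <limits.h>

static_assert(CHAR_BIT == 8, "this sketch assumes eight-bit bytes");

/* Stores the byte 0xff in a plain char and reads it back as an int.
   Where char is signed this typically yields -1 (the conversion is
   implementation-defined); where char is unsigned it yields 255. */
int read_back_plain_char(void)
{
    char c = (char)0xff;
    return c;
}

/* The same byte stored in an unsigned char always reads back as 255. */
int read_back_unsigned_char(void)
{
    unsigned char c = 0xff;
    return c;
}
```

This is why the original code needed & 0xff on buffer[i]: with a signed char, the value would otherwise be sign-extended when it is promoted to int in the sum.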

Next, you have memcpy(buffer, &right, sizeof(size_t));, but we need to account for the change in how the size of buffer is set. Also, memcpy copies the bytes in whatever order they are in memory, but you want the low bytes at higher addresses, judging from your other code. So what you can do is:

   for (size_t i = 0; i < sizeof right; ++i)
   {
       buffer[BIG_INT_SIZE - 1 - i] = right & 0xff;
       right >>= 8;
   }
   for (size_t i = sizeof right; i < BIG_INT_SIZE; ++i)
       buffer[BIG_INT_SIZE - 1 - i] = 0;

The first loop puts the low-value byte of right into the high-address element of buffer and then shifts the bytes of right to the right. The second loop fills any remaining elements of buffer with zeros.

A potential hazard here is that right may be larger than buffer, which would make the first loop overrun the buffer. We can guard against that with:

   _Static_assert(sizeof right <= sizeof buffer, "right is too big for buffer");

I used & 0xff above, mimicking your existing code. Often, people design for an eight-bit byte, in which case the & 0xff is unnecessary; the assignment could be just buffer[BIG_INT_SIZE - 1 - i] = right;, and the automatic conversion of right to unsigned char will effectively remove the higher bits. If you make that change, you may want to insert a _Static_assert that CHAR_BIT is eight. CHAR_BIT is defined in <limits.h>.

Next, look at unsigned int sum = (*left[i] & 0xff) + (buffer[i] & 0xff) + carry;. Array subscripting has higher precedence than pointer dereferencing, so *left[i] is *(left[i]), not (*left)[i]. left is a pointer to BIG_INT, and just one BIG_INT is passed, so you cannot use left[i] with i being non-zero. You need (*left) to get the BIG_INT and then [i] to select an element from it, so you want unsigned int sum = ((*left)[i] & 0xff) + (buffer[i] & 0xff) + carry;.

Similarly, *left[i] = sum; should be (*left)[i] = sum;.
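
Putting the changes described so far together, the function might look roughly like the sketch below. It keeps the original BIG_INT * parameter, drops the & 0xff masks under the assumption of eight-bit bytes, and uses ptrdiff_t for the index. Returning the final carry instead of 0 is an assumption of this sketch, not something the original code did.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BIG_INT_SIZE 8
typedef unsigned char BIG_INT[BIG_INT_SIZE];

_Static_assert(CHAR_BIT == 8, "this sketch assumes eight-bit bytes");
_Static_assert(sizeof(size_t) <= BIG_INT_SIZE, "right is too big for buffer");

/* Adds right to *left, which holds a big-endian number.
   Returns the final carry (assumed return convention). */
unsigned int add_big_int(BIG_INT *left, size_t right)
{
    assert(left != NULL);

    /* Spread right over buffer, low byte at the highest address;
       elements that right does not reach stay zero. */
    unsigned char buffer[BIG_INT_SIZE] = { 0 };
    for (size_t i = 0; i < sizeof right; ++i) {
        buffer[BIG_INT_SIZE - 1 - i] = (unsigned char)right;
        right >>= 8;
    }

    /* Add byte by byte, starting from the least significant byte. */
    unsigned int carry = 0;
    for (ptrdiff_t i = BIG_INT_SIZE - 1; i >= 0; --i) {
        unsigned int sum = (*left)[i] + buffer[i] + carry;
        (*left)[i] = (unsigned char)sum; /* keeps the low eight bits */
        carry = sum >> 8;
    }
    return carry;
}
```

Called as add_big_int(&big, 255) with big set to { 0, 0, 0, 0, 0, 0, 0, 1 }, this leaves { 0, 0, 0, 0, 0, 0, 1, 0 } in big and returns a carry of 0.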

With those changes made, your code will get the correct answer for the case you show. However, you should consider other design aspects, including:

  • Do you want to define BIG_INT as an array? It may be safer as a structure containing an array. This would have caught the *left[i] error earlier.
  • Why is the type of right size_t? size_t has a connotation that it is used with object sizes, not arbitrary arithmetic values that might be used with big-integer types. Do you want to use a signed type for it?
  • Why does the loop use int64_t for i? That is a fixed-width type. You might want a type like ptrdiff_t, which is suitable for indexing arrays (it is signed, so it can be used with an i >= 0 test, and it is a type designed to adjust its width to the target platform).
  • What will you do when the final carry is non-zero?
  • Will add_big_int always return zero, or do you have some future plan for it?

huangapple
  • Posted on July 28, 2023, 02:19:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76782473.html