Do FreeRTOS heap implementations violate C aliasing rules?
Question
Looking at the code for heap 1 in FreeRTOS...
#if ( configAPPLICATION_ALLOCATED_HEAP == 1 )
/* The application writer has already defined the array used for the RTOS
* heap - probably so it can be placed in a special segment or address. */
extern uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#else
static uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#endif /* configAPPLICATION_ALLOCATED_HEAP */
...we see that a heap is just an array of uint8_t objects.
But then, in its void* pvPortMalloc(size_t xWantedSize) function, it defines a uint8_t* called pucAlignedHeap, and a size_t called xNextFreeByte.
Our return value pvReturn is then defined in this block...
/* Check there is enough room left for the allocation and. */
if( ( xWantedSize > 0 ) && /* valid size */
( ( xNextFreeByte + xWantedSize ) < configADJUSTED_HEAP_SIZE ) &&
( ( xNextFreeByte + xWantedSize ) > xNextFreeByte ) ) /* Check for overflow. */
{
/* Return the next free byte then increment the index past this
* block. */
pvReturn = pucAlignedHeap + xNextFreeByte;
xNextFreeByte += xWantedSize;
}
...and is then expected to be used by the programmer to store whatever data they want:
//Some example:
my_struct* x = pvPortMalloc(sizeof(my_struct));
But since the underlying data type is an array of uint8_t, doesn't that mean that any real usage of the heap violates C's aliasing requirements?
And if that's true, then why are they allowed to violate these requirements without worrying about UB? FreeRTOS is hardly a small hobby project, so they must know what they're doing, and yet it surely looks like this is UB. Why can they do this, but I can't? They do not appear to have -fno-strict-aliasing defined, so I don't think it's that.
Answer 1
Score: 1
Because there are many tasks that would never require the ability to recycle storage to hold multiple unrelated kinds of objects within its lifetime, the C Standard does not require that all implementations support such recycling. The Standard allows implementations to extend the language by supporting usage patterns beyond those mandated, and any implementation which is suitable for tasks that would require recycling storage within its lifetime will necessarily extend the language in that fashion. The Standard waives jurisdiction over such things, however.
In the language processed by clang and gcc when not using -fno-strict-aliasing, once any storage has been written via an lvalue of non-character type, that storage will have that type as its Effective Type "for that access and for subsequent accesses that do not modify the stored value". Because that phrase doesn't say "all subsequent accesses until the stored value is modified using some other type", there is no way to usefully change the Effective Type of storage once it has been written. The storage may be written using other types, but after storage is written using two or more incompatible non-character types, any attempt to read it using any non-character type would be incompatible with at least one of the Effective Types written to the storage, and thus invoke UB.
Thus, all code that repurposes storage for use as different types within its lifetime will violate the aliasing constraints given in the Standard unless it limits itself to using character-type reads. Programmers shouldn't jump through hoops to satisfy such constraints, however, because implementations that are suitable for tasks that would require reuse of storage will support such tasks regardless of whether or not the Standard would require that they do so. Unfortunately, the Standard offers no guidance as to what constructs should be supported, treating such support as a Quality of Implementation issue.
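The "character-type" route mentioned above can be illustrated by moving values in and out of recycled storage with memcpy, which copies bytes rather than accessing the storage through an int or float lvalue. This is only a minimal sketch under stated assumptions: `block`, the `block_put_*`/`block_get_*` helpers and `reuse_demo` are hypothetical names made up for illustration and are not part of FreeRTOS.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for one recycled heap block; not FreeRTOS code.
 * It is large enough to hold either an int or a float. */
static unsigned char block[sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float)];

/* Each helper copies the object's bytes, so the block itself is only
 * ever accessed as characters. */
static void  block_put_int(int v)     { memcpy(block, &v, sizeof v); }
static int   block_get_int(void)      { int v; memcpy(&v, block, sizeof v); return v; }
static void  block_put_float(float v) { memcpy(block, &v, sizeof v); }
static float block_get_float(void)    { float v; memcpy(&v, block, sizeof v); return v; }

/* Reuse the same bytes first as an int, then as a float.
 * Returns 1 if both round trips preserved the value. */
int reuse_demo(void)
{
    block_put_int(42);
    assert(block_get_int() == 42);

    block_put_float(1.5f);
    assert(block_get_float() == 1.5f);
    return 1;
}
```

Whether this extra copying is worth it is the judgment call described above; the sketch only shows what limiting oneself to character-type access could look like.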
Answer 2
Score: 0
> Why can they do this, but I can't?
You can... but it comes with a price.
The C standard defines a number of behavior rules that any standard compliant implementation must adhere to.
Further, the C standard leaves a number of things to the implementation. This is called implementation-defined behavior. Quoted from draft N1570 for the C standard (2011):
> 3.4.1
implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made
On top of that the standard has the concept of undefined behavior.
> 3.4.3
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
For undefined behavior a note says:
> Possible undefined behavior ranges from ignoring the situation completely with unpredictable
results, to behaving during translation or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message)
Now if you are willing to write C code that can be used only on specific implementations/platforms/environments, you can write non-standard-compliant code that works fine as long as the targeted implementation defines the behavior. No problem. And there is lots of code out there doing that.
The price is that your code can't be used on just any implementation.
BTW:
The specific code mentioned in the question uses uint8_t. By doing that, the code is limited to implementations that support uint8_t. And as written in "7.20.1.1 Exact-width integer types" of N1570, the C standard doesn't require all implementations to implement that type.
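To make the uint8_t point concrete: the limit macro UINT8_MAX from <stdint.h> is defined only when the implementation provides uint8_t, so code can test for it at compile time. The following is a sketch, not FreeRTOS code; `heap_byte_t`, `HEAP_BYTE_IS_EXACT_8` and `example_heap` are hypothetical names, and the size 128 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>

/* UINT8_MAX exists only if the implementation provides uint8_t. */
#ifdef UINT8_MAX
typedef uint8_t heap_byte_t;        /* exact 8-bit type is available */
#define HEAP_BYTE_IS_EXACT_8 1
#else
typedef unsigned char heap_byte_t;  /* always available, at least 8 bits wide */
#define HEAP_BYTE_IS_EXACT_8 0
#endif

/* Either way the element type occupies exactly one byte. */
static_assert(sizeof(heap_byte_t) == 1, "heap_byte_t must be one byte");

/* Hypothetical heap array using the portable element type. */
static heap_byte_t example_heap[128];
```

Falling back to unsigned char keeps the array usable on an implementation without an exact 8-bit type, at the cost of no longer promising exactly 8 bits per element.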
Answer 3
Score: 0
When the allocation routines are compiled in a separate translation unit without link-time optimization, then the compiler has no information about what object types they return pointers to when it is compiling other translation units.
When the compiler is compiling a translation unit that uses the memory allocation routines, it is possible the translation unit will later be linked with a translation unit that conforms to the C aliasing rules. In particular, it could be linked with an object module that returns pointers to dynamically allocated memory, which initially has no effective type. Therefore, the compiler must produce an object module for the current translation unit that will work correctly if it is linked with such a module.
The effect of the aliasing rules in the C standard is to allow the compiler to optimize some code that receives pointers to objects of different types. For example, given a routine void foo(int *p, float *q), the compiler could assume that p and q point to different memory, and therefore it can commute operations on p[i] and q[j]. When the memory allocation routines are in a separate translation unit, such situations never arise with regard to the addresses it returns, so there is no effect from the aliasing rules.
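The kind of optimization described for foo can be sketched as follows. The function body and the name `foo_sketch` are assumptions made for illustration (the answer gives only a signature, and it returns void there); the return value is added so that the effect is observable.

```c
#include <assert.h>

/* Hypothetical variant of the answer's foo(int *p, float *q). */
int foo_sketch(int *p, float *q)
{
    *p = 1;
    *q = 2.0f;   /* an int object may not be modified through a float lvalue */
    return *p;   /* so the compiler may return 1 without reloading *p */
}

/* Usage with two distinct objects, which is the case the rules assume. */
int foo_demo(void)
{
    int i = 0;
    float f = 0.0f;
    int r = foo_sketch(&i, &f);
    assert(r == 1 && i == 1 && f == 2.0f);
    return r;
}
```

With two separate objects, as in `foo_demo`, the result is the same whether or not the compiler reorders the stores or skips the reload.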