Security considerations when exposing pointers from a shared library
Question
Let's say I have a C library that implements the following function:
// Returns the number of elements in the array,
// and a pointer to the first element.
// The memory pointed to has static lifetime.
size_t MyLib_GetValues(const int** outBasePtr);
And I use it like this:
const int* values = NULL;
size_t count = MyLib_GetValues(&values);
if ( values )
{
    for ( size_t index = 0; index < count; ++index )
    {
        DoSomething(values + index);
    }
}
I have a few questions about this approach, given there's nothing to stop the user of the library iterating past the end of the array.
- Would this be classed as a security issue for MyLib? I don't know enough about how systems handle libraries to know whether values[count] would be caught as an invalid memory access, or whether the user would just be able to read off the end of the array and obtain values from elsewhere in the library's memory.
- If this would be considered a security issue, what would be a better approach? Should the library copy values out into a user-provided buffer? Additionally, would this mean that exposing strings directly as const char* would also be a security concern, given you can just read off the end of those as well if you want to?
My gut instinct is that whether this approach should be considered a security issue would depend on the intended use case for the library. Given the user of the library could easily look into the contents in a hex editor, there would be nothing to stop them doing that anyway. If you're trying to hide things in memory then it's game over in that case, and so the API design pattern doesn't really make any difference. However, in more restricted cases where a user would only have code-based access to the library (e.g. by providing some program that links to the library at runtime), things might be different.
Is there best practice on this kind of thing from a C library design perspective?
Answer 1
Score: 1
Yes, allowing the user of the library to iterate past the end of the array can be considered a security issue. Accessing values[count] or beyond is undefined behavior in C: it may cause a segmentation fault, it may silently return whatever data happens to sit next to the array, or it may lead to memory corruption and other errors that could potentially be exploited by an attacker. Depending on the platform and architecture, such an access may or may not be caught as an invalid memory access.
A better approach would be for the caller to supply a buffer of known size, into which the library copies the array elements. The library then writes only within the bounds the caller stated, and the caller never receives a pointer into the library's internal memory. Alternatively, the library can return a dynamically allocated buffer that the user is responsible for freeing after use. However, the latter approach requires more careful memory management on the part of the user and may be less efficient.
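As a rough illustration of the caller-supplied buffer idea, here is a minimal sketch. The function name MyLib_CopyValues, the g_values table, and the "return the total count" convention are all assumptions made for this example; none of them come from the question. Note that inside a single process this mainly guards against accidental over-reads; it does not hide the library's memory from a determined caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical internal data, invented for this sketch. */
static const int g_values[] = { 10, 20, 30, 40 };
#define VALUE_COUNT (sizeof(g_values) / sizeof(g_values[0]))

/* Hypothetical copy-out variant of MyLib_GetValues.
 * Copies at most `capacity` elements into `outBuffer` and returns the
 * total number of elements available. Passing (NULL, 0) therefore
 * queries the required size without copying anything. */
size_t MyLib_CopyValues(int *outBuffer, size_t capacity)
{
    if (outBuffer != NULL && capacity > 0) {
        /* Never copy more than the caller's buffer or our table holds. */
        size_t n = capacity < VALUE_COUNT ? capacity : VALUE_COUNT;
        assert(n <= VALUE_COUNT); /* debug-build sanity check */
        memcpy(outBuffer, g_values, n * sizeof(g_values[0]));
    }
    return VALUE_COUNT;
}
```

A caller would typically call MyLib_CopyValues(NULL, 0) first to learn the element count, allocate a buffer of that size, and then call it again with the buffer and its capacity.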
Regarding const char* strings, the same issue applies: the caller is trusted to stop at the terminating null character, and nothing prevents reading beyond it. It is generally recommended to copy string data into a caller-supplied buffer or a dynamically allocated buffer, rather than exposing the internal memory directly.
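The same pattern can be sketched for strings. The function MyLib_CopyName and the g_name string below are hypothetical, and the return convention (full length excluding the terminator, in the style of snprintf) is just one reasonable choice, not something the question prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical internal string, invented for this sketch. */
static const char g_name[] = "MyLib example";

/* Copies the string into `outBuffer`, truncating if necessary and always
 * null-terminating when capacity > 0. Returns the full string length
 * (excluding the terminator), so a return value >= capacity signals
 * that the copy was truncated. */
size_t MyLib_CopyName(char *outBuffer, size_t capacity)
{
    size_t length = sizeof(g_name) - 1;
    if (outBuffer != NULL && capacity > 0) {
        /* Leave room for the terminator. */
        size_t n = length < capacity - 1 ? length : capacity - 1;
        assert(n < capacity); /* debug-build sanity check */
        memcpy(outBuffer, g_name, n);
        outBuffer[n] = '\0';
    }
    return length;
}
```

Returning the full length lets the caller detect truncation and retry with a larger buffer without a separate size query function.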
You are correct that the degree of security risk depends on the intended use case of the library. If the library is used in an environment where the user has unrestricted access to the memory, there may be limited benefit to implementing safeguards against iterating past the end of the array or string. However, in other environments, such as when the library is used in a sandboxed or constrained environment, or when it is used as part of a larger system with security requirements, such safeguards can be critical.
Best practices for C library design typically include:
- Clearly defining the API contract, including function signatures, expected inputs and outputs, and behavior in edge cases.
- Avoiding exposing internal data structures or implementation details directly to the user, to maintain encapsulation and flexibility.
- Providing safe defaults and sensible error handling mechanisms to prevent unexpected behavior and facilitate debugging.
- Using appropriate memory management techniques, such as accepting caller-supplied buffers or dynamically allocating memory, and clearly specifying ownership and responsibility for freeing memory.
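One way to combine several of these points is a bounds-checked accessor that never hands out a pointer to internal storage. The sketch below is only an illustration: MyLib_GetValueCount, MyLib_GetValueAt, and the g_values table are hypothetical names, and the bool return convention is an assumption rather than anything from the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical internal data, invented for this sketch. */
static const int g_values[] = { 10, 20, 30, 40 };
#define VALUE_COUNT (sizeof(g_values) / sizeof(g_values[0]))

/* Returns how many values are available. */
size_t MyLib_GetValueCount(void)
{
    return VALUE_COUNT;
}

/* Contract: writes element `index` to *outValue and returns true.
 * Returns false and leaves *outValue untouched if `index` is out of
 * range or `outValue` is NULL. No internal pointer is ever exposed. */
bool MyLib_GetValueAt(size_t index, int *outValue)
{
    if (outValue == NULL || index >= VALUE_COUNT) {
        return false;
    }
    assert(index < VALUE_COUNT); /* invariant after the check above */
    *outValue = g_values[index];
    return true;
}
```

The trade-off is one function call per element, which may matter for large arrays; a copy-out function or the original pointer-returning design may be preferable where that overhead is unacceptable.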