Does std::string reserve allocate an extra byte for the null?

Question

This question is motivated by the following code fragment from https://stackoverflow.com/questions/3303527/how-to-pre-allocate-memory-for-a-stdstring-object

file.seekg(0, std::ios::end);
s4.resize(file.tellg());
file.seekg(0, std::ios::beg);

file.read(&s4[0], s4.length());

It seems to me that this is only guaranteed to be both correct (subsequent s4.data() will return a null-terminated string of length N, total size N+1 bytes) and efficient (s4.data() won't need to do an extra reallocate/copy) if resize() allocated an extra byte for the null in the first place.

Does it indeed do so?

Answer 1

Score: 4

The C++ Standard ([basic.string]) specifies that, for std::basic_string<charT>:

> data() + size() points at an object with value charT() (a “null terminator”),

Furthermore, the data must be contiguous. Therefore, yes. There is always a terminating character allocated after the string data, to make it a valid C string.

Answer 2

Score: 3

If you look at the source of basic_string in libstdc++, you can see that this is done in the _M_create function:

  // NB: Need an array of char_type[__capacity], plus a terminating
  // null char_type() element.
  return _S_allocate(_M_get_allocator(), __capacity + 1);

So for libstdc++, yes, this is done.

Whether this is required, however, depends entirely on how the underlying implementation handles memory. If, for example, the underlying OS already guaranteed that an access just past the allocated memory of a given pointer always yields a \0, then the extra element would not be needed.

A side note:

An implementation could (and this is AFAIK also done for the small string optimization part) use the memory which contains the information about the length of the string also for the null termination part.

So you could have something like this:

struct string {
   size_t capacity;
   char* data;
};

And data could look like this:

[number of bytes equal to capacity][bytes required for storing length]

The length is stored as max_capacity - length_of_string, so when the string reaches maximum capacity the bytes storing the length are 0 and can serve as the null terminator. No additional byte then needs to be allocated for the \0 (only the ones for the length).

Answer 3

Score: 2

The terminator does not count as an element of a std::string.

You could do the same by initializing the string with the right size right away:

// file is assumed to be positioned at the end, so tellg() gives its size
size_t size = file.tellg();
std::string str(size, '\0');
file.seekg(0);
file.read(&str[0], size);

https://en.cppreference.com/w/cpp/io/basic_istream/read

huangapple
  • Posted on June 22, 2023, 15:18:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76529418.html