StringBuilder does not create a new instance if the initial capacity is set too small, but a book says otherwise

Question

    StringBuilder sb = new StringBuilder("test", 4);
    sb.Append('\n');
    sb.AppendLine("test1");
    sb.AppendLine("test2");
    sb.AppendLine("test3");
    sb.AppendLine("test4");

Looking at the IL code, there is only one newobj instruction, but I thought there should be more instances of the StringBuilder class, since it is supposed to increase its capacity by creating a new object. Or did I get it wrong?

    // [3 1 - 3 49]
    IL_0000: ldstr        "test"
    IL_0005: ldc.i4.4
    IL_0006: newobj       instance void [System.Runtime]System.Text.StringBuilder::.ctor(string, int32)
    IL_000b: stloc.0      // sb

    // [4 1 - 4 17]
    IL_000c: ldloc.0      // sb
    IL_000d: ldc.i4.s     10 // 0x0a
    IL_000f: callvirt     instance class [System.Runtime]System.Text.StringBuilder [System.Runtime]System.Text.StringBuilder::Append(char)
    IL_0014: pop

    // [5 1 - 5 24]
    IL_0015: ldloc.0      // sb
    IL_0016: ldstr        "test1"
    IL_001b: callvirt     instance class [System.Runtime]System.Text.StringBuilder [System.Runtime]System.Text.StringBuilder::AppendLine(string)
    IL_0020: pop

    // [6 1 - 6 24]
    IL_0021: ldloc.0      // sb
    IL_0022: ldstr        "test2"
    IL_0027: callvirt     instance class [System.Runtime]System.Text.StringBuilder [System.Runtime]System.Text.StringBuilder::AppendLine(string)
    IL_002c: pop

    // [7 1 - 7 24]
    IL_002d: ldloc.0      // sb
    IL_002e: ldstr        "test3"
    IL_0033: callvirt     instance class [System.Runtime]System.Text.StringBuilder [System.Runtime]System.Text.StringBuilder::AppendLine(string)
    IL_0038: pop

    // [8 1 - 8 24]
    IL_0039: ldloc.0      // sb
    IL_003a: ldstr        "test4"
    IL_003f: callvirt     instance class [System.Runtime]System.Text.StringBuilder [System.Runtime]System.Text.StringBuilder::AppendLine(string)
    IL_0044: pop
    IL_0045: ret

From Troelsen's book on C#, a paragraph translated by me: "If you add more characters than the specified limit, the StringBuilder object will copy its data to a new instance and increase the buffer size by the specified limit."

I asked a friend for the original quote. Here it is, from page 89 of "Pro C# 10 with .NET 6":

> If you append more characters than the specified limit, the
> StringBuilder object will copy its data into a new instance and grow
> the buffer by the specified limit.

Answer 1

Score: 5

Looking at your own C# code's IL will not tell you anything about what StringBuilder does internally, which is what the book is talking about. You would need to look at StringBuilder's source code, or its IL to see that.
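One way to convince yourself that the caller never sees a different object is to compare references. The following is a minimal sketch; the class and method names (SameInstanceDemo, AppendsReturnSameInstance) are invented for illustration. It relies only on the fact that Append and AppendLine return the builder they were called on.

```csharp
using System;
using System.Text;

public static class SameInstanceDemo
{
    // Returns true if each append call handed back the very object it was called on.
    public static bool AppendsReturnSameInstance()
    {
        var sb = new StringBuilder("test", 4);
        StringBuilder original = sb;

        // Both calls exceed the initial capacity of 4, so internal growth must happen here.
        StringBuilder afterChar = sb.Append('\n');
        StringBuilder afterLine = sb.AppendLine("test1");

        return ReferenceEquals(original, afterChar)
            && ReferenceEquals(original, afterLine);
    }

    public static void Main()
    {
        Console.WriteLine(AppendsReturnSameInstance()); // prints "True"
    }
}
```

Whatever allocation happens inside the class, the variable sb in the calling code keeps pointing at the same object, which is why the caller's IL contains a single newobj.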

The documentation of StringBuilder only mentions "new buffers" or "allocating new memory", and never specifically "new instances of StringBuilder".

> New data is appended to the buffer if room is available; otherwise, a new, larger buffer is allocated, data from the original buffer is copied to the new buffer, and the new data is then appended to the new buffer.

> If the number of added characters causes the length of the StringBuilder object to exceed its current capacity, new memory is allocated, the value of the Capacity property is doubled, new characters are added to the StringBuilder object, and its Length property is adjusted.
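To watch Length and Capacity change for the code in the question, a small sketch like the one below could record both after every append. The names CapacityDemo and Trace are made up for this example, and the exact capacities depend on the runtime version, so no expected output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CapacityDemo
{
    // Records (Length, Capacity) after construction and after each append.
    public static List<(int Length, int Capacity)> Trace()
    {
        var steps = new List<(int Length, int Capacity)>();
        var sb = new StringBuilder("test", 4);
        steps.Add((sb.Length, sb.Capacity));

        sb.Append('\n');
        steps.Add((sb.Length, sb.Capacity));

        for (int i = 1; i <= 4; i++)
        {
            sb.AppendLine("test" + i);
            steps.Add((sb.Length, sb.Capacity));
        }

        return steps;
    }

    public static void Main()
    {
        foreach (var (length, capacity) in Trace())
            Console.WriteLine($"Length={length}, Capacity={capacity}");
    }
}
```

The only thing this sketch relies on is that Capacity never falls below Length; how much it grows at each step is left to the implementation.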

So whether or not it creates new StringBuilder instances is an implementation detail. Looking at source.dot.net, it does indeed do that in ExpandByABlock.

    // Allocate the array before updating any state to avoid leaving inconsistent state behind in case of out of memory exception
    char[] chunkChars = GC.AllocateUninitializedArray<char>(newBlockLength);

    // Move all of the data from this chunk to a new one, via a few O(1) reference adjustments.
    // Then, have this chunk point to the new one as its predecessor.
    m_ChunkPrevious = new StringBuilder(this);
    m_ChunkOffset += m_ChunkLength;
    m_ChunkLength = 0;

    m_ChunkChars = chunkChars;

In this implementation, StringBuilder is a linked list whose nodes are StringBuilder instances. The code above sets the previous node to a copy of this and turns this into an "empty" node with a capacity of newBlockLength.
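The chunks can also be observed from the outside without reading the source. The sketch below assumes .NET Core 3.0 or later, where StringBuilder.GetChunks() is available; the helper names (ChunkDemo, BuildSample, InspectChunks) are hypothetical. How many chunks you see is an implementation detail, so the count is printed rather than promised.

```csharp
using System;
using System.Text;

public static class ChunkDemo
{
    // Builds the same content as the code in the question.
    public static StringBuilder BuildSample()
    {
        var sb = new StringBuilder("test", 4);
        sb.Append('\n');
        sb.AppendLine("test1");
        sb.AppendLine("test2");
        sb.AppendLine("test3");
        sb.AppendLine("test4");
        return sb;
    }

    // Walks the chunks exposed by GetChunks(), counting them and summing their lengths.
    public static (int ChunkCount, int TotalChars) InspectChunks(StringBuilder sb)
    {
        int chunkCount = 0;
        int totalChars = 0;
        foreach (ReadOnlyMemory<char> chunk in sb.GetChunks())
        {
            chunkCount++;
            totalChars += chunk.Length;
        }
        return (chunkCount, totalChars);
    }

    public static void Main()
    {
        StringBuilder sb = BuildSample();
        var (chunkCount, totalChars) = InspectChunks(sb);
        Console.WriteLine($"Chunks: {chunkCount}, chars in chunks: {totalChars}, Length: {sb.Length}");
    }
}
```

With an initial capacity of 4 you would likely see more than one chunk on current runtimes, each backed by one of the internal nodes described above, while the calling code still holds a single StringBuilder reference.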

Posted by huangapple on 2023-03-07 08:59:22
Link: https://go.coder-hub.com/75657171.html