How to trim a string without additional memory allocations in C#?

Question

I have a string input and want to remove the last character from it. I can do it this way:

static void Main(string[] args)
{
    string input = "Hello World-";
    string result = input.TrimEnd('-');
    Console.WriteLine(result); // Hello World
}

This works, but TrimEnd() calls System.String.FastAllocateString() internally. That makes sense: string is an immutable data structure, so in the general case nothing else is possible. However, in my case I don't need input anymore, so I would like to reuse its internal buffer rather than allocate an additional one and leave the old buffer for the GC to clean up eventually. That could reduce overall allocations and GC work.
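To make the allocation concern concrete, here is a minimal sketch that measures how many bytes a call allocates on the current thread. The AllocationProbe class and MeasureAllocatedBytes helper are hypothetical names introduced for illustration. The sketch assumes a runtime where GC.GetAllocatedBytesForCurrentThread() is available (.NET Core 3.0 or later, or .NET Framework 4.8). The exact byte count depends on the runtime and platform, so none is shown.

```csharp
using System;

static class AllocationProbe
{
    // Hypothetical helper: returns the number of bytes allocated on the
    // current thread while the given action runs.
    public static long MeasureAllocatedBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        string input = "Hello World-";
        string result = "";

        // The delegate and its closure are created before the measurement
        // starts, so only the work done inside TrimEnd is counted.
        long bytes = MeasureAllocatedBytes(() => result = input.TrimEnd('-'));

        Console.WriteLine($"{result}: {bytes} bytes allocated by TrimEnd");
    }
}
```

A non-zero value here would correspond to the new string that TrimEnd() has to create whenever it actually removes something.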

Answer 1

Score: 2

Console.WriteLine does not accept a ReadOnlySpan<char> as input, but Console.Out, which is a TextWriter, does expose such an overload. Therefore, the way to avoid the allocation is:

ReadOnlySpan<char> input = "Hello World-";
Console.Out.WriteLine(input.TrimEnd('-'));

// or

// Uses an indexer to always remove the last character:
Console.Out.WriteLine(input[..^1]);

If you need a string as the result and have to apply several string operations to the input, using a ReadOnlySpan<char> saves the repeated allocations for intermediate results, but in the end you still have to allocate a new string, and the set of available operations is very limited. Unless you are willing to use unsafe code or take a detour via COM code, there is no way around this rule.
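As a rough sketch of that idea, the following chains two trimming steps on a span and allocates only once at the end. The SpanTrimSketch class and Normalize helper are hypothetical names used for illustration, not part of any library. The early return when nothing was trimmed is an extra design choice added here.

```csharp
using System;

static class SpanTrimSketch
{
    // Hypothetical helper: strips surrounding whitespace, then any trailing
    // '-' characters, and materializes a string only once at the end.
    public static string Normalize(string input)
    {
        ReadOnlySpan<char> span = input.AsSpan();

        span = span.Trim();        // re-slices the view, no new string
        span = span.TrimEnd('-');  // re-slices again, still no new string

        // If nothing was removed, the span covers the whole input,
        // so the original instance can be returned without allocating.
        return span.Length == input.Length ? input : span.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Normalize("  Hello World-  ")); // Hello World
    }
}
```

With plain string methods, each step that removes something would produce its own intermediate string. Here only the final span.ToString() allocates.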

Alternatively, you can use a StringBuilder for string manipulation. But it allocates a char[] buffer, and ToString calls FastAllocateString as well. See the Reference Source for StringBuilder.
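For illustration, trimming inside a StringBuilder might look like the sketch below. BuilderTrimSketch and its TrimEnd helper are hypothetical names, since StringBuilder has no built-in TrimEnd. Shrinking Length edits the builder itself, but the builder's own buffer and the final ToString() call are still allocations, as noted above.

```csharp
using System;
using System.Text;

static class BuilderTrimSketch
{
    // Hypothetical helper: removes trailing occurrences of trimChar
    // from the builder by shrinking its Length.
    public static StringBuilder TrimEnd(StringBuilder sb, char trimChar)
    {
        int length = sb.Length;
        while (length > 0 && sb[length - 1] == trimChar)
        {
            length--;
        }

        sb.Length = length; // drops the trailing characters from the builder
        return sb;
    }

    static void Main()
    {
        var sb = new StringBuilder("Hello World-"); // allocates the builder and its buffer
        TrimEnd(sb, '-');

        string result = sb.ToString();              // allocates the final string
        Console.WriteLine(result);                  // Hello World
    }
}
```

This tends to pay off only when many edits are applied to the same builder. For a single trim it allocates more than input.TrimEnd('-') does.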

Answer 2

Score: -1

This is what the ReadOnlySpan struct and the System.Memory package are for:

ReadOnlySpan<char> input = "Hello World-";
string result = new string(input.TrimEnd('-'));
Console.WriteLine(result);

huangapple
  • Published on 2023-02-26 21:13:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75572222.html