Regex: capture the rest of a sentence after the last match

Question

Say I have a sentence in which I would like to match certain markers based on what they look like (and eventually replace them, but that is not part of the question):

The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}. The Quick brown fox {NUM10} jumped over the lazy dog.

I would like to replace the {NUM}, {#NUM} and {NUM#} parts based on some logic and on the number that stands in for the #, and then put everything back together as a single string.

I came up with

(.*?)({NUM}|{\d*NUM}|{NUM\d*})

which obtains the captures I want, but I can't figure out how to get the REMAINING part of the sentence after the last match, i.e. " jumped over the lazy dog."

I thought adding (.*?) to the end might do it, but it does not. Here is what I have working:

https://regex101.com/r/HpIcAa/1
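
A note on why the trailing (.*?) does not help: a lazy group with nothing after it is satisfied by matching zero characters, so it always captures an empty string. One possible way to get the tail, sketched below, is to match only the markers and take everything after the end of the last match. The class and method names (TailDemo, TextAfterLastMarker) are made up for illustration, and the pattern is a compacted form of the one in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TailDemo
{
    // Matches {NUM}, {<digits>NUM} and {NUM<digits>}; only the markers, not the text between them.
    private static readonly Regex Marker = new Regex(@"\{(?:\d*NUM|NUM\d*)\}");

    // Hypothetical helper: returns the text after the last marker,
    // or the whole input if it contains no marker.
    public static string TextAfterLastMarker(string input)
    {
        int end = 0;
        foreach (Match m in Marker.Matches(input))
        {
            // Remember where the most recent marker ends.
            end = m.Index + m.Length;
        }
        return input.Substring(end);
    }

    public static void Main()
    {
        var input = "The quick brown fox {NUM10} jumped over the lazy dog.";
        // Brackets make the leading space of the tail visible.
        Console.WriteLine("[" + TextAfterLastMarker(input) + "]");
    }
}
```

The same Index and Length values can also be used to recover the text between two consecutive markers, if that is needed.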

Answer 1

Score: 1

There is no need to burden the garbage collector by splitting and concatenating strings. Regex.Replace works well here: it is fast, avoids memory spikes, and accepts a function (a MatchEvaluator) as the replacement argument:

    using System;
    using System.Text.RegularExpressions;

    public class Program
    {
        public static void Main()
        {
            // n1 captures digits before NUM, n2 captures digits after NUM.
            var pattern = @"(?<num>\{((?<n1>\d+)?NUM|NUM(?<n2>\d+))\})";
            var regx = new Regex(pattern, RegexOptions.ExplicitCapture | RegexOptions.Multiline);
            var input =
                @"The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}. The Quick brown fox {NUM10} jumped over the lazy dog. ";
            var output = regx.Replace(input, ToReplaceMatch);
            Console.WriteLine(output);
        }

        private static string ToReplaceMatch(Match match)
        {
            if (match.Groups["num"].Success)
            {
                // Group.Value is never null (it is "" for an unmatched group),
                // so pick the group by Success rather than with ??.
                var n1 = match.Groups["n1"];
                var digits = n1.Success ? n1.Value : match.Groups["n2"].Value;
                var num = int.TryParse(digits, out var res)
                    ? res
                    : (int?)null;
                return ToReplaceNum(num);
            }
            return match.Value;
        }

        private static string ToReplaceNum(int? number)
        {
            // Placeholder logic: markers without a number fall back to 0.
            return "FOOBAR" + (number ?? 0);
        }
    }

Output:

    The quick brown fox FOOBAR0 jumped over FOOBAR211 the lazy dog FOOBAR0. The Quick brown fox FOOBAR10 jumped over the lazy dog.

Answer 2

Score: 1

You could match only the markers with this regex pattern: (?<g1>{NUM})|(?<g2>{\d*NUM})|(?<g3>{NUM\d*})

Then use Regex.Replace to replace each marker according to your logic. The text between the markers, including everything after the last one, is left untouched.

    // Requires: using System.Text.RegularExpressions;
    string pattern = @"(?<g1>{NUM})|(?<g2>{\d*NUM})|(?<g3>{NUM\d*})";
    string input = "The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}. The Quick brown fox {NUM10} jumped over the lazy dog. ";
    string output = Regex.Replace(input, pattern, delegate(Match match)
    {
        // g1: plain {NUM}
        if (match.Groups["g1"].Success)
        {
            return "{Group 1}";
        }
        // g2: digits before NUM, e.g. {211NUM}
        if (match.Groups["g2"].Success)
        {
            return "{Group 2}";
        }
        // g3: digits after NUM, e.g. {NUM10}
        if (match.Groups["g3"].Success)
        {
            return "{Group 3}";
        }
        return match.Value;
    });

See demo here
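
If the individual pieces are wanted rather than only the final string, one alternative to Regex.Replace is Regex.Split. When the pattern contains a capturing group, .NET includes the captured text in the result array, so the markers and the text around them (including the tail after the last marker) come back in order. This is only a sketch: SplitDemo and SplitOnMarkers are hypothetical names, and the pattern is a compacted form of the one in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits the input into plain text and markers, in order.
    public static string[] SplitOnMarkers(string input)
    {
        // The outer capturing parentheses make Regex.Split keep the markers in the result.
        return Regex.Split(input, @"(\{(?:\d*NUM|NUM\d*)\})");
    }

    public static void Main()
    {
        var parts = SplitOnMarkers("fox {NUM} jumped over {211NUM} the lazy dog.");
        foreach (var part in parts)
        {
            // Brackets make leading and trailing spaces visible.
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

For that input the array holds five entries: "fox ", "{NUM}", " jumped over ", "{211NUM}" and " the lazy dog.". Each marker entry can then be transformed and the array joined back with string.Concat, at the cost of the extra allocations that the Replace approach avoids.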

huangapple
  • Posted on March 9, 2023, 20:35:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75684706.html