Regex: capture the rest of the sentence after the last match

Question

Say I have a sentence where I would like to match (and eventually replace but that is not part of the question) certain markers based on what they look like:

The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}. The Quick brown fox {NUM10} jumped over the lazy dog.

I would like to replace the {NUM}, {#NUM} and {NUM#} markers, where # stands for a number, using logic that depends on that number, and then put everything back together as a single string.

I came up with

(.*?)({NUM}|{\d*NUM}|{NUM\d*})

which obtains the captures I want, but I can't figure out how to get the remaining part of the sentence after the last match, i.e. "jumped over the lazy dog."

I thought adding (.*?) to the end of the pattern might do it, but it does not. Here is what I have so far:

https://regex101.com/r/HpIcAa/1
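
The behaviour described above can be reproduced in a few lines. The sketch below is an illustration only, assuming .NET's System.Text.RegularExpressions and a brace-escaped version of the pattern; the names TrailingGroupDemo and ThirdGroup are hypothetical. A trailing lazy group (.*?) has nothing after it that forces it to grow, so it settles for the empty string, while a greedy (.*) goes the other way and swallows any later markers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingGroupDemo
{
    // Same three marker shapes as the pattern in the question, with braces escaped.
    private const string Marker = @"(\{NUM\}|\{\d*NUM\}|\{NUM\d*\})";

    // Hypothetical helper: appends `tail` to "(.*?)" + Marker and returns
    // whatever the third group captured in the FIRST match.
    public static string ThirdGroup(string input, string tail)
    {
        Match m = Regex.Match(input, "(.*?)" + Marker + tail);
        return m.Groups[3].Value;
    }

    public static void Main()
    {
        const string input = "fox {NUM} over {NUM10} dog.";

        // Lazy tail: matches as little as possible, i.e. nothing.
        Console.WriteLine("[" + ThirdGroup(input, "(.*?)") + "]"); // prints "[]"

        // Greedy tail: takes everything, including the second marker.
        Console.WriteLine("[" + ThirdGroup(input, "(.*)") + "]");  // prints "[ over {NUM10} dog.]"
    }
}
```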

Answer 1

Score: 1

There is no need to burden the garbage collector by splitting and concatenating strings. Regex.Replace handles this well: it is fast, avoids memory spikes, and accepts a function (a MatchEvaluator) as its replacement argument:

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        // n1 captures digits before NUM ({211NUM}); n2 captures digits after it ({NUM10}).
        var pattern = @"(?<num>\{((?<n1>\d+)?NUM|NUM(?<n2>\d+))\})";
        var regx = new Regex(pattern, RegexOptions.ExplicitCapture | RegexOptions.Multiline);
        var input =
            @"The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}.  The Quick brown fox {NUM10} jumped over the lazy dog. ";

        // Replace copies the text between and after the matches unchanged.
        var output = regx.Replace(input, ToReplaceMatch);
        Console.WriteLine(output);
    }

    private static string ToReplaceMatch(Match match)
    {
        if (match.Groups["num"].Success)
        {
            // Group.Value is never null (an unmatched group yields ""), so "??" would
            // never fall through to n2; test Success instead.
            var digits = match.Groups["n1"].Success
                ? match.Groups["n1"].Value
                : match.Groups["n2"].Value;
            var num = int.TryParse(digits, out var res) ? res : (int?)null;
            return ToReplaceNum(num);
        }

        return match.Value;
    }

    private static string ToReplaceNum(int? number)
    {
        // A bare {NUM} carries no digits and falls back to 0.
        return "FOOBAR" + (number ?? 0);
    }
}

Output:

The quick brown fox FOOBAR0 jumped over FOOBAR211 the lazy dog FOOBAR0.  The Quick brown fox FOOBAR10 jumped over the lazy dog.
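
If the text after the last marker is still wanted on its own, as the question asks, it can be recovered from match positions rather than from an extra capture group. The following is a minimal sketch under that assumption; TailDemo and TextAfterLastMatch are hypothetical names, and the pattern is a simplified stand-in for the marker patterns used in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TailDemo
{
    // Hypothetical helper: returns the part of `input` that follows the last
    // match of `pattern`, or the whole input when nothing matches.
    public static string TextAfterLastMatch(string input, string pattern)
    {
        int end = 0;
        foreach (Match m in Regex.Matches(input, pattern))
        {
            // Index + Length is the position just past the current match.
            end = m.Index + m.Length;
        }
        return input.Substring(end);
    }

    public static void Main()
    {
        // Matches {NUM}, {<digits>NUM} and {NUM<digits>}.
        var pattern = @"\{(\d*NUM|NUM\d*)\}";
        var input = "The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}. "
                  + "The Quick brown fox {NUM10} jumped over the lazy dog.";

        // prints "[ jumped over the lazy dog.]"
        Console.WriteLine("[" + TextAfterLastMatch(input, pattern) + "]");
    }
}
```

The same bookkeeping (remembering where the previous match ended) also yields the literal text between markers, which is what Regex.Replace copies through unchanged.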

Answer 2

Score: 1

You could match only the markers themselves with this regex pattern: (?<g1>{NUM})|(?<g2>{\d*NUM})|(?<g3>{NUM\d*})

Then use Regex.Replace to replace each marker according to your logic.

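using System.Text.RegularExpressions;

// g1 is listed first, so a bare {NUM} always lands in g1 even though g2 and g3 could also match it.
string pattern = @"(?<g1>{NUM})|(?<g2>{\d*NUM})|(?<g3>{NUM\d*})";
string input = "The quick brown fox {NUM} jumped over {211NUM} the lazy dog {NUM}.  The Quick brown fox {NUM10} jumped over the lazy dog. ";
// The delegate is called once per match; text outside the matches is left untouched.
string output = Regex.Replace(input, pattern, delegate(Match match)
{
    if (match.Groups["g1"].Success)
    {
        return "{Group 1}";
    }
    if (match.Groups["g2"].Success)
    {
        return "{Group 2}";
    }
    if (match.Groups["g3"].Success)
    {
        return "{Group 3}";
    }
    return match.Value;
});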

See demo here
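
The placeholder strings in the delegate can be swapped for logic that uses the digits themselves. The sketch below is one possible arrangement, not the answer's own code: it assumes escaped braces, requires at least one digit for the prefixed and suffixed forms, and uses hypothetical names (MarkerDemo, Describe, and the groups prefix and suffix).

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerDemo
{
    // {<digits>NUM} fills "prefix", {NUM<digits>} fills "suffix", a bare {NUM} fills neither.
    private static readonly Regex Marker =
        new Regex(@"\{(?:(?<prefix>\d+)NUM|NUM(?<suffix>\d+)|NUM)\}");

    // Hypothetical helper: replaces each marker with a label showing which
    // form matched and which number it carried.
    public static string Describe(string input)
    {
        return Marker.Replace(input, m =>
        {
            if (m.Groups["prefix"].Success)
            {
                return "[prefix " + m.Groups["prefix"].Value + "]";
            }
            if (m.Groups["suffix"].Success)
            {
                return "[suffix " + m.Groups["suffix"].Value + "]";
            }
            return "[plain]";
        });
    }

    public static void Main()
    {
        // prints "fox [plain] over [prefix 211] dog [suffix 10] end"
        Console.WriteLine(Describe("fox {NUM} over {211NUM} dog {NUM10} end"));
    }
}
```

Checking Group.Success rather than Group.Value keeps the three cases distinct, since an unmatched group reports an empty string rather than null.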

huangapple
  • Published on March 9, 2023 at 20:35:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75684706.html