Regex in Google Sheets: match back to the previous period or semicolon

Question

I want the formula to return only the sentence that ends with the double question mark.

I'm using the formula =trim(regexextract(H4,"[\.;](.*?)\?\?"))

ex 1:

blah blah. 1 The quick brown fox jumps over the lazy dog?? blah blah

I get the right result: 1 The quick brown fox jumps over the lazy dog

but ex 2:

blah blah. 1 The quick brown fox jumps over the lazy dog. blah blah ; 2 The quick brown fox jumps over the lazy dog?? blah blah, blah blah?

I get

1 The quick brown fox jumps over the lazy dog. blah blah ; 2 The quick brown fox jumps over the lazy dog

I only want

2 The quick brown fox jumps over the lazy dog

that is, only the sentence with the double question mark.

I'm not sure how to look back from the question marks to the nearest preceding period or semicolon, since RE2 doesn't support lookbehind.

Here is a sample sheet: https://docs.google.com/spreadsheets/d/1mPFiWZ4WCOSPoae-ga3TaJOFlIfPIp7ALf-4ht6xpFo/edit#gid=2100307022
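
For anyone who wants to reproduce the behaviour outside of Sheets, here is a minimal Python sketch. It assumes Python's re module is a fair stand-in for RE2 for this particular pattern, and the helper name extract is made up for illustration. The point it shows: a regex match always begins at the leftmost position where a match is possible, so the lazy (.*?) only decides where the match ends, not where it starts. In the second example the first period is a valid starting point, so the capture runs from there all the way to the ??.

```python
import re

# Pattern from the formula above. Python's re is used as a stand-in for RE2
# (an assumption; Sheets itself is not involved here).
PATTERN = re.compile(r"[\.;](.*?)\?\?")

EX1 = "blah blah. 1 The quick brown fox jumps over the lazy dog?? blah blah"
EX2 = (
    "blah blah. 1 The quick brown fox jumps over the lazy dog. blah blah ; "
    "2 The quick brown fox jumps over the lazy dog?? blah blah, blah blah?"
)


def extract(text: str) -> str:
    """Roughly mimic =trim(regexextract(text, pattern)).

    Returns capture group 1 of the first match with surrounding whitespace
    removed, or an empty string when nothing matches.
    """
    match = PATTERN.search(text)
    return match.group(1).strip() if match else ""


if __name__ == "__main__":
    print(extract(EX1))  # the single numbered sentence
    print(extract(EX2))  # starts at the FIRST period, so both sentences come back
```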

Answer 1

Score: 1

To start at the last digit and stop at the first non-alphanumeric, non-whitespace character, use .*(\d[\w\s]+), like this:

=arrayformula( iferror( regexextract(B5:B, ".*(\d[\w\s]+)") ) )

This formula exactly matches the desired results you present in the sample spreadsheet.
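
A quick way to sanity-check this pattern outside of Sheets is sketched below in Python, with the re module standing in for RE2 (an assumption; the helper name extract is also made up for illustration). The greedy .* first runs to the end of the text and then backs off until a digit can start the capture, which is why the last digit wins. Note that the pattern keys on that last digit rather than on the ??, so it relies on the wanted sentence starting with a number and on no digit appearing later in the text, as is the case in both examples from the question.

```python
import re

# Pattern from the answer above. re.ASCII keeps \d, \w and \s ASCII-only,
# which is closer to how RE2 treats them (Python's re is only a stand-in).
PATTERN = re.compile(r".*(\d[\w\s]+)", re.ASCII)

EX1 = "blah blah. 1 The quick brown fox jumps over the lazy dog?? blah blah"
EX2 = (
    "blah blah. 1 The quick brown fox jumps over the lazy dog. blah blah ; "
    "2 The quick brown fox jumps over the lazy dog?? blah blah, blah blah?"
)


def extract(text: str) -> str:
    """Roughly mimic iferror(regexextract(text, pattern)).

    Returns capture group 1 of the first match, or an empty string when
    nothing matches (where Sheets would raise an error that IFERROR hides).
    """
    match = PATTERN.search(text)
    return match.group(1) if match else ""


if __name__ == "__main__":
    for sample in (EX1, EX2):
        print(extract(sample))
```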

Answer 2

Score: 0

You might use:

[.;]\s+(.*?)[^\w\s]

Explanation

  • [.;] Match either . or ;
  • \s+ Match 1+ whitespace chars
  • (.*?) Capture group 1, match any char as few as possible
  • [^\w\s] Match a single char other than a word char or whitespace char

See the capture group matches at the regex101 demo.

Example code:

=trim(regexextract(H4,"[.;]\s+(.*?)[^\w\s]"))
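
One thing worth checking before relying on this pattern: on the second example from the question it appears to stop at the first sentence, because the first period is already a valid starting point and the lazy group ends at the next punctuation mark. The Python sketch below reproduces that, with the re module standing in for RE2 (an assumption). It also tries a variant that is not part of this answer, [.;]\s*([^.;]*?)\?\?, which keeps periods and semicolons out of the capture so that a match can only start at the delimiter nearest to the ??. The names ANSWER_2, VARIANT and extract are made up for illustration.

```python
import re

# re.ASCII keeps \w and \s ASCII-only, closer to RE2 (Python is a stand-in).
ANSWER_2 = re.compile(r"[.;]\s+(.*?)[^\w\s]", re.ASCII)

# Hypothetical variant, not from the answer: the capture may not contain
# '.' or ';', and the match must end at the double question mark.
VARIANT = re.compile(r"[.;]\s*([^.;]*?)\?\?", re.ASCII)

EX1 = "blah blah. 1 The quick brown fox jumps over the lazy dog?? blah blah"
EX2 = (
    "blah blah. 1 The quick brown fox jumps over the lazy dog. blah blah ; "
    "2 The quick brown fox jumps over the lazy dog?? blah blah, blah blah?"
)


def extract(pattern: re.Pattern, text: str) -> str:
    """Roughly mimic =trim(regexextract(text, pattern)).

    Returns capture group 1 of the first match with surrounding whitespace
    removed, or an empty string when nothing matches.
    """
    match = pattern.search(text)
    return match.group(1).strip() if match else ""


if __name__ == "__main__":
    for name, pattern in (("answer 2", ANSWER_2), ("variant", VARIANT)):
        for sample in (EX1, EX2):
            print(name, "->", extract(pattern, sample))
```

If the variant behaves the same way under RE2, the corresponding formula would presumably be =trim(regexextract(H4,"[.;]\s*([^.;]*?)\?\?")), but that has not been tested in Sheets here. It also assumes the wanted sentence is always preceded by a period or semicolon.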

huangapple
  • Posted on January 6, 2023, 13:18:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75027188.html