April 4, 2023 15:25:31

Excluding a certain string using regex in Python

Question

I would like to apply a regex to the query below so that any text between a comma and the word 'AS' (including 'AS' itself) is removed.

Select customer_name, customer_type, COUNT(*) AS volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

Expected output:

Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

I tried the following, but it did not give the desired output:

result = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", text)
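
For context, here is a minimal sketch of what the attempted pattern does on the sample query. The variable name `attempt` is made up for illustration. Because the pattern also consumes `\w+` after `AS`, the alias is deleted together with the expression, which would explain the mismatch with the expected output.

```python
import re

# Sample query from the question
text = (
    "Select customer_name, customer_type, COUNT(*) AS volume\n"
    "FROM table\n"
    "GROUP BY customer_name, customer_type\n"
    "ORDER BY volume DESC\n"
    "LIMIT 10"
)

# The attempted substitution: the match covers ", COUNT(*) AS volume",
# so the alias "volume" disappears from the SELECT list as well.
attempt = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", text)

print(attempt.splitlines()[0])  # Select customer_name, customer_type
```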

Answer 1

Score: 2

You could use a capture group and use the group in the replacement.

(,\s*)[^,]*\sAS\b\s*

Explanation

(,\s*) Capture group 1: match a comma and optional whitespace characters
[^,]* Match zero or more characters other than a comma
\sAS\b\s* Match a whitespace character, then AS, a word boundary, and optional whitespace

import re
pattern = r"(,\s*)[^,]*\sAS\b\s*"
s = ("Select customer_name, customer_type, COUNT(*) AS volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10\n")
# \1 puts back the comma and whitespace captured by group 1
print(re.sub(pattern, r"\1", s))

Output

Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10
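
The same substitution can be wrapped in a small helper, sketched below under a few assumptions: the function name `keep_alias_only`, the sample query, and the `re.IGNORECASE` flag (to accept a lowercase `as`) are illustrative and not part of the answer. `\g<1>` is an equivalent way to write the `\1` backreference.

```python
import re

# Pattern from the answer above; IGNORECASE is an added assumption
# so that a lowercase "as" is matched too.
ALIAS_EXPR = re.compile(r"(,\s*)[^,]*\sAS\b\s*", re.IGNORECASE)


def keep_alias_only(sql: str) -> str:
    """Replace ', <expression> AS alias' with ', alias' (hypothetical helper)."""
    # \g<1> re-inserts the comma and whitespace captured by group 1.
    return ALIAS_EXPR.sub(r"\g<1>", sql)


# Hypothetical query with two aliased expressions
query = "SELECT a, COUNT(*) AS n, SUM(x) as total\nFROM t"
print(keep_alias_only(query))
```

Since `[^,]*` cannot cross a comma, an expression that itself contains a comma, such as `COALESCE(a, b) AS c`, is not handled correctly by this pattern.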

Answer 2

Score: 1

I would use:

import re

text = "Select customer_name, customer_type, COUNT(*) AS volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10"
result = re.sub(r',\s*\S+\s+AS\b\s*', ', ', text)
print(result)

This prints:

Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

The regex pattern used here says to match:

, a comma
\s* optional whitespace
\S+ one or more non-whitespace characters
\s+ one or more whitespace characters
AS the literal "AS"
\b a word boundary
\s* more optional whitespace
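
One thing worth checking is that `\S+` only covers expressions without internal whitespace. The sketch below uses a made-up query and made-up variable names to compare the pattern above with a variant that swaps `\S+` for `[^,]*`. It is an illustration of the difference, not a general SQL rewriter.

```python
import re

# Hypothetical query whose aliased expression contains a space
query = "SELECT a, COUNT(DISTINCT b) AS n FROM t"

non_space = r",\s*\S+\s+AS\b\s*"   # pattern from this answer
non_comma = r",\s*[^,]*\sAS\b\s*"  # variant: any run of non-comma characters

# \S+ stops at the space inside COUNT(DISTINCT b), so nothing matches
# and the query comes back unchanged.
unchanged = re.sub(non_space, ", ", query)

# [^,]* can span the space, so the expression and AS are removed.
rewritten = re.sub(non_comma, ", ", query)

print(unchanged)  # SELECT a, COUNT(DISTINCT b) AS n FROM t
print(rewritten)  # SELECT a, n FROM t
```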
