Excluding certain string using regex in python

Question

I would like to apply a regex to the query below so that it removes whatever appears between a comma and the word 'AS', along with the 'AS' itself.

Select customer_name, customer_type, COUNT(*) AS volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

Expected output:

Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

I tried the following, but it did not give the desired output:

result = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", text)

Answer 1

Score: 2

You could use a capture group and use the group in the replacement.

(,\s*)[^,]*\sAS\b\s*

Explanation

  • (,\s*) Capture group 1: match a comma and optional whitespace chars
  • [^,]* Match zero or more chars other than a comma (this can include newlines)
  • \sAS\b\s* Match a whitespace char, then AS as a whole word, followed by optional whitespace chars

import re

pattern = r"(,\s*)[^,]*\sAS\b\s*"
s = ("Select customer_name, customer_type, COUNT(*) AS volume\\nFROM table\\nGROUP BY customer_name, customer_type\\nORDER BY volume DESC\\nLIMIT 10\n")

# Put capture group 1 (the comma and whitespace) back in the replacement
print(re.sub(pattern, r"\1", s))

Output

Select customer_name, customer_type, volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10
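For comparison, here is a minimal sketch that runs both the attempt from the question and the capture-group pattern on the same query, this time with real newlines. The helper name `drop_aliased_expression` is made up for illustration. The attempt from the question matches the alias as well (`\w+` at the end) and replaces the whole match with an empty string, which is why `volume` disappears along with `COUNT(*) AS`.

```python
import re

# Pattern from the answer above; group 1 holds the comma and the whitespace after it.
ALIAS_EXPR = re.compile(r"(,\s*)[^,]*\sAS\b\s*")


def drop_aliased_expression(sql: str) -> str:
    """Remove 'expression AS ' after a comma, keeping the comma and the alias.

    Hypothetical helper for illustration only.
    """
    # \1 re-inserts the captured comma and whitespace.
    return ALIAS_EXPR.sub(r"\1", sql)


query = (
    "Select customer_name, customer_type, COUNT(*) AS volume\n"
    "FROM table\n"
    "GROUP BY customer_name, customer_type\n"
    "ORDER BY volume DESC\n"
    "LIMIT 10"
)

# The attempt from the question: the match includes the alias, so it is removed too.
attempt = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", query)

cleaned = drop_aliased_expression(query)

print(attempt.splitlines()[0])
print(cleaned.splitlines()[0])
```

Keep in mind that `[^,]*` also matches newlines, so this is only a sketch for queries shaped like the one in the question. A later `AS` with no comma in between (for example a table alias such as `FROM table AS t`) or a comma inside a function call would likely produce an unwanted match.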

Answer 2

Score: 1

I would use:

import re

text = "Select customer_name, customer_type, COUNT(*) AS volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10"
result = re.sub(r',\s*\S+\s+AS\b\s*', ', ', text)
print(result)

This prints:

Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10

The regex pattern used here says to match:

  • , a comma
  • \s* optional whitespace
  • \S+ one or more non-whitespace characters (a single term with no spaces in it)
  • \s+ one or more whitespace characters
  • AS literal "AS"
  • \b word boundary
  • \s* more optional whitespace
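One practical difference between this pattern and the comma-excluding pattern `(,\s*)[^,]*\sAS\b\s*` from the other answer shows up when the aliased expression itself contains a space. The sketch below uses a made-up query (`COUNT(DISTINCT id) AS n`, not from the question) to illustrate it: `\S+` stops at the first space, so the `\S+` pattern finds no match and leaves the text unchanged, while `[^,]*` is free to run across spaces.

```python
import re

# Pattern from this answer: the expression must be a single run of non-whitespace.
by_term = r",\s*\S+\s+AS\b\s*"
# Pattern from the other answer: the expression is anything up to the next comma.
by_segment = r"(,\s*)[^,]*\sAS\b\s*"

# Illustrative query (an assumption, not taken from the question).
sql = "SELECT a, COUNT(DISTINCT id) AS n FROM t"

term_result = re.sub(by_term, ", ", sql)
segment_result = re.sub(by_segment, r"\1", sql)

print(term_result)
print(segment_result)
```

Which behaviour is preferable depends on the queries being processed. The stricter `\S+` version is less likely to run across clause boundaries, at the cost of missing expressions that contain spaces.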

huangapple
  • Posted on April 4, 2023 at 15:25:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75926569.html