Excluding certain strings using regex in Python
Question
I would like to apply a regex to the code below so that any string that appears between a comma and the word 'AS' is removed:
Select customer_name, customer_type, COUNT(*) AS volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10
Expected output:
Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10
I tried the following, but it did not give the desired output:
result = re.sub(r",\s*COUNT\(\*\)\s*AS\s*\w+", "", text)
Answer 1
Score: 2
You could use a capture group and reference that group in the replacement.
(,\s*)[^,]*\sAS\b\s*
Explanation
- (,\s*): capture group 1, matching a comma and optional whitespace characters
- [^,]*: match any characters except a comma
- \sAS\b\s*: match a whitespace character, then AS followed by a word boundary and optional whitespace
import re
pattern = r"(,\s*)[^,]*\sAS\b\s*"
s = ("Select customer_name, customer_type, COUNT(*) AS volume\n"
     "FROM table\n"
     "GROUP BY customer_name, customer_type\n"
     "ORDER BY volume DESC\n"
     "LIMIT 10")
# Replace each match with group 1 so the comma and its spacing are kept
print(re.sub(pattern, r"\1", s))
Output
Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10
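To make the role of the capture group concrete, here is a minimal sketch that runs the same pattern with and without the group in the replacement. The shortened query and the variable names are only for illustration.

```python
import re

# Pattern from the answer above: group 1 holds the comma and its spacing
pattern = re.compile(r"(,\s*)[^,]*\sAS\b\s*")

# Shortened, illustrative query
sql = "Select customer_name, customer_type, COUNT(*) AS volume\nFROM table"

# Putting group 1 back keeps the ", " separator in front of the alias
with_group = pattern.sub(r"\1", sql)

# An empty replacement also removes the separator
without_group = pattern.sub("", sql)

print(with_group)
print(without_group)
```

With the empty replacement, `customer_type` and `volume` run together because the comma is part of the match, which is why the replacement should be `\1`.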
Answer 2
Score: 1
I would use:
import re
text = "Select customer_name, customer_type, COUNT(*) AS volume\nFROM table\nGROUP BY customer_name, customer_type\nORDER BY volume DESC\nLIMIT 10"
# Replace ", <term> AS " with ", " so only the alias remains
result = re.sub(r',\s*\S+\s+AS\b\s*', ', ', text)
print(result)
This prints:
Select customer_name, customer_type, volume
FROM table
GROUP BY customer_name, customer_type
ORDER BY volume DESC
LIMIT 10
The regex pattern used here says to match:
- ,: a comma
- \s*: optional whitespace
- \S+: a run of non-whitespace characters
- \s+: one or more whitespace characters
- AS: literal "AS"
- \b: word boundary
- \s*: more optional whitespace
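The two patterns differ in what they accept between the comma and AS: `\S+` needs a single run of non-whitespace characters, while `[^,]*` accepts anything without a comma. The sketch below compares them on a hypothetical query whose aliased expression contains spaces. The query and variable names are assumptions for illustration and do not come from the question.

```python
import re

# The two patterns from the answers on this page
any_expression = re.compile(r"(,\s*)[^,]*\sAS\b\s*")  # replaced by group 1
single_term = re.compile(r",\s*\S+\s+AS\b\s*")        # replaced by ", "

# Hypothetical query: the aliased expression contains spaces
sql = "Select customer_name, SUM(price * qty) AS total\nFROM table"

result_any = any_expression.sub(r"\1", sql)
result_single = single_term.sub(", ", sql)

print(result_any)
print(result_single)
```

Here the `[^,]*` version reduces the column to `total`, while the `\S+` version leaves the query unchanged because of the spaces inside `SUM(price * qty)`. Neither pattern handles an expression that itself contains a comma, such as `COALESCE(a, 0) AS x`. For queries like that, a SQL parser is probably a safer choice than a regex.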