How can I use a regex to remove everything between two delimiters from each string in a list?
Question
I wrote a function that searches a file for a given text and returns all the lines where that text appears (in new_list). I would like to delete everything between '/*' and '*/' (including the delimiters themselves), since these are comments and are not needed in the returned list. Right now it returns something like:
new_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n', ...]
and the goal is:
new_list = ['123456_xxxx', '67890_yyyy', ...]
I tried using the re library with the regex (?<=/*).*?(?=*/):
new_list = re.sub('(?<=/*).*?(?=*/)', '', str(new_list))
but this gives an error:
error: nothing to repeat at position 13
PS: I also tried replace, but it only deletes the delimiter characters and not everything between them:
new_list = [s.replace(' /*', ' ') for s in new_list]
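The error is consistent with the unescaped * in (?=*/): at that point there is nothing before the * for it to repeat. The following is only a sketch of that diagnosis, not part of the original post; the names broken, escaped, and result are illustrative. It also suggests that escaping alone would not reach the stated goal, because lookarounds leave the delimiters in place.

```python
import re

# The pattern from the question, unchanged: the * after (?= has nothing to repeat.
broken = "(?<=/*).*?(?=*/)"
try:
    re.compile(broken)
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("re.error:", exc)

# Escaping the asterisks makes the pattern valid, but the lookbehind and
# lookahead only assert the delimiters; they are not part of the match.
escaped = re.compile(r"(?<=/\*).*?(?=\*/)")
result = escaped.sub("", "123456_xxxx /* cccccccccccccc */ /\n")
print(repr(result))  # the /* and */ themselves are still there
```

To delete the delimiters as well, they need to be matched directly rather than asserted.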
Answer 1
Score: 1
import re
old_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
new_list = []
# Raw string: only * needs escaping; / is an ordinary character in Python regexes
pattern = r'/\*.*\*/'
for s in old_list:
    new_list.append(re.sub(pattern, '', s))
print(new_list)
Output:
['123456_xxxx  /\n', '67890_yyyy  /\n']
If you don't want the trailing space, slash, and \n, change the pattern:
pattern = r' /\*.*\*/ /\n'
Output:
['123456_xxxx', '67890_yyyy']
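One thing worth checking with the pattern above is that .* is greedy. With a single comment per line that makes no difference, but if a line could ever hold two comments, the greedy form would also remove the text between them. A small sketch of the difference, using a made-up input line (line, greedy, and lazy are illustrative names):

```python
import re

# Hypothetical line with two comments; not taken from the question's data.
line = "a /* one */ b /* two */ c"

# Greedy: .* runs to the LAST */ on the line, so " b " is lost too.
greedy = re.sub(r"/\*.*\*/", "", line)

# Lazy: .*? stops at the FIRST */ after each /*, so only comments go.
lazy = re.sub(r"/\*.*?\*/", "", line)

print(repr(greedy))  # 'a  c'
print(repr(lazy))    # 'a  b  c'
```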
Answer 2
Score: 1
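Just use \/\*.*?\*\/ and remember to supply the re.S flag so that . also matches newlines and the pattern can span multiple lines.
The * is a regex metacharacter and needs to be prefaced with a backslash (\) to be taken literally. The / is not special in Python's re module, so escaping it is harmless but not required.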
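As a rough illustration of what the re.S flag changes, here is a minimal sketch with a made-up string whose comment spans two lines (COMMENT, text, with_flag, and without_flag are illustrative names, not from the original answer):

```python
import re

# Lazy comment pattern; re.S (alias re.DOTALL) lets . match "\n" as well.
COMMENT = re.compile(r"/\*.*?\*/", re.S)

# Hypothetical input with a comment that spans a line break.
text = "keep /* spans\ntwo lines */ this"

with_flag = COMMENT.sub("", text)
without_flag = re.sub(r"/\*.*?\*/", "", text)

print(repr(with_flag))     # 'keep  this'
print(repr(without_flag))  # unchanged: . cannot cross the newline
```

For the list in the question, where each string holds one single-line comment, the flag makes no practical difference; it matters once the whole file is processed as a single string.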
Answer 3
Score: 1
You could use re.search with the following regular expression:
^(?:(?!\/\*).)+(?<!\s)
(?:(?!\/\*).) matches a single character (.), and the match starts at the beginning of the string (^). (?!\/\*) is a negative lookahead asserting that the next two characters are not /*. In other words, characters are matched one at a time until the next two characters are /*. This is called the tempered greedy token technique.
I've added the negative lookbehind (?<!\s) to ensure that the match does not end in whitespace.
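A minimal sketch of how this expression might be applied to the list from the question. The helper name text_before_comment and the choice to return an empty string when nothing matches are assumptions made for illustration:

```python
import re

# Same expression as above, without the optional escaping of "/".
PREFIX = re.compile(r"^(?:(?!/\*).)+(?<!\s)")

def text_before_comment(line):
    """Return the text before the first /*, without trailing whitespace."""
    match = PREFIX.search(line)
    # No match happens when the line is empty or starts with /*.
    return match.group(0) if match else ""

old_list = ['123456_xxxx /* cccccccccccccc */ /\n',
            '67890_yyyy /* cccccccccccccc */ /\n']
new_list = [text_before_comment(s) for s in old_list]
print(new_list)  # ['123456_xxxx', '67890_yyyy']
```

Unlike re.sub, this keeps only the leading part of each string, so anything after the comment (here the trailing " /\n") is dropped as well.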
Answer 4
Score: 0
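Using string indexing
Code
old_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
new_list = []
for i in old_list:
    # Slice up to where the comment starts rather than at a fixed width of 10,
    # which would cut '123456_xxxx' down to '123456_xxx'.
    # Assumes every entry contains '/*'; index() raises ValueError otherwise.
    new_list.append(i[:i.index('/*')].rstrip())
print(new_list)
Output
['123456_xxxx', '67890_yyyy']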