How to use a regex to remove everything between 'a' and 'b' from each string in a list?
Question
I wrote a function that searches a file for a given text and returns all the lines where this text appears (in new_list). I would like to delete all the text between '/*' and '*/' (including these delimiters), since these are comments and are not needed in the returned list. Currently it returns something like:
new_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n', ...]
and the goal is:
new_list = ['123456_xxxx', '67890_yyyy', ...]
I tried using the re library with a regex of the form (?<=§).*?(?=;):
new_list = re.sub('(?<=/*).*?(?=*/)', '', str(new_list))
but this one gives an error:
error: nothing to repeat at position 13
PS: I also tried replace, but it only deletes the delimiter characters and not everything between them:
new_list = [s.replace(' /*', ' ') for s in new_list]
Answer 1
Score: 1
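import re

# Avoid naming variables list or str, which would shadow the built-ins.
lines = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
new_list = []
# Only * is special in a regex; a raw string keeps the backslash intact.
pattern = r'/\*.*\*/'
for line in lines:
    new_list.append(re.sub(pattern, '', line))
print(new_list)
['123456_xxxx  /\n', '67890_yyyy  /\n']
If you don't want the trailing spaces, the slash, and the \n, change the pattern:
pattern = r' /\*.*\*/ /\n'
['123456_xxxx', '67890_yyyy']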
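The second pattern above only matches when every line ends in exactly " /\n". A minimal sketch of a more tolerant variant follows, assuming that trailing spaces, slashes, and newlines are never meaningful in the data. The names COMMENT_RE and strip_comments are illustrative and do not come from the answer.

```python
import re

# Non-greedy match of a single /* ... */ comment (only * needs escaping).
COMMENT_RE = re.compile(r"/\*.*?\*/")

def strip_comments(lines):
    """Remove /* ... */ comments, then trim trailing junk from each line."""
    cleaned = []
    for line in lines:
        without_comment = COMMENT_RE.sub("", line)
        # Assumption: trailing spaces, slashes and newlines carry no meaning.
        cleaned.append(without_comment.rstrip(" /\n"))
    return cleaned

lines = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
result = strip_comments(lines)
print(result)  # ['123456_xxxx', '67890_yyyy']
```

Trimming with rstrip after the substitution keeps the regex focused on the comment itself, so lines with a different tail are still cleaned.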
Answer 2
Score: 1
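Just use /\*.*?\*/ and, if a comment can span several lines within one string, remember to supply the re.S flag so that . also matches newlines.
In Python's re module only the * is special and needs to be prefaced with a backslash (\) to be taken literally; escaping the / as well is harmless but not required.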
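To see what the lazy quantifier and the re.S flag each contribute, here is a small sketch on a made-up string (text is an illustrative input, not taken from the question) that contains one single-line comment and one comment spanning two lines.

```python
import re

# Illustrative input: one single-line comment and one comment with a newline in it.
text = "a /* one */ b /* two\nlines */ c"

# Greedy .* runs from the first /* to the last */, swallowing " b " as well.
greedy = re.sub(r"/\*.*\*/", "", text, flags=re.S)

# Lazy .*? stops at the nearest */, so each comment is removed separately.
lazy = re.sub(r"/\*.*?\*/", "", text, flags=re.S)

# Without re.S, . does not match "\n", so the two-line comment is left alone.
no_flag = re.sub(r"/\*.*?\*/", "", text)

print(repr(greedy))   # 'a  c'
print(repr(lazy))     # 'a  b  c'
print(repr(no_flag))  # 'a  b /* two\nlines */ c'
```

For the single-line strings in the question the flag makes no difference, but the lazy quantifier matters as soon as a line holds more than one comment.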
Answer 3
Score: 1
You could use re.search with the following regular expression.
^(?:(?!\/\*).)+(?<!\s)
(?:(?!\/\*).) matches a single character (.) provided that it does not begin the sequence /*; (?!\/\*) is a negative lookahead asserting that the next two characters in the string are not /*. The + repeats this one or more times, starting at the beginning of the string (^). In other words, characters are matched until a / immediately followed by * is reached. This is called the tempered greedy token technique.
I've added the negative lookbehind (?<!\s) to ensure that the match does not end in a whitespace.
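A minimal sketch of applying this expression to the list from the question. The names PREFIX_RE and text_before_comment are illustrative, and returning an empty string when nothing matches (for example, when a line starts with /*) is an assumption about the desired behaviour.

```python
import re

# Tempered greedy token: consume characters as long as "/*" does not start here,
# then require that the match does not end in whitespace.
PREFIX_RE = re.compile(r"^(?:(?!/\*).)+(?<!\s)")

def text_before_comment(line):
    """Return the text preceding the first /* in line, without trailing whitespace."""
    match = PREFIX_RE.search(line)
    # Assumption: a line with nothing before the comment maps to an empty string.
    return match.group(0) if match else ""

old_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
new_list = [text_before_comment(s) for s in old_list]
print(new_list)  # ['123456_xxxx', '67890_yyyy']
```

Unlike re.sub, this keeps only what precedes the first comment, so anything after the closing */ is dropped as well.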
Answer 4
Score: 0
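Using string indexing
Code
old_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
new_list = []
for i in old_list:
    # Cut where the comment starts; a fixed i[:10] would truncate '123456_xxxx'.
    # Note that str.index raises ValueError if ' /*' is missing.
    new_list.append(i[:i.index(' /*')])
print(new_list)
Output
['123456_xxxx', '67890_yyyy']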
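Since the prefixes in the question have different lengths, a regex-free alternative is to cut each string at the comment opener. A sketch using str.partition, which returns the whole string as the first element when the marker is absent; the helper name before_marker is illustrative.

```python
def before_marker(line, marker="/*"):
    """Return the part of line before marker, with trailing whitespace removed."""
    # partition yields (head, marker, tail); head is the whole line if marker is absent.
    head, _, _ = line.partition(marker)
    return head.rstrip()

old_list = ['123456_xxxx /* cccccccccccccc */ /\n', '67890_yyyy /* cccccccccccccc */ /\n']
new_list = [before_marker(s) for s in old_list]
print(new_list)  # ['123456_xxxx', '67890_yyyy']
```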