2023-02-18 08:48:20

Regex: split by parentheses, but not all parentheses

Question

I am trying to split a string on opening and closing parentheses, but I want to skip any parentheses that have a substring attached directly in front of them.
For example, given:

a = 'abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl'

I want to have a list like:

['abc', 'xyz pqr', 'qwe ew', 'kjlk asd', 'ue(aad)', 'kljl']

So I want to keep ue(aad) intact and not split on (aad).

I have tried:

y = [x.strip() for x in re.split(r"[^ue()][()]", a) if x.strip()]

Answer 1

Score: 2

Try this:

import re

a = 'abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl'
# The capture group makes re.split keep each matched "(...)" or "prefix(...)" token.
y = [x.strip() for x in re.split(r' (\S*\(.*?\))', a) if x.strip()]
for i in range(len(y)):
    # Unwrap only the tokens that are a bare "(...)" with no prefix.
    if y[i][0] == '(' and y[i][-1] == ')':
        y[i] = y[i].strip('()')

print(y)  # => ['abc', 'xyz pqr', 'qwe ew', 'kjlk asd', 'ue(aad)', 'kljl']

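The regex  (\S*\(.*?\)) matches a space, then any non-whitespace characters directly in front of a parenthesized part, then the parenthesized part itself. Because the pattern is a capture group, re.split keeps each match in the result. The loop then removes the surrounding parentheses from the matches that have no preceding string.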
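To see why the cleanup loop is needed, it can help to look at what re.split returns before any filtering or stripping. A minimal sketch (the variable name raw is only for illustration):

```python
import re

a = 'abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl'

# Because the pattern contains a capture group, re.split returns the
# captured text between the ordinary pieces instead of discarding it.
raw = re.split(r' (\S*\(.*?\))', a)
print(raw)
# ['abc', '(xyz pqr)', ' qwe ew', '(kjlk asd)', '', 'ue(aad)', ' kljl']
```

The empty string appears because the matches for (kjlk asd) and ue(aad) are directly adjacent, which is why the list comprehension has to filter out empty pieces.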

Answer 2

Score: 0

Since the keyword is always known in my case, I thought of first removing every ue(...) and keeping them in a list, then splitting on parentheses, then substituting them back.
This way I will be able to split nested parentheses.
Something like:

import re

a = "abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl"
ues = re.findall(r"ue\(.*?\)", a)       # set aside every ue(...) token
j = re.sub(r"(?<=ue)\(.*?\)", "", a)    # drop the (...) that follows each ue
y = [x.strip() for x in re.split(r"[()]", j) if x.strip()]
for i in y:
    if "ue" in i:
        # put the saved ue(...) back in place of the bare ue
        print(re.sub("ue", ues.pop(0), i))
    else:
        print(i)

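Update:
The parentheses that must be ignored always have a substring attached directly in front of them, as in ue(). So it is enough to split only on a (...) part whose opening parenthesis follows whitespace or starts the string. The lookbehind must not be wrapped in square brackets, because [(?<=\s)] is a character class rather than a lookbehind. With the pattern below, ue(aad) kljl stays together as one item.

y = [x.strip() for x in re.split(r"(?<!\S)\(([^()]*)\)", a) if x.strip()]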

Answer 3

Score: 0

For your example data, you can put capture groups in the split pattern so that the tokens you want to keep survive the split. The groups capture a (...) part together with the non-whitespace characters, other than parentheses, directly before or after it.

In the list comprehension, check x first, because re.split returns None for a group that did not take part in the match, and then test x.strip().

Note that this does not take nested or balanced parentheses into account.

Explanation

([^\s()]+\([^()]*\)) Capture group 1: one or more non-whitespace, non-parenthesis characters directly followed by (...)
| Or
(\([^()]*\)[^\s()]+) Capture group 2: (...) directly followed by one or more non-whitespace, non-parenthesis characters
| Or
[()] Match either ( or )
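The three alternatives listed above can also be written with re.VERBOSE, so that each branch carries its own comment. This is only a sketch of the same pattern in a different layout; the variable name parts is chosen for illustration:

```python
import re

# Same three alternatives, spread over several lines. In re.VERBOSE mode,
# whitespace outside character classes is ignored and # starts a comment.
pattern = re.compile(r"""
    ([^\s()]+ \( [^()]* \))    # group 1: prefix glued to the front of a group
  | (\( [^()]* \) [^\s()]+)    # group 2: suffix glued to the back of a group
  | [()]                       # otherwise: a bare parenthesis to split on
""", re.VERBOSE)

a = 'abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl'

# re.split yields None for a group that did not participate, hence "x and".
parts = [x.strip() for x in pattern.split(a) if x and x.strip()]
print(parts)
# ['abc', 'xyz pqr', 'qwe ew', 'kjlk asd', 'ue(aad)', 'kljl']
```

The order of the alternatives matters: the two capture groups are tried before the bare [()] branch, so a token such as ue(aad) is captured whole instead of being split at its parentheses.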

See a Python demo and a regex101 demo.

import re

pattern = r"([^\s()]+\([^()]*\))|(\([^()]*\)[^\s()]+)|[()]"
a = 'abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl'

y = [x.strip() for x in re.split(pattern, a) if x and x.strip()]
print(y)

Output

['abc', 'xyz pqr', 'qwe ew', 'kjlk asd', 'ue(aad)', 'kljl']
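Since the regex in this answer does not handle nested parentheses, one alternative is to drop the regex and count nesting depth by hand. The following is only a sketch under a few assumptions: the helper names split_words, is_wrapped, and split_groups are made up for illustration, runs of whitespace between plain words collapse to a single space, and unbalanced parentheses are not reported as errors.

```python
def split_words(text):
    """Yield whitespace-separated words, keeping whitespace that sits
    inside parentheses (at any nesting depth) as part of the word."""
    depth = 0
    word = []
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
        if ch.isspace() and depth == 0:
            if word:
                yield ''.join(word)
                word = []
        else:
            word.append(ch)
    if word:
        yield ''.join(word)


def is_wrapped(word):
    """True if the whole word is one (...) group, i.e. the first '('
    is closed by the very last character."""
    if not (word.startswith('(') and word.endswith(')')):
        return False
    depth = 0
    for i, ch in enumerate(word):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth == 0:
                return i == len(word) - 1
    return False


def split_groups(text):
    """Join plain words into runs, unwrap standalone (...) groups, and
    keep words such as ue(aad) untouched."""
    result = []
    plain = []  # consecutive words that contain no parentheses
    for word in split_words(text):
        if '(' not in word and ')' not in word:
            plain.append(word)
            continue
        if plain:
            result.append(' '.join(plain))
            plain = []
        if is_wrapped(word):
            word = word[1:-1]  # drop only the outermost pair
        result.append(word)
    if plain:
        result.append(' '.join(plain))
    return result


flat = split_groups('abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl')
nested = split_groups('abc (xyz (inner) pqr) ue(a (b) c) kljl')
print(flat)    # ['abc', 'xyz pqr', 'qwe ew', 'kjlk asd', 'ue(aad)', 'kljl']
print(nested)  # ['abc', 'xyz (inner) pqr', 'ue(a (b) c)', 'kljl']
```

Only the outermost pair of a standalone group is removed, so inner groups stay visible in the result and can be processed again if needed.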

Answer 4

Score: -1

This is a strange way to accomplish it, but it works for the example:

import re

a = "abc (xyz pqr) qwe ew (kjlk asd) ue(aad) kljl"

# Split on ")" followed by whitespace, then split each piece on whitespace followed by "(".
dissub = re.split(r"\)\s", a)
newlist = []
for b in dissub:
    dasplit = re.split(r"\s\(", b)
    for c in dasplit:
        newlist.append(c)

# A piece that still contains "(" lost its closing ")" in the first split; restore it.
i = 0
while i < len(newlist):
    dacheck = re.search(r"\(", newlist[i])
    if dacheck:
        newlist[i] += ")"
    i += 1
print(newlist)
