How to fix a Python regex if statement?

Question

I'm trying to find certain phrases in a data frame. These are the combinations of words I would like to find:

   apples
   bananas
   oranges
   apples and bananas
   apples and oranges
   bananas and oranges
   apples, bananas, and oranges

However, my code does not work for the cases where only 2 words match. For instance, if a row contains 'apple, banana, and orange', my code will only output 'apple and banana'.

This is my code:

for i in copy.loc[fruits]:
    print(i)

    #if match apple
    if re.match(r'((?=.*apple)|(?=.*apples))',i):
        
        #if says no apple     
        if re.match(r'(\bno\b)',i) :
            print('no apples, only banana and oranges')
            print('')
        
        #if has apple and orange
        elif re.match(r'(?=.*orange)|(?=.*\boranges\b)',i):
            print('apples and oranges')
            print('')  
        
        #if has apple and banana
        elif re.match(r'(?=.*banana)|(?=.*bananas)',i):
            print('apples and bananas')
            print('') 

        #has apple, banana, and orange 
        elif re.match(r'(?=.*banana)|(?=.*bananas)(?=.*orange)|(?=.*\boranges\b)',i):
            print('apples, bananas, and oranges')
            print('')
       
        #has only apple
        else:
            print('apples')
            print('')
   
    #only oranges
    elif re.match(r'(?=.*orange)|(?=.*\boranges\b)',i):
       print('oranges')
       print('')  
        
    #only banana     
    else:
       print('bananas')
       print('')
            

My code does not work when there are only 2 matching words. How can I fix this?

Thank you for taking the time to read and help out. I really appreciate it!

Answer 1

Score: 2


It's because of the order of the comparisons in your code.

import re

row = 'apple, banana, and orange'

# The less specific pattern is tested first, so it wins.
if re.match(r'(?=.*\bbananas?\b)', row):
    print('apples and bananas')
elif re.match(r'(?=.*\bbananas?\b)(?=.*\boranges?\b)', row):
    print('apples, bananas, and oranges')

Output:

apples and bananas

elif means "else if", so once the bananas check matches, the elif branch is never evaluated.

If you have patterns that are a "subset" of other patterns, you have to check the longer patterns first:

# The pattern that requires both words is tested first.
if re.match(r'(?=.*\bbananas?\b)(?=.*\boranges?\b)', row):
    print('apples, bananas, and oranges')
elif re.match(r'(?=.*\bbananas?\b)', row):
    print('apples and bananas')

Output:

apples, bananas, and oranges
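
The ordering rule above can be sketched as a small function. This is only an illustration, not part of the original answer: the helper name label and the sample rows are assumptions made up for the example.

```python
import re

def label(row):
    """Hypothetical helper: name the fruits mentioned in `row`."""
    has_banana = re.search(r'\bbananas?\b', row)
    has_orange = re.search(r'\boranges?\b', row)
    # Most specific condition first: test for both words
    # before testing for either word on its own.
    if has_banana and has_orange:
        return 'bananas and oranges'
    elif has_banana:
        return 'bananas'
    elif has_orange:
        return 'oranges'
    return 'other'

# Made-up sample rows, one per branch.
for row in ['banana and orange', 'two bananas', 'an orange', 'a pear']:
    print(row, '->', label(row))
```

If the first two branches were swapped, the combined branch could never be reached, which is the same effect as in the question.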

Answer 2

Score: 0


Apart from having to move the more specific match higher in the if/elif chain, you don't need all those lookarounds. You can simply match the words directly.

For a construction like ((?=.*apple)|(?=.*apples)) you don't need any lookarounds. You can search for \bapples?\b instead, where the s is optional. Note that the word boundaries make this slightly stricter: it no longer matches inside a longer word such as pineapple.
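
As a rough comparison of the two patterns, the sketch below runs both against a few made-up strings (the sample texts and the variable names lookahead and word are assumptions, not from the question). It pairs the lookahead pattern with re.match, as in the question, and the plain pattern with search.

```python
import re

# Pattern from the question, used with match() as in the original code.
lookahead = re.compile(r'((?=.*apple)|(?=.*apples))')
# Plain pattern with word boundaries and an optional trailing "s".
word = re.compile(r'\bapples?\b')

for text in ['apple pie', 'two apples', 'pineapple juice']:
    print(text, bool(lookahead.match(text)), bool(word.search(text)))
```

The two agree on the first two strings. They differ on 'pineapple juice', where only the lookahead version reports a match because it has no word boundaries.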

Also, re.match only matches at the start of the string, so matching \bno\b (which you can also write as no\b) does not mean that you have matched "no apples". It only means that the string starts with the word "no".

You can only be sure about that if all the strings have the same format.
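
A minimal sketch of the difference, using a made-up row in which "no" does not come first (the string is an assumption for illustration):

```python
import re

row = 'bananas but no apples'

# match() only tries position 0, so the "no" later in the string is missed.
at_start = re.match(r'\bno\b', row)
# search() scans the whole string and finds the first occurrence.
anywhere = re.search(r'\bno\b', row)

print(at_start)          # None
print(anywhere.span())   # (12, 14)
```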

This also applies to the last else, where you print bananas without having actually matched bananas.

If you want to find the first match anywhere in the string, you can use re.search instead.

For example, you could make the code a bit more specific:

import re

for i in copy.loc[fruits]:
    print(i)

    if re.search(r'\bapples?\b', i):

        if re.search(r'(\bno apples?\b)', i):
            print('no apples, only banana and oranges')
            print('')

        elif re.match(r'(?=.*\bbananas?\b).*\boranges?\b', i):
            print('apples, bananas, and oranges')
            print('')

        elif re.search(r'\boranges?\b', i):
            print('apples and oranges')
            print('')

        elif re.search(r'\bbananas?\b', i):
            print('apples and bananas')
            print('')
        else:
            print('apples')
            print('')

    elif re.search(r'\boranges?\b', i):
        print('oranges')
        print('')

    elif re.search(r'\bbananas?\b', i):
        print('bananas')
        print('')
    else:
        print("other..")
        print('')
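
As an alternative that avoids the ordering problem altogether, each fruit can be detected independently and the label built from whatever was found. This is a sketch under assumptions, not the answerer's method: FRUITS, find_fruits, describe, and the sample rows are hypothetical, and treating "no apple(s)" as meaning the fruit is absent is a simplification.

```python
import re

FRUITS = ('apple', 'banana', 'orange')  # assumed words of interest

def find_fruits(text):
    """Return the plural names of the fruits mentioned in `text`."""
    found = []
    for fruit in FRUITS:
        if not re.search(rf'\b{fruit}s?\b', text):
            continue
        # Simplifying assumption: "no apple(s)" means the fruit is absent.
        if re.search(rf'\bno {fruit}s?\b', text):
            continue
        found.append(fruit + 's')
    return found

def describe(found):
    """Join the names as 'a', 'a and b', or 'a, b, and c'."""
    if not found:
        return 'other..'
    if len(found) <= 2:
        return ' and '.join(found)
    return ', '.join(found[:-1]) + ', and ' + found[-1]

# Made-up sample rows.
for row in ['apple, banana, and orange', 'apple and orange', 'no apples, just bananas']:
    print(describe(find_fruits(row)))
```

Because every word is checked on its own, no branch can shadow another, and adding a fourth fruit only means extending FRUITS.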

huangapple
  • Posted on June 5, 2023, 01:22:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76401586.html