May 11, 2023 18:06:25

pyparsing: NotAny(FollowedBy()) failing

Question

I have some input data like

[gog1] [G1] [gog2] [gog3] [gog4] [G2] [gog5] [G3] [gog6]

and I want to find every gog that is not followed by a G. So in this case I want to get gog2 and gog3 (and maybe gog6).

Looks pretty simple, right? But I failed:

import pyparsing as pp
from pyparsing import *
def pyparsing_test():
    # this doesn't help either
    # ParserElement.enable_left_recursion(force=True)
    data = """ [gog1] [G1] [gog2] [gog3] [gog4] [G2] [gog5] [G3] [gog6] """
    poi_type = Word(alphas).set_results_name('type')
    poi = Suppress('[') + poi_type + Char(nums) + Suppress(']')
    def cnd_is_type(val):
        return lambda toks: toks.type == val
    def cnd_is_not_type(val):
        return lambda toks: toks.type != val
    poi_gog = poi('gog').add_condition(cnd_is_type('gog'))
    poi_g = poi('g').add_condition(cnd_is_type('G'))
    poi_not_g = poi('not_g').add_condition(cnd_is_not_type('G'))
    pattern = poi_gog + ~poi_g
    # WTF this finds only `gog6`, why??
    pattern = poi_gog + NotAny(FollowedBy(poi_g))
    # WTF same, only `gog6`
    pattern = poi_gog + poi_not_g.suppress()
    # WTF this works better but finds only `gog2`, why not `gog3` too?
    r = pattern.search_string(data)
    print(data)
    print('=' * 10)
    print(r)
pyparsing_test()

Answer 1

Score: 0

I would go for the regular expression module, re:

import re
data = """ [gog1] [G1] [gog2] [gog3] [gog4] [G2] [gog5] [G3] [gog6] """
m = re.findall(r'\[(gog.)(?!...G)', data)
print(m)

The result is:

['gog2', 'gog3', 'gog6']

The regexp can still be improved if you want to exclude the last gog, need to handle numbers larger than 9, or want to make it more robust.
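
One possible way to make those improvements, as a minimal sketch that uses only the standard re module: `\d+` accepts numbers larger than 9, and the lookahead checks for a whole `[G<number>]` item instead of counting characters. The sample data and the names `robust` and `strict` are made up for illustration and are not part of the original answer.

```python
import re

# Made-up sample with multi-digit numbers and uneven spacing
data = "[gog1] [G1] [gog2] [gog3]  [gog10] [G22] [gog11] [G3] [gog12]"

# gog items that are not directly followed by a [G<number>] item
robust = re.compile(r"\[(gog\d+)\](?!\s*\[G\d+\])")

# Same idea, but some non-G item must follow, so a trailing gog is excluded
strict = re.compile(r"\[(gog\d+)\](?=\s*\[(?!G\d+\]))")

robust_hits = robust.findall(data)
strict_hits = strict.findall(data)
print(robust_hits)  # ['gog2', 'gog3', 'gog12']
print(strict_hits)  # ['gog2', 'gog3']
```

Both patterns only look ahead at the next item without consuming it, which is why gog2 and gog3 can both be reported even though they are adjacent.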

Answer 2

Score: 0

Finally, we know what is happening, thanks to @ptmcg!
His original answer is on GitHub: https://github.com/pyparsing/pyparsing/issues/482#issuecomment-1546779260.

Summary:

First of all, you need to use grouping together with StringEnd(), and this one works:

pattern = poi_gog + FollowedBy(Group(poi_not_g) | StringEnd())

As for the problem in the title: NotAny() has a bug, it skips parse actions (and conditions) of the negated expression. The current pyparsing version is 3.0.9.
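
To see the difference between consuming the next item and only looking ahead at it, here is a minimal sketch built from the question's own definitions. It assumes pyparsing 3.x is installed; the names `consuming` and `lookahead` are mine, and the conditions are written as inline lambdas for brevity.

```python
from pyparsing import (
    Char, FollowedBy, Group, StringEnd, Suppress, Word, alphas, nums,
)

data = "[gog1] [G1] [gog2] [gog3] [gog4] [G2] [gog5] [G3] [gog6]"

poi_type = Word(alphas).set_results_name("type")
poi = Suppress("[") + poi_type + Char(nums) + Suppress("]")
poi_gog = poi("gog").add_condition(lambda toks: toks.type == "gog")
poi_not_g = poi("not_g").add_condition(lambda toks: toks.type != "G")

# Consumes the item after the gog, so the scan resumes after [gog3]
consuming = poi_gog + poi_not_g.suppress()

# Only peeks at the next item (or the end of the string), consuming nothing
lookahead = poi_gog + FollowedBy(Group(poi_not_g) | StringEnd())

consumed_hits = consuming.search_string(data).as_list()
lookahead_hits = lookahead.search_string(data).as_list()
print(consumed_hits)
print(lookahead_hits)
```

On the question's data the consuming pattern should report only gog2, as the question observed: [gog3] has already been eaten as the trailing item of the first match, so it is never tried as a starting point. The lookahead pattern should report gog2, gog3 and gog6.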
