Pyparsing: How to collect all named results from groups?

Question

I'm using pyparsing and I need to be able to collect all of the variable names from an expression. It seems like this should be possible with setResultsName, but for expressions with parens or that are otherwise grouped, the variable names are nested.

For example,

from pyparsing import (
    CaselessKeyword, Forward, Group, Literal, ParserElement, Suppress,
    infixNotation, oneOf, opAssoc, pyparsing_common,
)

ParserElement.enablePackrat()
LPAREN, RPAREN, COMMA = map(Suppress, "(),")
expr = Forward()

number = pyparsing_common.number
fn_call = Group(CaselessKeyword('safe_divide') + LPAREN + expr + COMMA + expr + RPAREN)
reserved_words = CaselessKeyword('safe_divide')
variable = ~reserved_words + pyparsing_common.identifier

operand = number | fn_call | variable.setResultsName('var', listAllMatches=True)

unary_op = oneOf("! -")
power_op = Literal("^")
multiplicative_op = oneOf("* / %")
additive_op = oneOf("+ -")
logical_op = oneOf("&& ||")

expr <<= infixNotation(
    operand,
    [
        (unary_op, 1, opAssoc.RIGHT),
        (power_op, 2, opAssoc.RIGHT),
        (multiplicative_op, 2, opAssoc.LEFT),
        (additive_op, 2, opAssoc.LEFT),
        (logical_op, 2, opAssoc.LEFT),
    ],
)

parsed = expr.parseString('(a + b) + c', parse_all=True)
print(parsed.dump())

This gives

[[['a', '+', 'b'], '+', 'c']]
[0]:
  [['a', '+', 'b'], '+', 'c']
  - var: [['c']]
    [0]:
      ['c']
  [0]:
    ['a', '+', 'b']
    - var: [['a'], ['b']]
      [0]:
        ['a']
      [1]:
        ['b']
  [1]:
    +
  [2]:
    c

The variables are returned, but not in an easily accessible format, especially for more complex expressions. Is there a way to collect all of the nested variables?

There's a similar question here, but the workaround there would incorrectly label keywords as variables.

Answer 1

Score: 1

As I understand it, you want the output to be the list of variables found throughout the tree as a single list.

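from pyparsing import ParseResults

def gather_named_elements(tree, name):
    named = set()
    # Names attached to this node; the lookup is guarded because tree[name]
    # raises KeyError when the node has no such results name.
    if name in tree:
        for match in tree[name]:
            named.add(match[0] if isinstance(match, ParseResults) else match)
    # Recurse into nested groups. Merging sets (rather than re-indexing the
    # returned strings with [0]) keeps multi-character names intact.
    for item in tree:
        if isinstance(item, ParseResults):
            named.update(gather_named_elements(item, name))
    return list(named)

print(gather_named_elements(parsed, 'var'))
# OUTPUT (order may vary): ['a', 'b', 'c']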

The order is not deterministic because the names pass through a set, but you can sort the list if needed.
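If a stable order matters, here are two common ways to get one. This is only a small sketch in plain Python; the `names` list is a made-up stand-in for whatever the gathering step returned, repeats included.

```python
# Hypothetical collected names (with a repeat), standing in for gathered results.
names = ["c", "a", "b", "a"]

# Option 1: de-duplicate with a set, then sort alphabetically.
alphabetical = sorted(set(names))

# Option 2: de-duplicate while keeping first-appearance order.
# dict keys preserve insertion order, so later repeats are dropped.
first_seen = list(dict.fromkeys(names))

print(alphabetical)  # ['a', 'b', 'c']
print(first_seen)    # ['c', 'a', 'b']
```

Which one fits depends on whether you care about alphabetical order or about the order in which the names were collected.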

Answer 2

Score: 0

You could add a parse action to variable to save its name off to a variable list (be sure to insert this code before calling setResultsName):

found_variables = []
def found_var(s, l, t):
    found_variables.append(t[0])
variable.add_parse_action(found_var)

Be sure to clear the list before calling parse_string a second time.
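For what it's worth, here is a self-contained sketch of that parse-action idea, assuming pyparsing 3.x (snake_case names). The `collect_variables` helper, the `found_at` dict, and `record_variable` are my own hypothetical names, not part of pyparsing. Instead of appending to a list, it keys the matches by input offset, since a parse action may fire more than once for the same token while `infix_notation` backtracks.

```python
import pyparsing as pp

LPAREN, RPAREN, COMMA = map(pp.Suppress, "(),")
expr = pp.Forward()

reserved_words = pp.CaselessKeyword("safe_divide")
variable = ~reserved_words + pp.pyparsing_common.identifier

# Hypothetical store: input offset -> variable name. Keying on the offset
# means a repeated firing for the same token just overwrites its own entry.
found_at = {}

def record_variable(s, loc, toks):
    found_at[loc] = toks[0]

# Attach the action before `variable` is used (or copied) anywhere else.
variable.add_parse_action(record_variable)

fn_call = pp.Group(reserved_words + LPAREN + expr + COMMA + expr + RPAREN)
operand = pp.pyparsing_common.number | fn_call | variable

expr <<= pp.infix_notation(
    operand,
    [
        (pp.one_of("! -"), 1, pp.OpAssoc.RIGHT),
        (pp.Literal("^"), 2, pp.OpAssoc.RIGHT),
        (pp.one_of("* / %"), 2, pp.OpAssoc.LEFT),
        (pp.one_of("+ -"), 2, pp.OpAssoc.LEFT),
        (pp.one_of("&& ||"), 2, pp.OpAssoc.LEFT),
    ],
)

def collect_variables(text):
    """Parse `text` and return its distinct variable names in source order."""
    found_at.clear()  # reset state left over from any previous parse
    expr.parse_string(text, parse_all=True)
    ordered = [found_at[loc] for loc in sorted(found_at)]
    return list(dict.fromkeys(ordered))  # drop repeats, keep first occurrence

print(collect_variables("(alpha + beta) * safe_divide(gamma, 2) + alpha"))
```

Wrapping the parse in a helper that clears the store first avoids having to remember the reset between calls. The `safe_divide` keyword is not recorded because the negative lookahead on `variable` fails before the parse action can run.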

huangapple
  • Posted on January 6, 2023 at 11:14:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75026588.html