如何在Python中使用“begin”和“end”标志从另一个列表构建嵌套列表?

huangapple go评论58阅读模式
英文:

How to construct a nested list from another list with "begin" and "end" flags in python?

问题

我正在尝试解析一个具有嵌套结构的文件,其中包含数据。嵌套的深度是任意的,并且每个嵌套由一个字符串标记,例如 'begin' 和 'end'。我想要构建一个保留文件中数据结构嵌套的列表(或字典等)。

以一个简单的例子为例,从一个类似于以下列表的列表开始:

input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]

我想要得到:

[0, [1, 2, [3, 4], 5], 6]

我尝试创建一个递归函数来做到这一点,但我无法获得正确的输出。 (我本可以包含我尝试的解决方案,但由于递归的原因,我无法确定我是否接近解决方案还是完全不对)。应该如何解决这个问题?

编辑以添加我的尝试(希望这不会让事情更加混乱):

input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]

def create_nest(items, nest):
    for i, item in enumerate(items):
        if item == 'begin':
            nest.append(create_nest(items[i+1:], []))
        elif item == 'end':
            return nest
        else:
            nest.append(item)

nested_list = []
create_nest(input, nested_list)

这会输出:

[0, [1, 2, [3, 4], 3, 4], 1, 2, [3, 4], 3, 4]
英文:

I am trying to parse a file, which has a nested structure to the data it contains. The number nesting is arbitrary in depth, and each is demarcated by a string, e.g. 'begin' and 'end. I would like to construct a list (or dictionary, etc.) that preserves the nesting of the data structure in the file.

As a simple example, from a list that looks something like:

input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]

I want to get to:

[0, [1, 2, [3, 4], 5], 6]

I tried creating a recursive function to do this, but I was unable to get the correct output. (I would include my attempt at a solution, but I can't tell if I am close or not at all because of the recursion). How should this problem be tackled?

Edit to add my attempt (I hope this doesn't just confuse things):

input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]

def create_nest(items, nest):
    for i, item in enumerate(items):
        if item == 'begin':
            nest.append(create_nest(items[i+1:], []))
        elif item == 'end':
            return nest
        else:
            nest.append(item)

nested_list = []
create_nest(input, nested_list)

This outputs:

[0, [1, 2, [3, 4], 3, 4], 1, 2, [3, 4], 3, 4]

答案1

得分: 2

完成无递归:

def collapse(lst):
    save = []
    curr = []
    for x in lst:
        if x == 'begin':
            save.append(curr)  # 记住你到目前为止的内容
            curr = []          # 重新开始
        elif x == 'end':
            old = save.pop()   # 恢复之前的内容
            old.append(curr)   # 添加刚刚完成的列表
            curr = old         # 继续之前的位置
        else:
            curr.append(x)
    return curr
英文:

Done without recursion:

def collapse(lst):
    save = []
    curr = []
    for x in lst:
        if x == 'begin':
            save.append(curr) # remember what you had so far
            curr = []         # start over
        elif x == 'end':
            old = save.pop()  # get back what you had
            old.append(curr)  # add list you just finished
            curr = old        # pick up where you left off
        else:
            curr.append(x)
    return curr

答案2

得分: 1

使用递归函数,使用迭代器来避免跟踪当前项目,以及利用列表的可变性:

def to_nested(inpt, out=None):
    it = iter(inpt)   # 使用迭代器
    if out is None: # 可选:只是为了避免使用列表作为默认值
        out = []    #
    for item in it:
        if item == 'begin': # 如果是'begin',让我们开始一个新级别
            out.append([])  # 添加一个新列表并递归传递它
            to_nested(it, out=out[-1])
        elif item == 'end': # 如果是'end',让我们结束这个递归
            return
        else:               # 否则,将项目添加到当前级别
            out.append(item)
    return out

to_nested([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# [0, [1, 2, [3, 4], 5], 6]

请注意,这假设格式正确('begin'和'end'的数量匹配)。如果不匹配,你将不得不跟踪它们。

修改你的方法,使用类似我的迭代器的方法

inpt = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]

def create_nest(items, nest):
    items = iter(items)
    for item in items:
        if item == 'begin':
            nest.append(create_nest(items, []))
        elif item == 'end':
            return nest
        else:
            nest.append(item)
    return nest

nested_list = []
create_nest(inpt, nested_list)
# [0, [1, 2, [3, 4], 5], 6]

这是你提供的代码的翻译部分。

英文:

Using a recursive function, an iterator to avoid having to keep track of the current item, and taking advantage of the mutability of lists:

def to_nested(inpt, out=None):
    it = iter(inpt)   # use an iterator
    if out is None: # optional: just to avoid using a list as default
        out = []    #
    for item in it:
        if item == 'begin': # if begin, let's start a new level
            out.append([])  # add a new list and pass this recursively
            to_nested(it, out=out[-1])
        elif item == 'end': # if end, let's finish this recursion
            return
        else:               # else, add an item to the current level
            out.append(item)
    return out

to_nested([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# [0, [1, 2, [3, 4], 5], 6]

Note that this assumes a correct format (as many ends as there are begins). If not, you would have to keep track of them.

modification of your approach using an iterator like I did

inpt = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]

def create_nest(items, nest):
    items = iter(items)
    for item in items:
        if item == 'begin':
            nest.append(create_nest(items, []))
        elif item == 'end':
            return nest
        else:
            nest.append(item)
    return nest

nested_list = []
create_nest(inpt, nested_list)
# [0, [1, 2, [3, 4], 5], 6]

答案3

得分: 1

线性数据最好使用线性过程处理 -

def parse(tokens):
  r = [[]]
  for t in tokens:
    match t:
      case 'begin':
        r.append([])
      case 'end':
        r[-2].append(r.pop())
      case _:
        r[-1].append(t)
  return r[0]
parse([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
[0, [1, 2, [3, 4], 5], 6]

对于格式不正确的 begin..end 对,您可以和应该添加异常处理 -

def parse(tokens):
  r = [[]]
  for t in tokens:
    match t:
      case 'begin':
        r.append([])
      case 'end':
        if len(r) < 2:
          raise RuntimeError("unexpected 'end'")
        r[-2].append(r.pop())
      case _:
        r[-1].append(t)
  if len(r) > 1:
    raise RuntimeError("expected 'end'")
  return r[0]
parse([0, 'end', 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# RuntimeError: unexpected 'end'
parse([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 6])
# RuntimeError: expected 'end'

尽管它们可能很优雅,但这里提供的递归解决方案对于格式不正确的输入提供了不正确的输出。

英文:

Linear datum are best processed with a linear procedure -

def parse(tokens):
  r = [[]]
  for t in tokens:
    match t:
      case &#39;begin&#39;:
        r.append([])
      case &#39;end&#39;:
        r[-2].append(r.pop())
      case _:
        r[-1].append(t)
  return r[0]
parse([0, &#39;begin&#39;, 1, 2, &#39;begin&#39;, 3, 4, &#39;end&#39;, 5, &#39;end&#39;, 6])
[0, [1, 2, [3, 4], 5], 6]

You can and should exceptions for malformed begin..end pairs -

def parse(tokens):
  r = [[]]
  for t in tokens:
    match t:
      case &#39;begin&#39;:
        r.append([])
      case &#39;end&#39;:
        if len(r) &lt; 2:
          raise RuntimeError(&quot;unexpected &#39;end&#39;&quot;)
        r[-2].append(r.pop())
      case _:
        r[-1].append(t)
  if len(r) &gt; 1:
    raise RuntimeError(&quot;expected &#39;end&#39;&quot;)
  return r[0]
parse([0, &#39;end&#39;, &#39;begin&#39;, 1, 2, &#39;begin&#39;, 3, 4, &#39;end&#39;, 5, &#39;end&#39;, 6])
# RuntimeError: unexpected &#39;end&#39;
parse([0, &#39;begin&#39;, 1, 2, &#39;begin&#39;, 3, 4, &#39;end&#39;, 5, 6])
# RuntimeError: expected &#39;end&#39;

Elegant as they may be, the recursive solutions here provided incorrect output for malformed inputs.

huangapple
  • 本文由 发表于 2023年7月7日 01:22:13
  • 转载请务必保留本文链接:https://go.coder-hub.com/76631199.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定