英文:
How to construct a nested list from another list with "begin" and "end" flags in python?
问题
我正在尝试解析一个具有嵌套结构的文件,其中包含数据。嵌套的深度是任意的,并且每个嵌套由一个字符串标记,例如 'begin' 和 'end'。我想要构建一个保留文件中数据结构嵌套的列表(或字典等)。
以一个简单的例子为例,从一个类似于以下列表的列表开始:
input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]
我想要得到:
[0, [1, 2, [3, 4], 5], 6]
我尝试创建一个递归函数来做到这一点,但我无法获得正确的输出。 (我本可以包含我尝试的解决方案,但由于递归的原因,我无法确定我是否接近解决方案还是完全不对)。应该如何解决这个问题?
编辑以添加我的尝试(希望这不会让事情更加混乱):
input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]
def create_nest(items, nest):
for i, item in enumerate(items):
if item == 'begin':
nest.append(create_nest(items[i+1:], []))
elif item == 'end':
return nest
else:
nest.append(item)
nested_list = []
create_nest(input, nested_list)
这会输出:
[0, [1, 2, [3, 4], 3, 4], 1, 2, [3, 4], 3, 4]
英文:
I am trying to parse a file, which has a nested structure to the data it contains. The number nesting is arbitrary in depth, and each is demarcated by a string, e.g. 'begin' and 'end. I would like to construct a list (or dictionary, etc.) that preserves the nesting of the data structure in the file.
As a simple example, from a list that looks something like:
input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]
I want to get to:
[0, [1, 2, [3, 4], 5], 6]
I tried creating a recursive function to do this, but I was unable to get the correct output. (I would include my attempt at a solution, but I can't tell if I am close or not at all because of the recursion). How should this problem be tackled?
Edit to add my attempt (I hope this doesn't just confuse things):
input = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]
def create_nest(items, nest):
for i, item in enumerate(items):
if item == 'begin':
nest.append(create_nest(items[i+1:], []))
elif item == 'end':
return nest
else:
nest.append(item)
nested_list = []
create_nest(input, nested_list)
This outputs:
[0, [1, 2, [3, 4], 3, 4], 1, 2, [3, 4], 3, 4]
答案1
得分: 2
完成无递归:
def collapse(lst):
save = []
curr = []
for x in lst:
if x == 'begin':
save.append(curr) # 记住你到目前为止的内容
curr = [] # 重新开始
elif x == 'end':
old = save.pop() # 恢复之前的内容
old.append(curr) # 添加刚刚完成的列表
curr = old # 继续之前的位置
else:
curr.append(x)
return curr
英文:
Done without recursion:
def collapse(lst):
save = []
curr = []
for x in lst:
if x == 'begin':
save.append(curr) # remember what you had so far
curr = [] # start over
elif x == 'end':
old = save.pop() # get back what you had
old.append(curr) # add list you just finished
curr = old # pick up where you left off
else:
curr.append(x)
return curr
答案2
得分: 1
使用递归函数,使用迭代器来避免跟踪当前项目,以及利用列表的可变性:
def to_nested(inpt, out=None):
it = iter(inpt) # 使用迭代器
if out is None: # 可选:只是为了避免使用列表作为默认值
out = [] #
for item in it:
if item == 'begin': # 如果是'begin',让我们开始一个新级别
out.append([]) # 添加一个新列表并递归传递它
to_nested(it, out=out[-1])
elif item == 'end': # 如果是'end',让我们结束这个递归
return
else: # 否则,将项目添加到当前级别
out.append(item)
return out
to_nested([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# [0, [1, 2, [3, 4], 5], 6]
请注意,这假设格式正确('begin'和'end'的数量匹配)。如果不匹配,你将不得不跟踪它们。
修改你的方法,使用类似我的迭代器的方法
inpt = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]
def create_nest(items, nest):
items = iter(items)
for item in items:
if item == 'begin':
nest.append(create_nest(items, []))
elif item == 'end':
return nest
else:
nest.append(item)
return nest
nested_list = []
create_nest(inpt, nested_list)
# [0, [1, 2, [3, 4], 5], 6]
这是你提供的代码的翻译部分。
英文:
Using a recursive function, an iterator to avoid having to keep track of the current item, and taking advantage of the mutability of lists:
def to_nested(inpt, out=None):
it = iter(inpt) # use an iterator
if out is None: # optional: just to avoid using a list as default
out = [] #
for item in it:
if item == 'begin': # if begin, let's start a new level
out.append([]) # add a new list and pass this recursively
to_nested(it, out=out[-1])
elif item == 'end': # if end, let's finish this recursion
return
else: # else, add an item to the current level
out.append(item)
return out
to_nested([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# [0, [1, 2, [3, 4], 5], 6]
Note that this assumes a correct format (as many ends as there are begins). If not, you would have to keep track of them.
modification of your approach using an iterator like I did
inpt = [0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6]
def create_nest(items, nest):
items = iter(items)
for item in items:
if item == 'begin':
nest.append(create_nest(items, []))
elif item == 'end':
return nest
else:
nest.append(item)
return nest
nested_list = []
create_nest(inpt, nested_list)
# [0, [1, 2, [3, 4], 5], 6]
答案3
得分: 1
线性数据最好使用线性过程处理 -
def parse(tokens):
r = [[]]
for t in tokens:
match t:
case 'begin':
r.append([])
case 'end':
r[-2].append(r.pop())
case _:
r[-1].append(t)
return r[0]
parse([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
[0, [1, 2, [3, 4], 5], 6]
对于格式不正确的 begin..end
对,您可以和应该添加异常处理 -
def parse(tokens):
r = [[]]
for t in tokens:
match t:
case 'begin':
r.append([])
case 'end':
if len(r) < 2:
raise RuntimeError("unexpected 'end'")
r[-2].append(r.pop())
case _:
r[-1].append(t)
if len(r) > 1:
raise RuntimeError("expected 'end'")
return r[0]
parse([0, 'end', 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# RuntimeError: unexpected 'end'
parse([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 6])
# RuntimeError: expected 'end'
尽管它们可能很优雅,但这里提供的递归解决方案对于格式不正确的输入提供了不正确的输出。
英文:
Linear datum are best processed with a linear procedure -
def parse(tokens):
r = [[]]
for t in tokens:
match t:
case 'begin':
r.append([])
case 'end':
r[-2].append(r.pop())
case _:
r[-1].append(t)
return r[0]
parse([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
[0, [1, 2, [3, 4], 5], 6]
You can and should exceptions for malformed begin..end
pairs -
def parse(tokens):
r = [[]]
for t in tokens:
match t:
case 'begin':
r.append([])
case 'end':
if len(r) < 2:
raise RuntimeError("unexpected 'end'")
r[-2].append(r.pop())
case _:
r[-1].append(t)
if len(r) > 1:
raise RuntimeError("expected 'end'")
return r[0]
parse([0, 'end', 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 'end', 6])
# RuntimeError: unexpected 'end'
parse([0, 'begin', 1, 2, 'begin', 3, 4, 'end', 5, 6])
# RuntimeError: expected 'end'
Elegant as they may be, the recursive solutions here provided incorrect output for malformed inputs.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论