2023-05-28 01:02:45

Iteration from an arbitrary state of multiple iterators, with: i1 < i2 < i3 ... < in

Question

I'm looping with multiple iterators, each of which can take values between 0 and 193, and each iterator must also be greater than the previous one (i1 < i2 < i3 ... < in).

For this example, I have 5 iterators; the total number of states to be iterated over is very large (2,174,032,288). Therefore I am processing these states in batches and saving the start and final 'states' from each batch, so I can continue from where it left off.

My following code works. My question is what is the best way of generalising these nested if statements so that it works for any number of ordered iterators, not just 5. Also, is there a better approach to achieve this?

rnIndex = [0, 1, 2, 3, 4]  # the starting state of indices to iterate from
batchSize = 1000000  # iterate through 1 million index states per batch
batchNumber = 0  # the batch number to start from - 1
batchNumberMax = 10  # run up to and including this batch number
rnLimit = 194  # iterate up to (but not including) for each index
rnComplete = False

while not rnComplete and batchNumber < batchNumberMax:
    batchNumber += 1
    print('\nStart index (included): ' + str(rnIndex))
    rnBatch = []
    for i in range(batchSize):
        if i == batchSize - 1:
            print('Final index (included): ' + str(rnIndex))
        rnBatch.append(list(rnIndex))  # add a copy of each rnIndex to rnBatch (rnIndex is mutated in place below)
        rnIndex[-1] += 1
        if rnIndex[-1] == rnLimit:
            rnIndex[-2] += 1
            rnIndex[-1] = rnIndex[-2] + 1
            if rnIndex[-2] == rnLimit - 1:
                rnIndex[-3] += 1
                rnIndex[-2] = rnIndex[-3] + 1
                rnIndex[-1] = rnIndex[-2] + 1
                if rnIndex[-3] == rnLimit - 2:
                    rnIndex[-4] += 1
                    rnIndex[-3] = rnIndex[-4] + 1
                    rnIndex[-2] = rnIndex[-3] + 1
                    rnIndex[-1] = rnIndex[-2] + 1
                    if rnIndex[-4] == rnLimit - 3:
                        rnIndex[-5] += 1
                        rnIndex[-4] = rnIndex[-5] + 1
                        rnIndex[-3] = rnIndex[-4] + 1
                        rnIndex[-2] = rnIndex[-3] + 1
                        rnIndex[-1] = rnIndex[-2] + 1
                        if rnIndex[-5] == rnLimit - 4:
                            rnComplete = True
                            break
    print('len(rnBatch) = ' + str(len(rnBatch)))  # check the length of rnBatch

print(rnIndex)  # the rnIndex state to resume from

out:

Start index (included): [0, 1, 2, 3, 4]
Final index (included): [0, 1, 94, 99, 133]
len(rnBatch) = 1000000

...

Start index (included): [0, 8, 24, 122, 173]
Final index (included): [0, 9, 23, 54, 90]
len(rnBatch) = 1000000

Start index (included): [0, 9, 23, 54, 91]
Final index (included): [0, 10, 22, 182, 188]
len(rnBatch) = 1000000
[0, 10, 22, 182, 189]

Process finished with exit code 0

Answer 1

Score: 0

You can use a generator function that yields every combination of indices satisfying your constraints: each index takes a value between 0 and 193, and each must be greater than the previous one (i1 < i2 < i3 ... < in).
Here is what it looks like:

from itertools import combinations

def generate_indices(limit, n):
    # generate all combinations of n numbers drawn from range(limit)
    for combo in combinations(range(limit), n):
        # combinations() already yields strictly increasing tuples for a sorted
        # input, so this check always passes; it is kept only as an explicit guard
        if all(x < y for x, y in zip(combo, combo[1:])):
            yield list(combo)

You can use this function in your loop instead of the nested if statements, like this:

rnLimit = 194  
n = 5 
batchSize = 1000000  
batchNumber = 0  
batchNumberMax = 20  


index_generator = generate_indices(rnLimit, n)

while batchNumber < batchNumberMax:
    batchNumber += 1
    rnBatch = []
    for i in range(batchSize):
        try:
            rnIndex = next(index_generator)
        except StopIteration:
            break  # no combinations left
        if i == 0:
            # report the first state of the batch without skipping it
            print('\nStart index (included): ' + str(rnIndex))
        if i == batchSize - 1:
            print('Final index (included): ' + str(rnIndex))
        rnBatch.append(rnIndex)
    print('len(rnBatch) = ' + str(len(rnBatch)))
    if len(rnBatch) < batchSize:
        break  # the generator is exhausted, so stop batching

This code should produce the same batches as your original code, and it works for any number of ordered iterators.
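
As a quick sanity check on this approach, the sketch below confirms that itertools.combinations already yields strictly increasing tuples, and that math.comb reproduces the total number of states quoted in the question. The small limit and n values are chosen only for illustration, and math.comb assumes Python 3.8 or later.

```python
from itertools import combinations
from math import comb

limit, n = 7, 5  # small illustrative values, not the 194 and 5 from the question

combos = list(combinations(range(limit), n))

# Every tuple is already strictly increasing, so no extra filter is required.
assert all(x < y for c in combos for x, y in zip(c, c[1:]))

# The number of states equals "limit choose n".
assert len(combos) == comb(limit, n)

# Total number of states for 5 ordered iterators over 0..193.
total_states = comb(194, 5)
print(total_states)  # 2174032288
```

Because the input range is sorted, the ordering check inside the generator is redundant and could be dropped to save time on large runs.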

Answer 2

Score: 0

You can write a function that produces the next sequence of indexes from a previous one, and use it to advance through the combinations from any starting point:

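def nextSeq(maxVal,values):
    # scan from the rightmost position for one that can still be incremented
    for i,v in enumerate(reversed(values),1):
        if v <= maxVal-i:
            # bump that position and reset everything after it to consecutive values
            return values[:-i]+[values[-i]+k+1 for k in range(i)]
    # falls through and returns None when values is the last sequence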

output:

seq = [0,1,2,3,4]
for _ in range(10):
    print(seq)
    seq = nextSeq(193,seq)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 5]
[0, 1, 2, 3, 6]
[0, 1, 2, 3, 7]
[0, 1, 2, 3, 8]
[0, 1, 2, 3, 9]
[0, 1, 2, 3, 10]
[0, 1, 2, 3, 11]
[0, 1, 2, 3, 12]
[0, 1, 2, 3, 13]

The function could also be used to create a generator that can be used in a for-loop (without nesting):

def genSeq(maxVal,start):
    seq = list(start)
    while seq:
        yield seq
        seq = nextSeq(maxVal,seq)

output:

start = [188,189,190,191,192]
for seq in genSeq(193,start):
    print(seq)

[188, 189, 190, 191, 192]
[188, 189, 190, 191, 193]
[188, 189, 190, 192, 193]
[188, 189, 191, 192, 193]
[188, 190, 191, 192, 193]
[189, 190, 191, 192, 193]
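
To connect this back to the batching in the question, here is a minimal sketch of resumable batches built on nextSeq. The run_batch helper and the small limits are hypothetical, introduced only for illustration; the saved resume state is simply the first sequence of the next batch.

```python
def nextSeq(maxVal, values):
    # Same logic as nextSeq in this answer; returns None after the last sequence.
    for i, v in enumerate(reversed(values), 1):
        if v <= maxVal - i:
            return values[:-i] + [values[-i] + k + 1 for k in range(i)]
    return None

def run_batch(maxVal, start, batchSize):
    """Hypothetical helper: collect up to batchSize states beginning at start.

    Returns (batch, resume_state); resume_state is None once everything is done.
    """
    batch = []
    seq = list(start)
    while seq is not None and len(batch) < batchSize:
        batch.append(seq)           # nextSeq builds new lists, so no aliasing
        seq = nextSeq(maxVal, seq)
    return batch, seq

# Small illustrative run: 3 ordered iterators over 0..4, batches of 4 states.
state = [0, 1, 2]
batches = []
while state is not None:
    batch, state = run_batch(4, state, 4)
    batches.append(batch)
    print('Start:', batch[0], 'Final:', batch[-1], 'len:', len(batch))
```

Persisting the returned resume state (for example to a file) between runs would let a later run pick up where the previous one stopped.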

If you want to jump directly to a specific sequence (Nth sequence), a recursive function can convert an index to a sequence in the same order:

from math import factorial as fact
def seqAtIndex(index,maxVal,size):
    if size == 1: return [index]
    value  = base = chunk = 0
    while base+chunk <= index:
        base  += chunk
        value += 1
        chunk  = fact(maxVal+1-value)//fact(size-1)//fact(maxVal+2-value-size)
    return [value-1] \
         + [value+s for s in seqAtIndex(index-base,maxVal-value,size-1)]

output:

for i in range(10): 
    print(i,seqAtIndex(i,193,5))

0 [0, 1, 2, 3, 4]
1 [0, 1, 2, 3, 5]
2 [0, 1, 2, 3, 6]
3 [0, 1, 2, 3, 7]
4 [0, 1, 2, 3, 8]
5 [0, 1, 2, 3, 9]
6 [0, 1, 2, 3, 10]
7 [0, 1, 2, 3, 11]
8 [0, 1, 2, 3, 12]
9 [0, 1, 2, 3, 13]

for i in range(2174032280,2174032288):
    print(i,seqAtIndex(i,193,5))

2174032280 [187, 189, 191, 192, 193]
2174032281 [187, 190, 191, 192, 193]
2174032282 [188, 189, 190, 191, 192]
2174032283 [188, 189, 190, 191, 193]
2174032284 [188, 189, 190, 192, 193]
2174032285 [188, 189, 191, 192, 193]
2174032286 [188, 190, 191, 192, 193]
2174032287 [189, 190, 191, 192, 193]

Note that seqAtIndex is much slower than nextSeq or genSeq, so you should only use it to find the starting sequence and then use the other functions to advance sequentially.
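
Going the other way can also be useful, for instance to turn a saved state back into a batch offset. The sketch below is a hypothetical inverse of seqAtIndex (the name indexOfSeq is an assumption, not part of the answer); it counts how many sequences precede the given one in the same lexicographic order, and uses math.comb, which assumes Python 3.8 or later.

```python
from itertools import combinations
from math import comb

def indexOfSeq(seq, maxVal):
    """Hypothetical helper: position of seq among all increasing sequences
    of len(seq) values drawn from 0..maxVal, in lexicographic order."""
    size = len(seq)
    index = 0
    prev = -1
    for pos, value in enumerate(seq):
        remaining = size - 1 - pos  # positions still to fill after this one
        # Each smaller value that could have been placed here accounts for
        # comb(maxVal - skipped, remaining) earlier sequences.
        for skipped in range(prev + 1, value):
            index += comb(maxVal - skipped, remaining)
        prev = value
    return index

# Cross-check against itertools.combinations on a small case.
for i, c in enumerate(combinations(range(7), 3)):
    assert indexOfSeq(list(c), 6) == i

print(indexOfSeq([0, 1, 2, 3, 13], 193))           # 9
print(indexOfSeq([189, 190, 191, 192, 193], 193))  # 2174032287
```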

Answer 3

Score: 0

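from itertools import combinations, islice

batchSize = 3
rnLimit = 7

# combinations() yields the ordered index tuples lazily, in lexicographic order
combs = combinations(range(rnLimit), 5)
# take batchSize tuples at a time until the iterator is exhausted
while batch := list(islice(combs, batchSize)):
    print(batch)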

Output showing the batches; your extra information could easily be added if actually necessary (Attempt This Online!):

[(0, 1, 2, 3, 4), (0, 1, 2, 3, 5), (0, 1, 2, 3, 6)]
[(0, 1, 2, 4, 5), (0, 1, 2, 4, 6), (0, 1, 2, 5, 6)]
[(0, 1, 3, 4, 5), (0, 1, 3, 4, 6), (0, 1, 3, 5, 6)]
[(0, 1, 4, 5, 6), (0, 2, 3, 4, 5), (0, 2, 3, 4, 6)]
[(0, 2, 3, 5, 6), (0, 2, 4, 5, 6), (0, 3, 4, 5, 6)]
[(1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 5, 6)]
[(1, 2, 4, 5, 6), (1, 3, 4, 5, 6), (2, 3, 4, 5, 6)]
