In the Hypothesis library for Python, why does the text() strategy cause custom strategies to retry?
Question
I have a custom strategy built using composite that draws from the text strategy internally.

While debugging another error (FailedHealthCheck.data_too_large), I realized that drawing from the text strategy can cause my composite strategy to be invoked roughly twice as often as expected.

I was able to reproduce it with the following minimal example:
import hypothesis
from hypothesis import given

def trace(label):
    # Stand-in for the asker's logging helper, which is not shown in the question.
    print(label)

@hypothesis.strategies.composite
def my_custom_strategy(draw, n):
    """Strategy to generate lists of N strings."""
    trace("a")
    value = [draw(hypothesis.strategies.text(max_size=256)) for _ in range(n)]
    trace("b")
    return value

@given(my_custom_strategy(100))
def test_my_custom_strategy(value):
    assert len(value) == 100
    assert all(isinstance(v, str) for v in value)
In this scenario, trace("a")
was invoked 206 times, whereas trace("b")
was only invoked 100 times. These numbers are consistent across runs.
More problematic, the gap increases the more times I call text(), and super-linearly. When n=200
, trace("a")
is called 305 times. n=400
, 984 times. n=500
or greater, the test reliably pauses and then completes after the 11th iteration (with only 11 iterations, instead of 100!)
What's happening here?
Answer 1
Score: 1
I suspect it's because you're running into the maximum entropy (about 8K) used to generate Hypothesis examples, if some of the strings you generate happen to be quite long. Setting a reasonable max_size in the text strategy would help, if I'm right.
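For illustration, here is a minimal sketch of that first suggestion, reusing the shape of the strategy from the question; the cap of 32 characters is an arbitrary value chosen for the example, not a number from the answer:

import hypothesis.strategies as st

@st.composite
def my_custom_strategy(draw, n):
    """Same shape as the question's strategy, with a tighter cap on string length."""
    # Shorter strings consume less of the entropy buffer per draw,
    # so fewer generation attempts should hit the size limit and be retried.
    return [draw(st.text(max_size=32)) for _ in range(n)]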
As a more general tip, shrinking can be more efficient if you use the lists() strategy (or another collections strategy) rather than picking an integer and then drawing that many elements. This is not a subtle problem, though: if you haven't already noticed it, you don't need to do anything!
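A rough sketch of the lists() suggestion, assuming the test really needs exactly 100 strings; the strategy and test names here are illustrative, not from the answer:

import hypothesis.strategies as st
from hypothesis import given

# Let Hypothesis manage the collection itself; a single lists() strategy
# shrinks more effectively than a Python loop of independent draws.
fixed_length_lists = st.lists(st.text(max_size=256), min_size=100, max_size=100)

@given(fixed_length_lists)
def test_with_lists_strategy(value):
    assert len(value) == 100
    assert all(isinstance(v, str) for v in value)

Setting min_size equal to max_size pins the list length at 100, matching the original assertion.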