2023年5月28日 11:49:01go评论81阅读模式

英文:

Why the optimized prime number generator program take more time?

问题

以下是您提供的代码的翻译部分：

#%% 素数生成器 (PG-13)
import timeit
import matplotlib.pyplot as plt
import math

class Primes:
    @staticmethod
    def stream1():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in primes:
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2
    @staticmethod
    def stream2():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in filter(lambda x: x < math.sqrt(i) + 1, primes):
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2

def get_nth_prime1(n):
    s = Primes.stream1()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt > n:
            break
    return i
def get_nth_prime2(n):
    s = Primes.stream2()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt > n:
            break
    return i

s1 = Primes().stream1()
for i in range(20):
    print(next(s1), end=' ')
print()

s2 = Primes().stream2()
for i in range(20):
    print(next(s2), end=' ')
print()

t1 = timeit.timeit("get_nth_prime1(5000)",  
              "from __main__ import get_nth_prime1", number=1)
t2 = timeit.timeit("get_nth_prime2(5000)",  
              "from __main__ import get_nth_prime2", number=1)
print(f"检查直到 n: {t1}\n检查直到 sqrt(n): {t2}")

请注意，我只翻译了代码中的文本部分，不包括代码本身。

英文:

I recently work on the prime number generator program and try to optimize the time complexity.

My first attempt is simple: iterate from current_big_prime + 1 as current_number and in every iterate check from 1 to current_number-1 to see if current_number is prime.
Then I realize it is sufficient that every check iterate(the inner loop) go until sqrt(current_number)+1 instead of current_number-1, which should save a lot of time.

But the result is opposite: the optimized program is worse on time complexity.

Here is my program:

#%% Prime Streaming (PG-13)
import timeit
import matplotlib.pyplot as plt
import math

class Primes:
    @staticmethod
    def stream1():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in primes:
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2
    @staticmethod
    def stream2():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in filter(lambda x: x &lt; math.sqrt(i) + 1, primes):
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2

def get_nth_prime1(n):
    s = Primes.stream1()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt &gt; n:
            break
    return i
def get_nth_prime2(n):
    s = Primes.stream2()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt &gt; n:
            break
    return i

s1 = Primes().stream1()
for i in range(20):
    print(next(s1), end=&#39; &#39;)
print()

s2 = Primes().stream2()
for i in range(20):
    print(next(s2), end=&#39; &#39;)
print()

t1 = timeit.timeit(&quot;get_nth_prime1(5000)&quot;,  
              &quot;from __main__ import get_nth_prime1&quot;, number=1)
t2 = timeit.timeit(&quot;get_nth_prime2(5000)&quot;,  
              &quot;from __main__ import get_nth_prime2&quot;, number=1)
print(f&quot;check until n: {t1}\ncheck until sqrt(n): {t2}&quot;)

答案1

得分: 3

filter 操作的时间复杂度为 O(N)，在这种情况下是不必要的。相反，我们可以进行 if 检查，并在超过平方根时 break 循环。

你可以在这里阅读更多关于 filter 操作的信息 filter

在上述优化后，这是基准测试的结果：

检查直到 n: 1.7345025209997402
检查直到 sqrt(n): 0.10318170000027749

此外，作为额外内容，请查看埃拉托斯特尼筛法算法，它更加简洁和快速。

英文:

The filter operations costs O(N) and is unnecessary in this case. Instead we can do an if check and break the loop once it exceeds the square root.

You can read more about the filter operation here filter

Here's the benchmark after the above optimisations:

check until n: 1.7345025209997402
check until sqrt(n): 0.10318170000027749

Also, as a bonus, do checkout the Sieve of Eratosthenes algorithm which is a lot more cleaner and faster.

答案2

得分: 1

filter 不会提前停止。它会过滤掉大于平方根的数字，但为了实现这一点，它仍然会遍历所有数字并测试它们。它只是不会将大的数字传递给你。因此，虽然你确实节省了对大数字进行 i % j == 0 检查的成本，但相反你为它们支付了更高的筛选/lambda成本。对于小数字，你额外支付了这个成本。

你可以改用 takewhile，它确实可以做到你想要的。它会提供给你小数字，并在达到平方根时完全停止。

在我的测试中，你的代码在没有使用 filter 时报告的时间约为 ~1.7 秒，而在使用 filter 时为 ~4.9 秒。使用 takewhile 后，时间为约 ~0.15 秒。如果你在循环之前只计算一次限制条件 lambda x: x < sqrt，那么时间为约 ~0.08 秒。然而，正如另一个答案建议的那样，使用简单的 if+break 仍然更快，因为它避免了筛选/lambda的开销。

英文:

filter doesn't stop early. It does filter out the numbers larger than the square root, but to do that, it still goes through all the numbers and tests them. It just doesn't pass the large ones on to you. So while you do save the cost for your i % j == 0 for the large ones, you pay the larger filter/lambda cost for them instead. And for the small ones you pay it additionally.

You can instead use takewhile, that does do exactly what you were trying to do. It gives you the small numbers and entirely stops when it reaches the square root.

In my own testing, your code reported times ~1.7 seconds without filter and ~4.9 seconds with filter. With takewhile, it's ~0.15 seconds. And ~0.08 seconds if you use lambda x: x < sqrt, computing that limit only once, before the loop. A simple if+break as the other answer suggested is still faster, though, as it avoids the filter/lambda overhead.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

优化后的素数生成器程序为什么需要更多时间？

问题

答案1

答案2

绘制随时间变化的二元结果变量的散点图

你可以在Python中根据多个条件/其他变量创建一个新的数据框列吗？

ValueError: 数组的真值

将元素的标识映射到数值。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论