优化后的素数生成器程序为什么需要更多时间?

huangapple go评论71阅读模式
英文:

Why the optimized prime number generator program take more time?

问题

以下是您提供的代码的翻译部分:

#%% 素数生成器 (PG-13)
import timeit
import matplotlib.pyplot as plt
import math

class Primes:
    @staticmethod
    def stream1():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in primes:
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2
    @staticmethod
    def stream2():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in filter(lambda x: x < math.sqrt(i) + 1, primes):
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2

def get_nth_prime1(n):
    s = Primes.stream1()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt > n:
            break
    return i
def get_nth_prime2(n):
    s = Primes.stream2()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt > n:
            break
    return i

s1 = Primes().stream1()
for i in range(20):
    print(next(s1), end=' ')
print()

s2 = Primes().stream2()
for i in range(20):
    print(next(s2), end=' ')
print()

t1 = timeit.timeit("get_nth_prime1(5000)",  
              "from __main__ import get_nth_prime1", number=1)
t2 = timeit.timeit("get_nth_prime2(5000)",  
              "from __main__ import get_nth_prime2", number=1)
print(f"检查直到 n: {t1}\n检查直到 sqrt(n): {t2}")

请注意,我只翻译了代码中的文本部分,不包括代码本身。

英文:

I recently work on the prime number generator program and try to optimize the time complexity.

  1. My first attempt is simple: iterate from current_big_prime + 1 as current_number and in every iterate check from 1 to current_number-1 to see if current_number is prime.
  2. Then I realize it is sufficient that every check iterate(the inner loop) go until sqrt(current_number)+1 instead of current_number-1, which should save a lot of time.

But the result is opposite: the optimized program is worse on time complexity.

Here is my program:

#%% Prime Streaming (PG-13)
import timeit
import matplotlib.pyplot as plt
import math

class Primes:
    @staticmethod
    def stream1():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in primes:
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2
    @staticmethod
    def stream2():
        primes = []
        yield 2
        i = 3
        while True:
            is_prime = True
            for j in filter(lambda x: x &lt; math.sqrt(i) + 1, primes):
                if i % j == 0: 
                    is_prime = False
                    break
            if is_prime:
                yield i
                primes.append(i)
            i += 2

def get_nth_prime1(n):
    s = Primes.stream1()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt &gt; n:
            break
    return i
def get_nth_prime2(n):
    s = Primes.stream2()
    cnt = 0
    for i in s:
        cnt += 1
        if cnt &gt; n:
            break
    return i

s1 = Primes().stream1()
for i in range(20):
    print(next(s1), end=&#39; &#39;)
print()

s2 = Primes().stream2()
for i in range(20):
    print(next(s2), end=&#39; &#39;)
print()

t1 = timeit.timeit(&quot;get_nth_prime1(5000)&quot;,  
              &quot;from __main__ import get_nth_prime1&quot;, number=1)
t2 = timeit.timeit(&quot;get_nth_prime2(5000)&quot;,  
              &quot;from __main__ import get_nth_prime2&quot;, number=1)
print(f&quot;check until n: {t1}\ncheck until sqrt(n): {t2}&quot;)

答案1

得分: 3

filter 操作的时间复杂度为 O(N),在这种情况下是不必要的。相反,我们可以进行 if 检查,并在超过平方根时 break 循环。

你可以在这里阅读更多关于 filter 操作的信息 filter

在上述优化后,这是基准测试的结果:

检查直到 n: 1.7345025209997402
检查直到 sqrt(n): 0.10318170000027749

此外,作为额外内容,请查看 埃拉托斯特尼筛法 算法,它更加简洁和快速。

英文:

The filter operations costs O(N) and is unnecessary in this case. Instead we can do an if check and break the loop once it exceeds the square root.

You can read more about the filter operation here filter

Here's the benchmark after the above optimisations:

check until n: 1.7345025209997402
check until sqrt(n): 0.10318170000027749

Also, as a bonus, do checkout the Sieve of Eratosthenes algorithm which is a lot more cleaner and faster.

答案2

得分: 1

filter 不会提前停止。它会过滤掉大于平方根的数字,但为了实现这一点,它仍然会遍历 所有 数字并测试它们。它只是不会将大的数字传递给你。因此,虽然你确实节省了对大数字进行 i % j == 0 检查的成本,但相反你为它们支付了更高的筛选/lambda成本。对于小数字,你额外支付了这个成本。

你可以改用 takewhile,它确实可以做到你想要的。它会提供给你小数字,并在达到平方根时完全停止。

在我的测试中,你的代码在没有使用 filter 时报告的时间约为 ~1.7 秒,而在使用 filter 时为 ~4.9 秒。使用 takewhile 后,时间为约 ~0.15 秒。如果你在循环之前只计算一次限制条件 lambda x: x &lt; sqrt,那么时间为约 ~0.08 秒。然而,正如另一个答案建议的那样,使用简单的 if+break 仍然更快,因为它避免了筛选/lambda的开销。

英文:

filter doesn't stop early. It does filter out the numbers larger than the square root, but to do that, it still goes through all the numbers and tests them. It just doesn't pass the large ones on to you. So while you do save the cost for your i % j == 0 for the large ones, you pay the larger filter/lambda cost for them instead. And for the small ones you pay it additionally.

You can instead use takewhile, that does do exactly what you were trying to do. It gives you the small numbers and entirely stops when it reaches the square root.

In my own testing, your code reported times ~1.7 seconds without filter and ~4.9 seconds with filter. With takewhile, it's ~0.15 seconds. And ~0.08 seconds if you use lambda x: x &lt; sqrt, computing that limit only once, before the loop. A simple if+break as the other answer suggested is still faster, though, as it avoids the filter/lambda overhead.

huangapple
  • 本文由 发表于 2023年5月28日 11:49:01
  • 转载请务必保留本文链接:https://go.coder-hub.com/76349857.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定