I need help speeding up a Python for-loop with a huge number of calculations

Question

I am working on a piece of Python software that needs to run a huge number of calculations. I am talking about hundreds of millions of calculations or more (n in the code below can be 100000 or more). I realize that Python is not the optimal language for this kind of work, but I have no experience with C or C++. Is there a way to speed up the code below in Python, or do I need to introduce C or C++? If I do need C or C++, any suggestions for how to embed it in a Python script?

import math
import random

a = []
b = []
c = []
x1 = []
x2 = []
y1 = []
y2 = []
tresh = 30 # int or float

n=1000 # n can be up to 100000 or even more
for i in range(n):
    x1.append(random.randint(0, 100))
    x2.append(random.randint(0, 100))
    y1.append(random.randint(0, 100))
    y2.append(random.randint(0, 100))

def calc():
    x1_len = len(x1)
    y1_len = len(y1)

    for n in range(x1_len):
        for m in range(y1_len):
            d = math.sqrt((abs(y1[m] - x1[n])) ** 2 + (abs(y2[m] - x2[n])) ** 2)

            if d >= tresh and d <= tresh:  # as written, true only when d == tresh
                a.append((y1[m] + x1[n]) / 2)
                b.append((y2[m] + x2[n]) / 2)
                c.append(d)

    return a,b,c

calc()

With my current Python knowledge and experience, I don't know how to optimize the code further. I have reviewed a lot of for-loop related questions but have not found anything that helped.

Answer 1

Score: 1

Can you use external Python modules such as NumPy? It is the fundamental package for scientific numerical computing in Python, and replacing the nested loops with vectorized array operations should speed this up considerably.
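As a rough illustration of the NumPy suggestion, here is a minimal sketch that replaces the inner loop with array operations. It assumes the intended test is d <= tresh (the question's condition as written only matches d == tresh). The names calc_numpy and chunk are made up for this example; the x points are processed in chunks so that the temporary distance matrix stays small even for large n.

```python
import numpy as np


def calc_numpy(x1, x2, y1, y2, tresh, chunk=1024):
    """Midpoints and distances of all (x, y) pairs with distance <= tresh.

    Hypothetical helper: `chunk` limits how many x points are compared
    against all y points at once, which bounds the temporary memory.
    """
    x1 = np.asarray(x1, dtype=np.float64)
    x2 = np.asarray(x2, dtype=np.float64)
    y1 = np.asarray(y1, dtype=np.float64)
    y2 = np.asarray(y2, dtype=np.float64)

    # Start with empty arrays so np.concatenate also works for empty input.
    a_parts = [np.empty(0)]
    b_parts = [np.empty(0)]
    c_parts = [np.empty(0)]

    for start in range(0, len(x1), chunk):
        cx1 = x1[start:start + chunk, None]  # shape (k, 1)
        cx2 = x2[start:start + chunk, None]
        dx = y1[None, :] - cx1               # broadcast to shape (k, len(y1))
        dy = y2[None, :] - cx2
        d = np.sqrt(dx ** 2 + dy ** 2)

        # Row-major order of the hits matches the order of the nested loops.
        rows, cols = np.nonzero(d <= tresh)
        a_parts.append((y1[cols] + cx1[rows, 0]) / 2)
        b_parts.append((y2[cols] + cx2[rows, 0]) / 2)
        c_parts.append(d[rows, cols])

    return (np.concatenate(a_parts),
            np.concatenate(b_parts),
            np.concatenate(c_parts))
```

The result is three NumPy arrays instead of three lists. The total work is still proportional to len(x1) * len(y1), so this lowers the per-pair cost rather than the number of pairs.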

Another thing: you could generate all your random numbers in one go and add them with extend instead of calling append in a loop. That will also shave off some computation time.
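A small sketch of the extend idea, using only the standard library. The helper name random_coords is hypothetical; random.choices draws all k values in a single call, and range(low, high + 1) mirrors random.randint(0, 100), which includes both endpoints.

```python
import random


def random_coords(n, low=0, high=100):
    """Return n random integers in [low, high], drawn in one call."""
    return random.choices(range(low, high + 1), k=n)


n = 1000
x1, x2, y1, y2 = [], [], [], []
# One extend per list instead of n separate append calls.
x1.extend(random_coords(n))
x2.extend(random_coords(n))
y1.extend(random_coords(n))
y2.extend(random_coords(n))
```

Keep in mind that this setup step only touches n values per list, so the gain is small compared with the n * n pair loop in calc().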

Answer 2

Score: 0

A rather simple solution is to use Numba to compile the function to machine code.

import math
import numpy as np
from numba import njit


n = 10_000  # n can be up to 100000 or even more
x1 = np.random.randint(0, 100, (n,), dtype=np.int64)
x2 = np.random.randint(0, 100, (n,), dtype=np.int64)
y1 = np.random.randint(0, 100, (n,), dtype=np.int64)
y2 = np.random.randint(0, 100, (n,), dtype=np.int64)
tresh = 30  # int or float


@njit
def calc(x1, x2, y1, y2, tresh):
    a = []
    b = []
    c = []
    x1_len = len(x1)
    y1_len = len(y1)

    for n in range(x1_len):
        for m in range(y1_len):
            d = math.sqrt((abs(y1[m] - x1[n])) ** 2 + (abs(y2[m] - x2[n])) ** 2)

            if d <= tresh:
                a.append((y1[m] + x1[n]) / 2)
                b.append((y2[m] + x2[n]) / 2)
                c.append(d)

    return a,b,c

calc(x1,x2,y1,y2, tresh)  # warmup for njit

import time
t1 = time.time()
calc(x1, x2, y1, y2, tresh)
t2 = time.time()
print(f"took {t2-t1} seconds")

This takes only about 3 seconds for 10_000 entries. If you'd like more performance than that, multithreading can give an extra speedup, but it's not simple: Python has to be the one creating the threads, because the current Numba multithreading API won't manage this properly (you cannot tell Numba to use a separate list for each thread).
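One possible way to let Python create the threads is sketched below. It is an illustration under assumptions, not a tuned implementation: it relies on Numba's nogil=True option so that compiled code can run in several Python threads at once, and it gives each thread its own output arrays (filled in two passes) instead of shared lists. The names calc_chunk, calc_threaded and workers are invented for this example, and the try/except fallback only exists so the sketch still runs, slowly, when Numba is not installed.

```python
import math
from concurrent.futures import ThreadPoolExecutor

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, fall back to plain Python.
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


@njit(nogil=True)
def calc_chunk(x1, x2, y1, y2, tresh, start, stop):
    """Handle x points start..stop-1; returns arrays owned by this call."""
    # Pass 1: count the matching pairs so the outputs can be preallocated.
    count = 0
    for n in range(start, stop):
        for m in range(len(y1)):
            dx = y1[m] - x1[n]
            dy = y2[m] - x2[n]
            if math.sqrt(dx * dx + dy * dy) <= tresh:
                count += 1

    a = np.empty(count, dtype=np.float64)
    b = np.empty(count, dtype=np.float64)
    c = np.empty(count, dtype=np.float64)

    # Pass 2: same test as pass 1, now storing the results.
    k = 0
    for n in range(start, stop):
        for m in range(len(y1)):
            dx = y1[m] - x1[n]
            dy = y2[m] - x2[n]
            d = math.sqrt(dx * dx + dy * dy)
            if d <= tresh:
                a[k] = (y1[m] + x1[n]) / 2
                b[k] = (y2[m] + x2[n]) / 2
                c[k] = d
                k += 1
    return a, b, c


def calc_threaded(x1, x2, y1, y2, tresh, workers=4):
    """Split the x points into `workers` slices, one Python thread each."""
    bounds = np.linspace(0, len(x1), workers + 1).astype(np.int64)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(calc_chunk, x1, x2, y1, y2, tresh, int(lo), int(hi))
            for lo, hi in zip(bounds[:-1], bounds[1:])
        ]
        parts = [f.result() for f in futures]
    # Slices are concatenated in order, so results keep the loop order.
    return (np.concatenate([p[0] for p in parts]),
            np.concatenate([p[1] for p in parts]),
            np.concatenate([p[2] for p in parts]))


n = 200  # kept small here; the first call also triggers compilation
x1 = np.random.randint(0, 100, (n,), dtype=np.int64)
x2 = np.random.randint(0, 100, (n,), dtype=np.int64)
y1 = np.random.randint(0, 100, (n,), dtype=np.int64)
y2 = np.random.randint(0, 100, (n,), dtype=np.int64)
a, b, c = calc_threaded(x1, x2, y1, y2, 30)
```

The two-pass layout doubles the distance computations but avoids growing lists inside the compiled code. Whether the threads pay off depends on the machine and on n, so it is worth timing this against the single-threaded version.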

huangapple
  • Posted on February 27, 2023, 02:54:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75574303.html