I need help speeding up a Python for-loop with a huge number of calculations
Question
I am working on a piece of Python software that needs to run a huge number of calculations: up to hundreds of millions or more (n in the code below can be 100000 or more). I realize that Python is not the optimal language for this work, but I have no experience with C or C++. Is there a way to speed up the code below in Python, or do I need to introduce C or C++? If I do, any suggestions for how to embed it in a Python script?
import math
import random

a = []
b = []
c = []
x1 = []
x2 = []
y1 = []
y2 = []
tresh = 30  # int or float
n = 1000  # n can be up to 100000 or even more

for i in range(n):
    x1.append(random.randint(0, 100))
    x2.append(random.randint(0, 100))
    y1.append(random.randint(0, 100))
    y2.append(random.randint(0, 100))

def calc():
    x1_len = len(x1)
    y1_len = len(y1)
    for n in range(x1_len):
        for m in range(y1_len):
            d = math.sqrt((abs(y1[m] - x1[n])) ** 2 + (abs(y2[m] - x2[n])) ** 2)
            if d >= tresh and d <= tresh:
                a.append((y1[m] + x1[n]) / 2)
                b.append((y2[m] + x2[n]) / 2)
                c.append(d)
    return a, b, c

calc()
With my current Python knowledge and experience I don't know how to optimize the code further. I have reviewed a lot of for-loop related questions but have not found anything that has helped me.
Answer 1
Score: 1
Can you use external Python modules like numpy? It is the fundamental package for scientific numerical computing in Python, and because its array operations run in compiled code rather than in the interpreter, it should speed this up considerably.
Another thing: you could generate all your random numbers at once and then use extend instead of append. That will also shave off some computation time.
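To make the numpy suggestion concrete, here is a minimal sketch of how the nested loop could be vectorized with broadcasting. It is an illustration, not a tuned implementation: the function name calc_numpy and the chunk parameter are made up for this example, and it assumes the intended test is d <= tresh (the condition in the question, as written, only matches d == tresh). The x points are processed in chunks so that the full n-by-n distance matrix never has to sit in memory at once.

```python
import numpy as np


def calc_numpy(x1, x2, y1, y2, tresh, chunk=256):
    """Vectorized version of the nested loop, keeping pairs with d <= tresh."""
    x1, x2, y1, y2 = (np.asarray(v, dtype=np.float64) for v in (x1, x2, y1, y2))
    # Start each result with an empty array so concatenate works with no matches.
    a_parts, b_parts, c_parts = [np.empty(0)], [np.empty(0)], [np.empty(0)]
    for start in range(0, len(x1), chunk):
        cx1 = x1[start:start + chunk]
        cx2 = x2[start:start + chunk]
        # Broadcasting: (chunk, 1) against (1, len(y1)) gives all pairs at once.
        dx = y1[None, :] - cx1[:, None]
        dy = y2[None, :] - cx2[:, None]
        d = np.sqrt(dx * dx + dy * dy)
        # nonzero returns matches in row-major order, like the nested loop.
        rows, cols = np.nonzero(d <= tresh)
        a_parts.append((y1[cols] + cx1[rows]) / 2)
        b_parts.append((y2[cols] + cx2[rows]) / 2)
        c_parts.append(d[rows, cols])
    return np.concatenate(a_parts), np.concatenate(b_parts), np.concatenate(c_parts)


# Generate each input in a single call instead of appending in a loop.
rng = np.random.default_rng(0)
n = 2000
x1, x2, y1, y2 = (rng.integers(0, 101, size=n) for _ in range(4))
a, b, c = calc_numpy(x1, x2, y1, y2, tresh=30)
print(len(c), "pairs within the threshold")
```

The chunk size trades memory for speed: each chunk allocates a few temporary arrays of shape (chunk, n), so for n around 100000 a smaller chunk may be needed.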
Answer 2
Score: 0
A rather simple solution is to use numba to compile the loop to machine code.
import math
import time

import numpy as np
from numba import njit

n = 10_000  # n can be up to 100000 or even more
x1 = np.random.randint(0, 100, (n,), dtype=np.int64)
x2 = np.random.randint(0, 100, (n,), dtype=np.int64)
y1 = np.random.randint(0, 100, (n,), dtype=np.int64)
y2 = np.random.randint(0, 100, (n,), dtype=np.int64)
tresh = 30  # int or float

@njit
def calc(x1, x2, y1, y2, tresh):
    a = []
    b = []
    c = []
    x1_len = len(x1)
    y1_len = len(y1)
    for n in range(x1_len):
        for m in range(y1_len):
            d = math.sqrt((abs(y1[m] - x1[n])) ** 2 + (abs(y2[m] - x2[n])) ** 2)
            if d <= tresh:
                a.append((y1[m] + x1[n]) / 2)
                b.append((y2[m] + x2[n]) / 2)
                c.append(d)
    return a, b, c

calc(x1, x2, y1, y2, tresh)  # warmup for njit (the first call compiles the function)

t1 = time.time()
calc(x1, x2, y1, y2, tresh)
t2 = time.time()
print(f"took {t2-t1} seconds")
This takes only 3 seconds for 10_000 entries. If you'd like more performance than that, multithreading is possible for an extra speedup, but it's not simple: you need Python to be the one creating the threads, because the current numba multithreading API won't manage this properly (you cannot tell numba to use a separate list for each thread).
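For what it's worth, here is a rough sketch of what "Python creating the threads" could look like. It assumes numba's nogil=True option together with the standard library's ThreadPoolExecutor. The helper names calc_slice and calc_threaded are hypothetical, and the try/except fallback exists only so the sketch still runs (without any speedup) where numba is not installed. Each call to calc_slice builds and returns its own lists, so no list is ever shared between threads.

```python
import math
from concurrent.futures import ThreadPoolExecutor

import numpy as np

try:
    from numba import njit
except ImportError:
    def njit(**kwargs):  # no-op stand-in: same results, no speedup
        return lambda func: func


@njit(nogil=True)
def calc_slice(x1, x2, y1, y2, tresh, start, stop):
    # Handles x points start..stop-1 and fills lists local to this call.
    a = []
    b = []
    c = []
    for n in range(start, stop):
        for m in range(len(y1)):
            d = math.sqrt((y1[m] - x1[n]) ** 2 + (y2[m] - x2[n]) ** 2)
            if d <= tresh:
                a.append((y1[m] + x1[n]) / 2)
                b.append((y2[m] + x2[n]) / 2)
                c.append(d)
    return a, b, c


def calc_threaded(x1, x2, y1, y2, tresh, n_threads=4):
    # Split the outer loop into n_threads contiguous ranges.
    bounds = [len(x1) * i // n_threads for i in range(n_threads + 1)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [
            pool.submit(calc_slice, x1, x2, y1, y2, tresh, bounds[i], bounds[i + 1])
            for i in range(n_threads)
        ]
        # Collect in submission order so the output matches the sequential loop.
        parts = [f.result() for f in futures]
    a, b, c = [], [], []
    for pa, pb, pc in parts:
        a.extend(pa)
        b.extend(pb)
        c.extend(pc)
    return a, b, c


rng = np.random.default_rng(0)
n = 500  # kept small so the pure-Python fallback also finishes quickly
x1, x2, y1, y2 = (rng.integers(0, 100, size=n) for _ in range(4))
tresh = 30
a, b, c = calc_threaded(x1, x2, y1, y2, tresh)
print(len(c), "pairs within the threshold")
```

Whether this pays off likely depends on the number of cores and on how many pairs pass the threshold, since handing the result lists back to Python still happens one thread at a time.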