Monte Carlo DataFrame is Highly Fragmented
Question
I'm very new to Python and am trying to run a Monte Carlo simulation for a class. I'm receiving the warning below:
PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  results.loc[i, 'Power1']=Power1
This is the code I tried:
for i in range(10000):
    Power1 = np.random.normal(16000, 1000, 1)
    Power2 = np.random.triangular(12000, 15000, 18000)
    Efficiency1 = np.random.normal(.88, .10, 1)
    Efficiency2 = np.random.normal(.85, .05, 1)
    TotalPower = (Power1*Efficiency1) + (Power2*Efficiency2)
    results.loc[i, 'Power1'] = Power1
    results.loc[i, 'Efficiency1'] = Efficiency1
    results.loc[i, 'Power2'] = Power2
    results.loc[i, 'Efficiency2'] = Efficiency2
    results.loc[i, 'TotalPower'] = TotalPower
Answer 1
Score: 1
Instead of looping over each row, you can draw all N samples for each variable in one call and assign whole columns at a time:
import numpy as np
import pandas as pd

N = 10000  # number of Monte Carlo trials

# Start from an empty frame with the column names, then fill each column with N samples.
result = pd.DataFrame(columns=['power1', 'efficiency1', 'power2', 'efficiency2', 'totalpower'])
result['power1'] = np.random.normal(16000, 1000, N)
result['power2'] = np.random.triangular(12000, 15000, 18000, size=N)
result['efficiency1'] = np.random.normal(0.88, 0.10, N)
result['efficiency2'] = np.random.normal(0.85, 0.05, N)
# Total power is computed column-wise, so no Python-level loop is needed.
result['totalpower'] = result['power1']*result['efficiency1'] + result['power2']*result['efficiency2']
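If you want the runs to be repeatable, a variation on the same idea is to build the whole frame in one step from a dict of arrays. This is only a sketch: the `simulate` function, its `n` and `seed` parameters, and the use of `np.random.default_rng` are my own choices for illustration, not something from the question. The distributions and column names are the ones used above.

```python
import numpy as np
import pandas as pd


def simulate(n=10_000, seed=None):
    """Draw n Monte Carlo samples and return them as a single DataFrame.

    `simulate`, `n` and `seed` are illustrative names, not part of the question.
    """
    # Passing an int seed makes the draws repeatable; None gives fresh randomness.
    rng = np.random.default_rng(seed)

    # Draw every sample for a variable in a single vectorized call.
    power1 = rng.normal(16000, 1000, size=n)
    power2 = rng.triangular(12000, 15000, 18000, size=n)
    efficiency1 = rng.normal(0.88, 0.10, size=n)
    efficiency2 = rng.normal(0.85, 0.05, size=n)

    # Build the frame once from equal-length arrays instead of writing cell by cell.
    return pd.DataFrame({
        "power1": power1,
        "efficiency1": efficiency1,
        "power2": power2,
        "efficiency2": efficiency2,
        "totalpower": power1 * efficiency1 + power2 * efficiency2,
    })


results = simulate(seed=42)
print(results.describe())
```

Since all columns are created together, nothing is written to the frame one cell at a time, so this should avoid the pattern the warning complains about. Calling `simulate` twice with the same `seed` returns identical frames, which is handy when checking results for a class assignment.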