Monte Carlo DataFrame is Highly Fragmented

Question

I'm very new to Python and am trying to run a Monte Carlo simulation for a class. I'm receiving the warning below:

PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()
results.loc[i, 'Power1']=Power1

I tried the below code:

for i in range(10000):
    Power1=np.random.normal(16000,1000,1)
    Power2=np.random.triangular(12000,15000,18000)
    Efficiency1=np.random.normal(.88,.10,1)
    Efficiency2=np.random.normal(.85,.05,1)
    TotalPower=(Power1*Efficiency1)+(Power2*Efficiency2)
    results.loc[i, 'Power1']=Power1
    results.loc[i, 'Efficiency1']=Efficiency1
    results.loc[i, 'Power2']=Power2
    results.loc[i, 'Efficiency2']=Efficiency2
    results.loc[i, 'TotalPower']=TotalPower

Answer 1

Score: 1

Instead of looping over each row, you might want to try the following:

import numpy as np
import pandas as pd

N = 10000  # number of simulated draws

result = pd.DataFrame(columns=['power1', 'efficiency1', 'power2', 'efficiency2', 'totalpower'])
# Draw all N samples for each input in a single vectorized call
result['power1'] = np.random.normal(16000, 1000, N)
result['power2'] = np.random.triangular(12000, 15000, 18000, size=N)
result['efficiency1'] = np.random.normal(0.88, 0.10, N)
result['efficiency2'] = np.random.normal(0.85, 0.05, N)

# Combine the columns element-wise
result['totalpower'] = result['power1']*result['efficiency1'] + result['power2']*result['efficiency2']
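As a variation on the vectorized approach, here is a minimal sketch that draws every input first and then builds the DataFrame in a single constructor call, which is in the spirit of the warning's advice to join all columns at once. The function name `simulate_total_power` and the `seed` parameter are illustrative assumptions, not part of the original answer, and it uses NumPy's `default_rng` generator rather than the `np.random.*` functions shown in the thread.

```python
import numpy as np
import pandas as pd


def simulate_total_power(n, seed=None):
    """Draw n Monte Carlo samples and return them as one DataFrame.

    Hypothetical helper for illustration; distribution parameters are
    copied from the question.
    """
    rng = np.random.default_rng(seed)  # seeded generator for repeatable runs

    # Each call returns an array of n independent draws
    power1 = rng.normal(16000, 1000, n)
    power2 = rng.triangular(12000, 15000, 18000, n)
    efficiency1 = rng.normal(0.88, 0.10, n)
    efficiency2 = rng.normal(0.85, 0.05, n)

    # Build every column in one constructor call instead of adding them one by one
    return pd.DataFrame({
        "power1": power1,
        "efficiency1": efficiency1,
        "power2": power2,
        "efficiency2": efficiency2,
        "totalpower": power1 * efficiency1 + power2 * efficiency2,
    })


result = simulate_total_power(10000, seed=0)
print(result.describe())
```

Passing a fixed `seed` makes a run repeatable, which can help when comparing results for a class assignment; leaving it as `None` gives fresh draws each time.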

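If the simulation logic really needs an explicit loop (for example, when one draw depends on the previous one), a common pattern is to collect each iteration in a plain Python list and create the DataFrame once at the end, so the frame is never grown one cell at a time. The sketch below assumes the same distributions as the question; the `rows` list and the iteration count of 1000 are illustrative choices. Note that it omits the `size` argument so each draw is a scalar, whereas the question's `np.random.normal(16000,1000,1)` returns a length-1 array.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only to make the example repeatable

rows = []  # one dict per simulated trial
for i in range(1000):
    # Without a size argument, each call returns a single float
    power1 = rng.normal(16000, 1000)
    power2 = rng.triangular(12000, 15000, 18000)
    efficiency1 = rng.normal(0.88, 0.10)
    efficiency2 = rng.normal(0.85, 0.05)

    rows.append({
        "Power1": power1,
        "Efficiency1": efficiency1,
        "Power2": power2,
        "Efficiency2": efficiency2,
        "TotalPower": power1 * efficiency1 + power2 * efficiency2,
    })

# Build the DataFrame once, after the loop has finished
results = pd.DataFrame(rows)
print(results.head())
```

This keeps the structure of the original loop but is still slower than the vectorized version for large trial counts, so it is mainly worth using when the iterations cannot be expressed as whole-array operations.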


huangapple
  • Posted on February 19, 2023, 00:59:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75494904.html