February 6, 2023 09:05:39

Appending to a numpy array in for loop

Question

I'm trying to create a Monte Carlo simulation of future stock prices using NumPy arrays.

My current approach is to write a for loop that fills an array, stock_price_array, with simulated stock prices.

These stock prices are generated by taking the last stock price and multiplying it by 1 + an annual return.

The annual returns are drawn randomly from a normal distribution and stored in the array annual_ret.

My problem is that although the stock_price values I print from my for loop appear to be correct, I simply cannot figure out how to append them to stock_price_array.

I've tried various methods, including initializing stock_price_array with np.full instead of np.empty, changing where the array appears in the for loop, and checking the size of the array.

I've read other Stack Overflow posts on similar topics but can't figure out what I'm doing wrong.

Thank you in advance for your help!

import numpy as np

annual_mean = .06
annual_stdev = .15
start_stock_price = 100
numYears = 3
numSimulations = 4
stock_price_array = np.empty(numYears)
# draw an annual return from a normal distribution; this annual return will be random
annual_ret = np.random.normal(annual_mean, annual_stdev, numSimulations)
for i in range(numYears):
    stock_price = np.multiply(start_stock_price, (1 + annual_ret[i]))
    np.append(stock_price_array, [stock_price])
    start_stock_price = stock_price

Answer 1

Score: 2

The first rule of NumPy is: never iterate over your array yourself. Use NumPy functions that do the whole computation in batch. They still iterate over the array internally, of course, but that iteration is not a Python loop, so it is much faster.

No-for solution

For example, here you could do something like this:

np.cumprod(np.hstack([start_stock_price, annual_ret+1]))

What it does is first build an array made of the initial value followed by the growth factors.
So if the initial value is 100 and the annual returns are 0.1, -0.1, 0.2, 0.2 (for example), then hstack builds the array 100, 1.1, 0.9, 1.2, 1.2.

Then cumprod just computes the cumulative product of those values:

100, 100×1.1=110, 100×1.1×0.9=110×0.9=99, 100×1.1×0.9×1.2=99×1.2=118.8, 100×1.1×0.9×1.2×1.2=118.8×1.2=142.56
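The worked numbers above can be checked directly. This is a minimal sketch using the illustrative values from the text; the names start and rets are hypothetical and do not come from the question's code.

```python
import numpy as np

start = 100.0                            # illustrative initial price
rets = np.array([0.1, -0.1, 0.2, 0.2])   # illustrative annual returns

# Initial value followed by the growth factors: [100, 1.1, 0.9, 1.2, 1.2]
factors = np.hstack([start, rets + 1])

# Running product: each entry is the price after one more year
prices = np.cumprod(factors)

print(prices)
```

The first entry of prices is the starting price itself, so the result has one more element than rets.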

Correcting your code

To answer your initial question anyway (even though I strongly advise you to use a solution like the cumprod one shown above), you have 2 choices:

Either you allocate the array in advance, as you did (your stock_price_array = np.empty(numYears)). Then, instead of trying to append the new stock_price to stock_price_array, you simply fill one of the empty slots that are already there, with stock_price_array[i] = stock_price.
Or you don't. Then you replace the np.empty line with stock_price_array = []. And at each step, you append the result to create a new stock_price_array, like this: stock_price_array = np.append(stock_price_array, [stock_price]).
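The first choice could look roughly like the sketch below. It reuses the question's variable names, but two things are assumptions made here for illustration: the seeded generator (only for reproducibility) and drawing exactly numYears returns. The helper name last_price is also hypothetical.

```python
import numpy as np

annual_mean = 0.06
annual_stdev = 0.15
start_stock_price = 100
numYears = 3

# Assumption: a seeded generator, used only so that runs are reproducible
rng = np.random.default_rng(0)
annual_ret = rng.normal(annual_mean, annual_stdev, numYears)

stock_price_array = np.empty(numYears)   # allocated once, then filled in place
last_price = start_stock_price
for i in range(numYears):
    last_price = last_price * (1 + annual_ret[i])
    stock_price_array[i] = last_price    # assign into slot i instead of appending

print(stock_price_array)
```

Keeping the running price in a separate variable leaves start_stock_price unchanged, which makes the loop safe to rerun.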

I strongly advise against the 2nd solution. Since you already know the final size of the array, it is much better to create it once. np.append creates a brand new array and then copies the input data into it. It does not just extend the existing array (generally speaking, we can't do that anyway).
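This is also why the np.append call in the question seems to do nothing: the returned array is discarded and the original is never modified. A small sketch with made-up values (a and b are hypothetical names):

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.append(a, [3.0])   # returns a new array; a itself is left untouched

print(a.shape)                  # a still has 2 elements
print(b.shape)                  # b has 3 elements
print(np.shares_memory(a, b))   # the two arrays do not share a buffer
```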

But, well, anyway, I advise against both solutions, since I find mine (with cumprod) preferable. for is the taboo word in NumPy. And it is even more so when what is inside the for loop creates a new array, as append does.

Monte-Carlo

Since you've mentioned Monte-Carlo, and then shown code that computes only one result (you draw 1 set of annual returns, and perform one computation of future values), I am wondering if that is really what you want.
In particular, I see that you have numSimulations and numYears, which appear to play redundant roles in your code (and therefore in mine).
The only reason it doesn't just throw an index error is that numSimulations is used only to decide how many annual_ret values you draw. And since numSimulations > numYears, you have more than enough annual_ret values to compute the result.

Wasn't your initial intention to redo the simulation over the years numSimulations times, to get numSimulations results?

In that case, you probably need numSimulations sets of numYears annual returns. So a 2D array. And likewise, you should be computing numSimulations series of numYears results.

If my guess is not completely off, I surmise that what you really wanted to do was something along the lines of:

# 2D array of annual returns: 1 simulation per row, 1 year per column
annual_ret = np.random.normal(annual_mean, annual_stdev, (numSimulations, numYears))
# Add 1 as we did earlier, and pad each simulation with an initial 100 (`start_stock_price`) at the beginning
t = np.pad(annual_ret+1, ((0,0), (1,0)), constant_values=start_stock_price)
# Cumulative multiplication. `axis=1` means it is done along axis 1 (along years) for each row (each simulation)
res = np.cumprod(t, axis=1)
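Put together with the question's parameters, the 2D version might be run as in the sketch below. The seeded generator and the final_prices name are assumptions added here for illustration, not part of the original code.

```python
import numpy as np

annual_mean = 0.06
annual_stdev = 0.15
start_stock_price = 100
numYears = 3
numSimulations = 4

# Assumption: a seeded generator, used only so that runs are reproducible
rng = np.random.default_rng(42)

# One simulation per row, one year per column
annual_ret = rng.normal(annual_mean, annual_stdev, (numSimulations, numYears))

# Growth factors, with the starting price prepended as column 0 of every row
t = np.pad(annual_ret + 1, ((0, 0), (1, 0)), constant_values=start_stock_price)

# Row-wise running product: res[s, y] is the price of simulation s after y years
res = np.cumprod(t, axis=1)

final_prices = res[:, -1]   # simulated price at the end of the horizon
print(res.shape)
print(final_prices.mean())
```

Each row of res starts at start_stock_price, so the array has numYears + 1 columns. Averaging final_prices over many more simulations is one way to summarize the outcome.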
