Python `for` loop doesn't update `numpy` array values
Question
I'm trying to write a simple numerical gradient function, and part of it is a `for` loop that updates parameter values which are later evaluated. The code is as follows:
import numpy as np

def target_gradient(theta):
    e = 10
    for i in range(theta.shape[0]):
        theta_upper = theta
        theta_lower = theta
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
        print(f"theta_upper {theta_upper}")
        print(f"theta_lower {theta_lower}")
    return theta_upper, theta_lower

u, l = target_gradient(np.array([1, 1, 1, 1, 1]))
However, instead of the anticipated output, I get `[1 1 1 1 1]` for both arrays. The print statements are there for monitoring, and they show that the arrays never change throughout the loop (i.e. they stay `[1 1 1 1 1]`). `e = 10` is chosen so that the effect is more pronounced. I also tried the `enumerate()` approach, but got the same result.
The full gradient function would look something like this:
def target_gradient(theta, x, y):
    e = 0.01
    gradient = np.zeros(theta.shape[0])
    for i in range(theta.shape[0]):
        theta_upper = theta
        theta_lower = theta
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
        gradient[i] = (
            foo(theta=theta_upper, x=x, y=y) - foo(theta=theta_lower, x=x, y=y)
        ) / (2 * e)
    return gradient
Therefore, I am intentionally declaring `theta_upper = theta` inside the loop, because I want to calculate the gradient, for which I need partial (numerical) derivatives.
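For reference, the quantity the loop is meant to build is the standard central-difference estimate of each partial derivative; this restatement is mine, not part of the original post:

\[
\frac{\partial f}{\partial \theta_i} \approx \frac{f(\theta + e\,\mathbf{e}_i) - f(\theta - e\,\mathbf{e}_i)}{2e},
\]

where \(\mathbf{e}_i\) is the \(i\)-th unit vector, so `theta_upper` and `theta_lower` should differ from `theta` in component `i` only.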
Answer 1
Score: 2
The best approach depends on what `foo` is:
If `foo` can take vector arguments and return vector values, e.g.

def foo(theta, x, y):
    return x * y * np.sin(theta)
Then you can simply do:

def target_gradient(theta, x, y, e=0.01):
    foo_upper = foo(theta + e, x, y)  # Add e to the entire theta vector, and call foo
    foo_lower = foo(theta - e, x, y)  # Subtract e from the entire theta vector, and call foo
    return (foo_upper - foo_lower) / (2 * e)
Based on your code, where you pass a vector `theta_upper` to `foo`, I suspect this is the case.
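A quick usage sketch under that assumption, calling the vectorized `target_gradient` defined above with the element-wise `foo` from the example; the particular numbers are made up for illustration:

import numpy as np

def foo(theta, x, y):
    return x * y * np.sin(theta)

theta = np.array([0.0, 0.5, 1.0])
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 2.0])

numeric = target_gradient(theta, x, y)   # central-difference estimate with e=0.01
analytic = x * y * np.cos(theta)         # exact element-wise derivative of foo w.r.t. theta
print(np.allclose(numeric, analytic, atol=1e-3))  # True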
If `foo` can't take vector arguments and return vector values, e.g.

import math

def foo(theta, x, y):
    return x * y * math.sin(theta)

then you need to iterate over `theta` and call `foo` for each value:
def target_gradient(theta, x, y, e=0.01):
    gradient = np.zeros(theta.shape[0])
    for i in range(theta.shape[0]):
        foo_upper = foo(theta[i] + e, x[i], y[i])  # Take a single value of theta, and add e
        foo_lower = foo(theta[i] - e, x[i], y[i])  # Take a single value of theta, and subtract e
        gradient[i] = (foo_upper - foo_lower) / (2 * e)
    return gradient
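A similar hypothetical check for this scalar variant, assuming `x` and `y` are arrays indexed alongside `theta` (the `x[i]`, `y[i]` in the answer imply this):

import math
import numpy as np

def foo(theta, x, y):
    return x * y * math.sin(theta)   # scalar-only: math.sin rejects whole arrays

theta = np.array([0.0, 0.5, 1.0])
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 2.0])

print(target_gradient(theta, x, y))  # close to x * y * np.cos(theta)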
Answer 2
Score: -1
The reason `theta_upper` and `theta_lower` never appear to change inside the loop is that `theta_upper = theta` and `theta_lower = theta` do not create copies of `theta`; they simply bind new names to the same array. All three names refer to one array, so `theta_upper[i] = theta[i] + e` raises element `i` by `e`, and `theta_lower[i] = theta[i] - e` then reads the already-raised value and lowers it by `e` again, leaving the array exactly as it started.
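A minimal sketch (mine, not part of the original answer) that makes the aliasing visible:

import numpy as np

theta = np.array([1, 1, 1, 1, 1])
theta_upper = theta   # no copy: just another name for the same array
theta_lower = theta   # same array once more
print(theta_upper is theta, theta_lower is theta)  # True True

theta_upper[0] = theta[0] + 10  # element 0 of the shared array becomes 11
theta_lower[0] = theta[0] - 10  # theta[0] is now 11, so this writes 11 - 10 = 1
print(theta)  # [1 1 1 1 1] -- the two writes cancel out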
To fix this, you can use the `copy()` method to create copies of `theta` that you can modify inside the loop, like this:
def target_gradient(theta):
    e = 10
    for i in range(theta.shape[0]):
        theta_upper = theta.copy()
        theta_lower = theta.copy()
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
        print(f"theta_upper {theta_upper}")
        print(f"theta_lower {theta_lower}")
    return theta_upper, theta_lower
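With `.copy()` in place, rerunning the call from the question behaves as expected; note that, as in the original code, only the arrays from the last loop iteration are returned (this assumes `import numpy as np` as before):

u, l = target_gradient(np.array([1, 1, 1, 1, 1]))
print(u)  # [ 1  1  1  1 11] -- last element perturbed up by e=10
print(l)  # [ 1  1  1  1 -9] -- last element perturbed down by e=10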