Python `for` loop doesn't update `numpy` array values

Question
I'm trying to make a simple numerical gradient function and part of it is a for loop updating parameter values that would later be evaluated. The code is as follows:
import numpy as np

def target_gradient(theta):
    e = 10
    for i in range(theta.shape[0]):
        theta_upper = theta
        theta_lower = theta
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
        print(f"theta_upper {theta_upper}")
        print(f"theta_lower {theta_lower}")
    return theta_upper, theta_lower

u, l = target_gradient(np.array([1, 1, 1, 1, 1]))
However, instead of the anticipated output, I get [1 1 1 1 1] for both arrays. The print statements are there for monitoring, and they show that the arrays didn't change throughout the loop (i.e. they stayed [1 1 1 1 1]). e = 10 is chosen so that the effect would be pronounced. I also tried the enumerate() approach, but got the same result.
The full gradient function would look something like this:

def target_gradient(theta, x, y):
    e = 0.01
    gradient = np.zeros(theta.shape[0])
    for i in range(theta.shape[0]):
        theta_upper = theta
        theta_lower = theta
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
        gradient[i] = (
            foo(theta=theta_upper, x=x, y=y) - foo(theta=theta_lower, x=x, y=y)
        ) / (2 * e)
    return gradient
Therefore, I am intentionally declaring theta_upper = theta inside the loop, because I want to calculate the gradient, for which I need the partial (numerical) derivatives.
Answer 1
Score: 2
The best approach depends on what foo is:
If foo can take vector arguments and return vector values, e.g.

def foo(theta, x, y):
    return x * y * np.sin(theta)

then you can simply do:

def target_gradient(theta, x, y, e=0.01):
    foo_upper = foo(theta + e, x, y)  # Add e to the entire theta vector, and call foo
    foo_lower = foo(theta - e, x, y)  # Subtract e from the entire theta vector, and call foo
    return (foo_upper - foo_lower) / (2 * e)

Based on your code, where you pass a vector theta_upper to foo, I suspect this is the case.
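For instance, a quick sanity check of the vectorized version (a sketch with made-up inputs): for foo = x * y * sin(theta), the analytic derivative with respect to theta is x * y * cos(theta), so the numerical gradient should match it closely.

```python
import numpy as np

def foo(theta, x, y):
    return x * y * np.sin(theta)

def target_gradient(theta, x, y, e=0.01):
    foo_upper = foo(theta + e, x, y)  # evaluate foo with the whole theta shifted up by e
    foo_lower = foo(theta - e, x, y)  # evaluate foo with the whole theta shifted down by e
    return (foo_upper - foo_lower) / (2 * e)

theta = np.array([0.0, 0.5, 1.0])
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 2.0])

grad = target_gradient(theta, x, y)
analytic = x * y * np.cos(theta)  # exact derivative of x * y * sin(theta)
print(np.allclose(grad, analytic, atol=1e-3))  # True
```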
If foo can't take vector arguments and return vector values, e.g.

def foo(theta, x, y):
    return x * y * math.sin(theta)

then you need to iterate over theta, and call foo for each value.

def target_gradient(theta, x, y, e=0.01):
    gradient = np.zeros(theta.shape[0])
    for i in range(theta.shape[0]):
        foo_upper = foo(theta[i] + e, x[i], y[i])  # Take a single value of theta, and add e
        foo_lower = foo(theta[i] - e, x[i], y[i])  # Take a single value of theta, and subtract e
        gradient[i] = (foo_upper - foo_lower) / (2 * e)
    return gradient
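The loop version can be checked the same way (a sketch, assuming the scalar foo above and some made-up inputs):

```python
import math
import numpy as np

def foo(theta, x, y):
    return x * y * math.sin(theta)  # scalar-only: math.sin rejects array arguments

def target_gradient(theta, x, y, e=0.01):
    gradient = np.zeros(theta.shape[0])
    for i in range(theta.shape[0]):
        foo_upper = foo(theta[i] + e, x[i], y[i])  # single value of theta, plus e
        foo_lower = foo(theta[i] - e, x[i], y[i])  # single value of theta, minus e
        gradient[i] = (foo_upper - foo_lower) / (2 * e)
    return gradient

theta = np.array([0.0, 0.5, 1.0])
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 2.0])

grad = target_gradient(theta, x, y)
print(np.allclose(grad, x * y * np.cos(theta), atol=1e-3))  # True
```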
Answer 2
Score: -1
The reason theta_upper and theta_lower don't change inside the loop is that theta_upper = theta and theta_lower = theta do not create copies: they bind new names to the same array. So when you modify theta_upper[i] or theta_lower[i], you are modifying the original theta array itself, and adding e then subtracting e leaves every element where it started.
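This is easy to verify directly (a minimal sketch): plain assignment binds another name to the same array, which the is operator and np.shares_memory both confirm.

```python
import numpy as np

theta = np.array([1, 1, 1, 1, 1])
theta_upper = theta  # no copy: just another name for the same array
theta_lower = theta

print(theta_upper is theta)                  # True: same object
print(np.shares_memory(theta_lower, theta))  # True: same underlying buffer

theta_upper[0] = theta[0] + 10  # writes through to theta (theta[0] becomes 11)...
theta_lower[0] = theta[0] - 10  # ...and this write undoes it (11 - 10 = 1)
print(theta)                    # [1 1 1 1 1]
```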
To fix this, you can use the copy() method to create a copy of theta that you can modify inside the loop, like this:
def target_gradient(theta):
    e = 10
    for i in range(theta.shape[0]):
        theta_upper = theta.copy()
        theta_lower = theta.copy()
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
        print(f"theta_upper {theta_upper}")
        print(f"theta_lower {theta_lower}")
    return theta_upper, theta_lower
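Running the fixed function on the original input gives the expected result (a trimmed version of the function above, with the monitoring prints removed). Note that the returned arrays are the copies made in the last loop iteration (i = 4), since each iteration rebinds theta_upper and theta_lower:

```python
import numpy as np

def target_gradient(theta):
    e = 10
    for i in range(theta.shape[0]):
        theta_upper = theta.copy()  # independent copies, so theta is untouched
        theta_lower = theta.copy()
        theta_upper[i] = theta[i] + e
        theta_lower[i] = theta[i] - e
    return theta_upper, theta_lower

u, l = target_gradient(np.array([1, 1, 1, 1, 1]))
print(u)  # [ 1  1  1  1 11]  (copies from the last iteration, i = 4)
print(l)  # [ 1  1  1  1 -9]
```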