ValueError: non-broadcastable output operand with shape (1,64) doesn't match the broadcast shape (2,64)
Question
I am following the instructions to make a neural network from the book "Neural Networks from Scratch in Python" and I have gotten to the SGD optimizer with learning rate decay, but when I try to add momentum to it, I get this error:
Traceback (most recent call last):
  File "nn.py", line 142, in <module>
    optimizer.update_params(dense1)
  File "nn.py", line 110, in update_params
    layer.biases += bias_updates
ValueError: non-broadcastable output operand with shape (1,64) doesn't match the broadcast shape (2,64)
To me this seems to be an error with NumPy, but I have barely any experience with it and I don't understand this error well.
Here is my code:
import numpy as np
from nnfs.datasets import spiral_data
import nnfs
import pickle

nnfs.init()

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases
        self.inputs = inputs
    def backward(self, dvalues):
        self.dweights = np.dot(self.inputs.T, dvalues)
        self.dbiases = np.sum(dvalues, axis=0, keepdims=True)
        self.dinputs = np.dot(dvalues, self.weights.T)

class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)
        self.inputs = inputs
    def backward(self, dvalues):
        self.dinputs = dvalues.copy()
        self.dinputs[self.inputs <= 0] = 0

class Activation_Softmax:
    def forward(self, inputs):
        exp_values = np.exp(inputs - np.max(inputs, axis=1, keepdims=True))
        probabilities = exp_values / np.sum(exp_values, axis=1, keepdims=True)
        self.output = probabilities
    def backward(self, dvalues):
        self.dinputs = np.empty_like(dvalues)
        for index, (single_output, single_dvalues) in enumerate(zip(self.output, dvalues)):
            single_output = single_output.reshape(-1, 1)
            jacobian_matrix = np.diagflat(single_output) - np.dot(single_output, single_output.T)
            self.dinputs[index] = np.dot(jacobian_matrix, single_dvalues)

class Loss:
    def calculate(self, output, y):
        sample_losses = self.forward(output, y)
        data_loss = np.mean(sample_losses)
        return data_loss

class Loss_CategoricalCrossEntropy(Loss):
    def forward(self, y_pred, y_true):
        samples = len(y_pred)
        y_pred_clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)
        if len(y_true.shape) == 1:
            correct_confidences = y_pred_clipped[
                range(samples),
                y_true
            ]
        elif len(y_true.shape) == 2:
            correct_confidences = np.sum(
                y_pred_clipped * y_true,
                axis=1
            )
        negative_log_likelihoods = -np.log(correct_confidences)
        return negative_log_likelihoods
    def backward(self, dvalues, y_true):
        samples = len(dvalues)
        labels = len(dvalues[0])
        if len(y_true.shape) == 1:
            y_true = np.eye(labels)[y_true]
        self.dinputs = -y_true / dvalues
        self.dinputs = self.dinputs / samples

class Activation_Softmax_Loss_CategoricalCrossEntropy():
    def __init__(self):
        self.activation = Activation_Softmax()
        self.loss = Loss_CategoricalCrossEntropy()
    def forward(self, inputs, y_true):
        self.activation.forward(inputs)
        self.output = self.activation.output
        return self.loss.calculate(self.output, y_true)
    def backward(self, dvalues, y_true):
        samples = len(dvalues)
        if len(y_true.shape) == 2:
            y_true = np.argmax(y_true, axis=1)
        self.dinputs = dvalues.copy()
        self.dinputs[range(samples), y_true] -= 1
        self.dinputs = self.dinputs / samples

class Optimizer_SGD:
    def __init__(self, learning_rate=1., decay=0., momentum=0.):
        self.learning_rate = learning_rate
        self.current_learning_rate = learning_rate
        self.decay = decay
        self.iterations = 0
        self.momentum = momentum
    def pre_update_params(self):
        if self.decay:
            self.current_learning_rate = self.learning_rate * (1. / (1. + self.decay * self.iterations))
    def update_params(self, layer):
        if self.momentum:
            if not hasattr(layer, 'weight_momentums'):
                layer.weight_momentums = np.zeros_like(layer.weights)
                layer.bias_momentums = np.zeros_like(layer.biases)
            weight_updates = self.momentum * layer.weight_momentums - self.current_learning_rate * layer.dweights
            layer.weight_momentums = weight_updates
            bias_updates = self.momentum * layer.bias_momentums - self.current_learning_rate * layer.dweights
            layer.bias_momentums = bias_updates
        else:
            weight_updates = -self.current_learning_rate * layer.dweights
            bias_updates = -self.current_learning_rate * layer.dbiases
        layer.weights += weight_updates
        layer.biases += bias_updates
    def post_update_params(self):
        self.iterations += 1

X, y = spiral_data(samples=100, classes=3)
optimizer = Optimizer_SGD(decay=1e-3, momentum=0.5)
dense1 = Layer_Dense(2, 64)
activation1 = Activation_ReLU()
dense2 = Layer_Dense(64, 3)
loss_activation = Activation_Softmax_Loss_CategoricalCrossEntropy()

for epoch in range(10001):
    dense1.forward(X)
    activation1.forward(dense1.output)
    dense2.forward(activation1.output)
    loss = loss_activation.forward(dense2.output, y)
    predictions = np.argmax(loss_activation.output, axis=1)
    if len(y.shape) == 2:
        y = np.argmax(y, axis=1)
    accuracy = np.mean(predictions == y)
    if not epoch % 100:
        print(f'epoch: {epoch}, acc: {accuracy:.3f}, loss: {loss:.3f}, lr: {optimizer.current_learning_rate}')
    loss_activation.backward(loss_activation.output, y)
    dense2.backward(loss_activation.dinputs)
    activation1.backward(dense2.dinputs)
    dense1.backward(activation1.dinputs)
    optimizer.pre_update_params()
    optimizer.update_params(dense1)
    optimizer.update_params(dense2)
    optimizer.post_update_params()

stream = [dense1.weights, dense1.biases, dense2.weights, dense2.biases]
with open("trained.nn", "wb") as h:
    pickle.dump(stream, h)
I know pickle isn't the best way, but it was easiest for me to program.
I double- and triple-checked the code against the book and I am sure I copied it correctly. However, I doubt that, because then it would be an error in the book, and those aren't that common in my experience.
Answer 1
Score: 0
> ValueError: non-broadcastable output operand with shape (1,64) doesn't match the broadcast shape (2,64)
This error means that numpy can't broadcast two arrays together. Broadcasting happens when you try to use arithmetic operations on two arrays of different shapes. In this case you're trying to add an array bias_updates of shape (2, 64) to an array layer.biases of shape (1, 64).
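To illustrate, here is a minimal standalone snippet (with hypothetical arrays matching the shapes in the traceback) that reproduces the error. A plain + would broadcast the (1, 64) array up to (2, 64), but in-place += has to write the result back into the (1, 64) operand, which NumPy refuses:

import numpy as np

biases = np.zeros((1, 64))       # same shape as layer.biases
bias_updates = np.ones((2, 64))  # same shape as a dweights-based update

out = biases + bias_updates      # fine: result takes the broadcast shape (2, 64)
biases += bias_updates           # ValueError: non-broadcastable output operand
                                 # with shape (1,64) doesn't match the broadcast
                                 # shape (2,64)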
I would recommend that you investigate why the array bias_updates is of shape (2, 64) when you expect it to be of shape (1, 64).
Given this part of your code:
bias_updates = (
    self.momentum * layer.bias_momentums
    - self.current_learning_rate * layer.dweights
)
The shape of bias_updates comes from the shape of the array layer.dweights (numpy is silently broadcasting your arrays).
In general, to debug this kind of error, write down the expected shape of each of your arrays and check at runtime that they match.
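Following that through here: in the posted update_params, the momentum branch computes bias_updates from layer.dweights (shape (2, 64) for a Layer_Dense(2, 64)) instead of layer.dbiases (shape (1, 64)), which is what the non-momentum branch uses. A minimal sketch of the corrected method, assuming the rest of the Optimizer_SGD class stays as posted:

def update_params(self, layer):
    if self.momentum:
        # create the momentum buffers on first use
        if not hasattr(layer, 'weight_momentums'):
            layer.weight_momentums = np.zeros_like(layer.weights)
            layer.bias_momentums = np.zeros_like(layer.biases)
        weight_updates = (self.momentum * layer.weight_momentums
                          - self.current_learning_rate * layer.dweights)
        layer.weight_momentums = weight_updates
        # dbiases has shape (1, n_neurons), matching layer.biases;
        # dweights has shape (n_inputs, n_neurons) and triggers the ValueError
        bias_updates = (self.momentum * layer.bias_momentums
                        - self.current_learning_rate * layer.dbiases)
        layer.bias_momentums = bias_updates
    else:
        weight_updates = -self.current_learning_rate * layer.dweights
        bias_updates = -self.current_learning_rate * layer.dbiases
    layer.weights += weight_updates
    layer.biases += bias_updates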
Comments