问题

根据您提供的内容，以下是翻译好的部分：

我试图理解为什么直接计算密集层操作和使用`keras`实现之间存在差异。

根据文档（https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense），`tf.keras.layers.Dense()`应该实现操作`output = activation(dot(input, kernel) + bias)`，但下面的`result`和`result1`不相同。

```python
tf.random.set_seed(1)

bias = tf.Variable(tf.random.uniform(shape=(5,1)), dtype=tf.float32)
kernel = tf.Variable(tf.random.uniform(shape=(5,10)), dtype=tf.float32)
x = tf.constant(tf.random.uniform(shape=(10,1), dtype=tf.float32))

result = tf.nn.relu(tf.linalg.matmul(a=weights, b=x) + biases)
tf.print(result)

test = tf.keras.layers.Dense(units = 5, 
                            activation = 'relu',
                            use_bias = True, 
                            kernel_initializer = tf.keras.initializers.Constant(value=kernel), 
                            bias_initializer = tf.keras.initializers.Constant(value=bias), 
                            dtype=tf.float32)

result1 = test(tf.transpose(x))

print()
tf.print(result1)

输出

[[2.87080455]
 [3.25458574]
 [3.28776264]
 [3.14319134]
 [2.04760242]]

[[2.38769 3.63470697 2.62423944 3.31286287 2.91121125]]

使用test.get_weights()我可以看到内核和偏差(b)已设置为正确的值。我正在使用TF版本2.12.0。


<details>
<summary>英文:</summary>

I am trying to understand why there is a difference between calculating a dense layer operation directly and using the `keras` implementation. 

Following the documentation (https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense) `tf.keras.layers.Dense()` should implement the operation `output = activation(dot(input, kernel) + bias)` but `result` and `result1` below are not the same.

```python 
tf.random.set_seed(1)

bias = tf.Variable(tf.random.uniform(shape=(5,1)), dtype=tf.float32)
kernel = tf.Variable(tf.random.uniform(shape=(5,10)), dtype=tf.float32)
x = tf.constant(tf.random.uniform(shape=(10,1), dtype=tf.float32))

result = tf.nn.relu(tf.linalg.matmul(a=weights, b=x) + biases)
tf.print(result)

test = tf.keras.layers.Dense(units = 5, 
                            activation = &#39;relu&#39;,
                            use_bias = True, 
                            kernel_initializer = tf.keras.initializers.Constant(value=kernel), 
                            bias_initializer = tf.keras.initializers.Constant(value=bias), 
                            dtype=tf.float32)

result1 = test(tf.transpose(x))

print()
tf.print(result1)

output


[[2.87080455]
 [3.25458574]
 [3.28776264]
 [3.14319134]
 [2.04760242]]

[[2.38769 3.63470697 2.62423944 3.31286287 2.91121125]]

Using test.get_weights() I can see that the kernel and bias (b) are getting set to the correct values. I am using TF version 2.12.0.

答案1

得分: 0

经过一些实验，我意识到密集层的kernel需要具有shape=(10,5)，而不是原始问题中的(5,10)。这是因为units=5，所以需要传递大小为10的向量（因此input_shape=(10,)被注释掉作为提醒）。以下是已校正的代码：

tf.random.set_seed(1)

bias   = tf.Variable(tf.random.uniform(shape=(5,1)), dtype=tf.float32)
kernel = tf.Variable(tf.random.uniform(shape=(10,5)), dtype=tf.float32)
x = tf.constant(tf.random.uniform(shape=(10,1), dtype=tf.float32))

result = tf.nn.relu(tf.linalg.matmul(a=weights, b=x, transpose_a=True) + biases)
tf.print(result)

test = tf.keras.layers.Dense(units = 5, 
                            # input_shape=(10,),
                            activation = 'relu',
                            use_bias = True, 
                            kernel_initializer = tf.keras.initializers.Constant(value=kernel), 
                            bias_initializer = tf.keras.initializers.Constant(value=bias), 
                            dtype=tf.float32)

result1 = test(tf.transpose(x))

print()
tf.print(result1)

[[2.38769]
 [3.63470697]
 [2.62423944]
 [3.31286287]
 [2.91121125]]

[[2.38769 3.63470697 2.62423944 3.31286287 2.91121125]]

最终，我不完全确定底层发生了什么以及为什么keras没有引发错误。我将检查tf.keras.layers.Dense()的实现，但任何已经了解代码的人的想法或建议都会非常赞赏！

英文:

After some experimentation I realized that the kernel for the dense layer needs to be of shape=(10,5) as apposed to (5,10) as in the code from the original question above. This is implicit because units=5 so a vector of size 10 needs to be passed (hence why input_shape=(10,) is commented out as a reminder). Below is the corrected code:

tf.random.set_seed(1)

bias   = tf.Variable(tf.random.uniform(shape=(5,1)), dtype=tf.float32)
kernel = tf.Variable(tf.random.uniform(shape=(10,5)), dtype=tf.float32)
x = tf.constant(tf.random.uniform(shape=(10,1), dtype=tf.float32))

result = tf.nn.relu(tf.linalg.matmul(a=weights, b=x, transpose_a=True) + biases)
tf.print(result)

test = tf.keras.layers.Dense(units = 5, 
                            # input_shape=(10,),
                            activation = &#39;relu&#39;,
                            use_bias = True, 
                            kernel_initializer = tf.keras.initializers.Constant(value=kernel), 
                            bias_initializer = tf.keras.initializers.Constant(value=bias), 
                            dtype=tf.float32)

result1 = test(tf.transpose(x))

print()
tf.print(result1)

[[2.38769]
 [3.63470697]
 [2.62423944]
 [3.31286287]
 [2.91121125]]

[[2.38769 3.63470697 2.62423944 3.31286287 2.91121125]]

Ultimately, I am not entirely sure what was happening under the hood and why keras did not raise an error. I will check with the tf.keras.layers.Dense() implementation but any thoughts or suggestions by someone who knows the code already are highly appreciated!

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

理解 tf.keras.layers.Dense()

问题

答案1

Which metrics are printed (train or validation) when validation_split and validation_data is not specified in the keras model.fit function?

为什么PyTorch比TensorFlow慢得多？两者都在运行CUDA。

将示例放置在矩阵的行或列中？

如何在TensorFlow 2.5.0中集成自定义的动态时间规整器？

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论