June 2, 2023, 00:59:23

Preprocessing and feature selection in a custom keras layer

Question

I am participating in the ASL Fingerspelling Kaggle competition.

We are given a collection of phrases like "9560 plano". Each phrase has a table associated with it. The rows of the table are frame numbers of a video. The columns are the x, y, and z coordinates of 1630 points on a human body. The goal of the competition is to create a model which will recover the phrase from the table.

I have a model which works okay, but it requires some preprocessing of the data.

In particular, I:

1. Identify which hand is being used to spell the phrase.
2. If it is the left hand, reflect it in the yz-plane and overwrite the right-hand coordinates with the reflected ones. Essentially, this makes everyone "right-handed" in my model.
3. Subtract the vector pointing to the wrist from the vector pointing to each point on the right hand. This "stabilizes" the hand.
4. Keep only the 60 features I care about (right_hand_x_1, right_hand_x_2, ..., right_hand_z_20).
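
For reference, the preprocessing described above could be sketched in NumPy roughly as follows. This is only an illustration, not the asker's actual code: the function name `normalize_hand`, the `(frames, 21, 3)` array layout, the use of NaN for undetected hands, the "hand seen in more frames" rule, and the choice of landmark 0 as the wrist (suggested by the 1 to 20 feature numbering) are all assumptions.

```python
import numpy as np

def normalize_hand(left, right):
    """Hypothetical helper. left, right: float arrays of shape (frames, 21, 3)
    holding x, y, z per landmark, with NaN where the hand was not detected."""
    # Assumption: the signing hand is the one detected in more frames.
    left_seen = np.count_nonzero(~np.isnan(left[:, 0, 0]))
    right_seen = np.count_nonzero(~np.isnan(right[:, 0, 0]))
    if left_seen > right_seen:
        # Reflect in the yz-plane (x -> -x) so the left hand looks like a right hand.
        hand = left * np.array([-1.0, 1.0, 1.0])
    else:
        hand = right
    # Subtract the wrist (assumed to be landmark 0) from the other 20 landmarks.
    rel = hand[:, 1:, :] - hand[:, :1, :]
    # Order the features as x_1..x_20, y_1..y_20, z_1..z_20 -> 60 columns.
    return rel.transpose(0, 2, 1).reshape(len(rel), 60)
```

Because the wrist is subtracted afterwards, mirroring with `-x` or with `1 - x` gives the same relative coordinates, so the exact reflection axis position does not matter in this sketch.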

I would like to make a Layer subclass which does this preprocessing.

I don't expect you to do all of this. Instead, I am asking for a minimal working example that does something similar.

Please give code for a keras Layer subclass which will:

take a 2x4 2D input tensor

| col 1 | col 2 | col 3 | col 4 |
| ----- | ----- | ----- | ----- |
| a1    | b1    | c1    | d1    |
| a2    | b2    | c2    | d2    |

if c_1 > b_1 returns the 2x3 tensor

| col 1 | col 2   | col 3   |
| ----- | ------- | ------- |
| a1    | c1 - b1 | d1 - b1 |
| a2    | c2 - b2 | d2 - b2 |

if c_1 < b_1 returns the 2x3 tensor

| col 1 | col 2   | col 3   |
| ----- | ------- | ------- |
| a1    | b1 - c1 | b1 - d1 |
| a2    | b2 - c2 | b2 - d2 |
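
As a plain reference for the specification above, here is a small NumPy sketch. It reads the condition literally, choosing the branch once from the first row (c_1 versus b_1), and assumes every entry is computed within its own row. The function name `select_features` is made up for illustration, and the tie case c_1 == b_1, which the question leaves open, falls into the second branch here.

```python
import numpy as np

def select_features(t):
    """Hypothetical reference: t has shape (2, 4) with columns a, b, c, d."""
    a, b, c, d = t[:, 0], t[:, 1], t[:, 2], t[:, 3]
    # Assumption: the branch is decided once, from the first row only.
    if c[0] > b[0]:
        return np.stack([a, c - b, d - b], axis=1)
    # c_1 < b_1 (a tie is unspecified in the question and also lands here).
    return np.stack([a, b - c, b - d], axis=1)
```

A reference like this can be handy for unit-testing a Keras layer that is meant to reproduce the same mapping.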

Answer 1

Score: 0

You can use tf.matmul to compute both candidate outputs (one for rows where c > b, one for rows where c <= b). Then build a boolean mask by comparing the two columns, and finally use tf.where to pick the matching version for each row:

import tensorflow as tf
import keras as K
import numpy as np

class MyLayer(K.layers.Layer):
    def __init__(self):
        super().__init__()
        # Maps [a, b, c, d] to [a, c - b, d - b]; used for rows where c > b.
        self.tensor1 = tf.constant([
            [1, 0,  0 ],
            [0, -1, -1],
            [0, 1,  0 ],
            [0, 0,  1 ]
        ], dtype='float32')
        # Maps [a, b, c, d] to [a, b - c, b - d]; used for rows where c <= b.
        self.tensor2 = tf.constant([
            [1, 0,  0 ],
            [0, 1,  1 ],
            [0, -1, 0 ],
            [0, 0,  -1]
        ], dtype='float32')

    def call(self, inputs):
        assert inputs.shape[-1] == 4
        # Per-row mask of shape (rows, 1). tf.where broadcasts it across the
        # three output columns, so the row count does not need to be known.
        mask = inputs[:, 2:3] > inputs[:, 1:2]
        return tf.where(mask, tf.matmul(inputs, self.tensor1), tf.matmul(inputs, self.tensor2))


# Random 2x4 test input; the values differ on every run.
inputs = tf.constant(np.random.randint(0, 20, size=8), dtype='float32', shape=(2, 4))
my_layer = MyLayer()
print('Input:\n', inputs)
print('\nOutput:\n', my_layer(inputs))

Sample output (the values vary from run to run because the input is random):

Input:
 tf.Tensor(
[[ 0. 14. 10. 17.]
 [ 8.  9. 16. 13.]], shape=(2, 4), dtype=float32)

Output:
 tf.Tensor(
[[ 0.  4. -3.]
 [ 8.  7.  4.]], shape=(2, 3), dtype=float32)
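
To see why the two matrices in the answer produce the requested columns, the same arithmetic can be replayed in plain NumPy on the sample input printed above. This is only a sketch for checking the idea: the variable names `to_c_minus_b` and `to_b_minus_c` are mine, and NumPy stands in for TensorFlow here.

```python
import numpy as np

# The same two 4x3 matrices as in the answer.
to_c_minus_b = np.array([[1, 0, 0], [0, -1, -1], [0, 1, 0], [0, 0, 1]], dtype=np.float32)
to_b_minus_c = np.array([[1, 0, 0], [0, 1, 1], [0, -1, 0], [0, 0, -1]], dtype=np.float32)

# Sample input taken from the printed output above; columns are a, b, c, d.
x = np.array([[0, 14, 10, 17], [8, 9, 16, 13]], dtype=np.float32)

v1 = x @ to_c_minus_b  # each row becomes [a, c - b, d - b]
v2 = x @ to_b_minus_c  # each row becomes [a, b - c, b - d]

# Row-wise comparison c > b, kept as a column so it broadcasts over 3 columns.
mask = x[:, 2:3] > x[:, 1:2]
out = np.where(mask, v1, v2)
print(out)
```

In the first row c < b (10 < 14), so the second matrix is used and the row becomes [0, 4, -3]. In the second row c > b (16 > 9), so the first matrix is used and the row becomes [8, 7, 4], matching the output above. Note that the comparison is made separately for each row.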
