LSTM and Dense layers preprocessing
Question
I am trying to build a neural network with LSTM and Dense layers.
My network is:
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(123)
.weightInit(WeightInit.XAVIER)
.updater(new Adam(0.1))
.list()
.layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(numInputs).nOut(120).build())
.layer(1, new DenseLayer.Builder().activation(Activation.RELU).nIn(120).nOut(1000).build())
.layer(2, new DenseLayer.Builder().activation(Activation.RELU).nIn(1000).nOut(20).build())
.layer(new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD).activation(Activation.SOFTMAX).nIn(20).nOut(numOutputs).build())
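// RnnToFeedForwardPreProcessor reshapes [minibatch, size, timesteps] into [minibatch * timesteps, size]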
.inputPreProcessor(1, new RnnToFeedForwardPreProcessor())
.build();
I read my data like this:
SequenceRecordReader reader = new CSVSequenceRecordReader(0, ",");
reader.initialize(new NumberedFileInputSplit("TRAIN_%d.csv", 1, 17476));
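// args: (reader, miniBatchSize, numPossibleLabels, labelIndex, regression) -> 6 label classes, label in column index 7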
DataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(reader, miniBatchSize, 6, 7, false);
allData = trainIter.next();
//Load the test/evaluation data:
SequenceRecordReader testReader = new CSVSequenceRecordReader(0, ",");
testReader.initialize(new NumberedFileInputSplit("TEST_%d.csv", 1, 8498));
DataSetIterator testIter = new SequenceRecordReaderDataSetIterator(testReader, miniBatchSize, 6, 7, false);
allData = testIter.next();
So when the data goes into the net, it has shape [batch, features, timesteps] = [32, 7, 60]. I can confirm this with a deliberately induced error, like this:

> Received input with size(1) = 7 (input array shape = [32, 7, 60]); input.size(1) must match layer nIn size (nIn = 9)
So the input enters the net as expected. After the first LSTM layer it must be reshaped to 2-D and then pass through the Dense layers.
But then I hit the next problem:
> Labels and preOutput must have equal shapes: got shapes [32, 6, 60] vs
> [1920, 6]
It was not reshaped before going into the Dense layer, and I seem to have lost one feature (the shape is now [32, 6, 60] instead of [32, 7, 60]). Why is that?
Answer 1
Score: 1
If possible, you'll want to use setInputType, which will set up the preprocessors for you.
Here's an example configuration going from LSTM to Dense:
MultiLayerConfiguration conf1 = new NeuralNetConfiguration.Builder()
.trainingWorkspaceMode(wsm)
.inferenceWorkspaceMode(wsm)
.seed(12345)
.updater(new Adam(0.1))
.list()
.layer(new LSTM.Builder().nIn(3).nOut(3).dataFormat(rnnDataFormat).build())
.layer(new DenseLayer.Builder().nIn(3).nOut(3).activation(Activation.TANH).build())
.layer(new RnnOutputLayer.Builder().nIn(3).nOut(3).activation(Activation.SOFTMAX).dataFormat(rnnDataFormat)
.lossFunction(LossFunctions.LossFunction.MCXENT).build())
.setInputType(InputType.recurrent(3, rnnDataFormat))
.build();
The RNN format is:
import org.deeplearning4j.nn.conf.RNNFormat;
This is an enum that specifies what your data format should be (channels first or channels last).
From the javadoc:
/**
- NCW = "channels first" - arrays of shape [minibatch, channels, width]
- NWC = "channels last" - arrays of shape [minibatch, width, channels]
- "width" corresponds to sequence length and "channels" corresponds to sequence item size.
*/
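For the question's input of shape [32, 7, 60] = [minibatch, features, timesteps], that is the NCW layout, so (as an illustrative sketch) the input type for the question's data would be declared as:

// 7 features per timestep ("channels"), sequence length 60 ("width")
InputType.recurrent(7, RNNFormat.NCW)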
Source here: https://github.com/eclipse/deeplearning4j/blob/1930d9990810db6214829c716c2ae7eb7f59cd13/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/conf/RNNFormat.java#L21
More here in our tests: https://github.com/eclipse/deeplearning4j/blob/1930d9990810db6214829c716c2ae7eb7f59cd13/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/layers/recurrent/TestTimeDistributed.java#L58
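Applied to the network in the question, a minimal sketch might look like the following. It is an assumption based on the shapes reported above: 7 input features in NCW layout and per-timestep labels with 6 classes, which calls for an RnnOutputLayer so the output keeps its [minibatch, classes, timesteps] shape and lines up with the [32, 6, 60] labels instead of the flattened [32 * 60, 6] = [1920, 6] preOutput from the error message:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.RNNFormat;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(123)
        .weightInit(WeightInit.XAVIER)
        .updater(new Adam(0.1))
        .list()
        // 7 input features per timestep, as in the [32, 7, 60] input
        .layer(new LSTM.Builder().activation(Activation.TANH)
                .nIn(7).nOut(120).dataFormat(RNNFormat.NCW).build())
        .layer(new DenseLayer.Builder().activation(Activation.RELU).nIn(120).nOut(1000).build())
        .layer(new DenseLayer.Builder().activation(Activation.RELU).nIn(1000).nOut(20).build())
        // RnnOutputLayer keeps the time dimension, matching the per-timestep labels
        .layer(new RnnOutputLayer.Builder().nIn(20).nOut(6)
                .activation(Activation.SOFTMAX).dataFormat(RNNFormat.NCW)
                .lossFunction(LossFunctions.LossFunction.MCXENT).build())
        // adds the required Rnn/FeedForward preprocessors automatically
        .setInputType(InputType.recurrent(7, RNNFormat.NCW))
        .build();

With setInputType(...) in place, the manual inputPreProcessor(1, new RnnToFeedForwardPreProcessor()) from the question is no longer needed; the RnnToFeedForward and FeedForwardToRnn preprocessors are inserted automatically.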