What's the difference between a convolutional autoencoder (CAE) and a convolutional neural network (CNN)?



I'm working on a bachelor's project that involves using a convolutional autoencoder [1]. I used the code from this blog. The goal was to make a model that takes as input a pixelated image containing text and outputs a prediction of the image with the text depixelated. The only change I made from the "convolutional autoencoder" code in the reference is that I also gave labels to my training process. After training several models, I concluded that it is pretty easy to reconstruct pixelated text.

Now, while I'm writing a paper about the project, I'm really struggling to understand what exactly a convolutional autoencoder is and what makes it a convolutional autoencoder.

I'm completely new to any type of ML. When I did research on autoencoders in general, I found that autoencoders are neural networks that aim to minimize the difference between the output and the input. What makes it a "convolutional" autoencoder is the fact that it uses convolutions in its encoder to detect edges, etc.

- But in my case I depixelated pixelated text in images, so the output is not meant to be as close as possible to the input image, but rather to be the depixelated version.

- Another thing is that everywhere I look for answers, it is stated that autoencoders are primarily used for unsupervised learning, while in my case I use supervised learning, since I pass the labels to the training process.

- Lastly, since I don't try to minimize the difference between the output and input image, AND I use supervised learning, then what exactly is the difference between a convolutional autoencoder and a convolutional neural network?

[1] https://blog.keras.io/building-autoencoders-in-keras.html
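The training pairs described above (pixelated input, clean target) can be generated by block-averaging each image. The `pixelate` helper below is a hypothetical NumPy sketch of that corruption step, not the code actually used in the project:

```python
import numpy as np

def pixelate(img: np.ndarray, block: int = 4) -> np.ndarray:
    """Pixelate a 2D grayscale image by replacing each block x block
    patch with its mean value (edges not divisible by `block` are trimmed)."""
    h, w = img.shape
    out = img[:h - h % block, :w - w % block]
    hb, wb = out.shape[0] // block, out.shape[1] // block
    # Average within each block, then expand back to full resolution.
    blocks = out.reshape(hb, block, wb, block).mean(axis=(1, 3))
    return np.repeat(np.repeat(blocks, block, axis=0), block, axis=1)

clean = np.random.rand(28, 28)          # stand-in for a clean text image
corrupt = pixelate(clean, block=4)      # network input; `clean` is the target
```

Training then maps `corrupt` back to `clean`, which is exactly the denoising setup: the "label" is just the uncorrupted image.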

Answer 1

Score: 1


An auto-encoder (AE) is a self-supervised neural network model that aims to learn a representation of its inputs, e.g. images. Let me explain:

  • The AE is made of two NNs: the encoder f(x) -> z and the decoder g(z) -> x; a third important piece is z, called the latent space or bottleneck.
  • AEs aim at learning z, whose size is smaller than that of x, by minimizing a reconstruction objective. Basically, the latent space is a compact representation of x.
  • There are many variations of AEs; one is called the Denoising AE (DAE), and it is the one you are using, because a DAE aims at recovering a "clean" version of an x that has been corrupted by some noise.
  • A CAE is an AE in which the encoder and decoder are CNNs. Actually, there are stricter definitions which require z to be a 3D tensor (4D if you also count the batch size), called the convolutional bottleneck (because it has the same three dimensions as an image).
  • The AE is usually self-supervised because its own inputs serve as the supervision signal: the images themselves.

So a CAE is an AE with convolutional layers and a convolutional bottleneck. A CNN, instead, can be seen as a building block for other kinds of models too: classifiers, regressors, auto-encoders themselves, and others.
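The structure described above can be sketched in Keras, in the spirit of the blog post referenced in the question. The layer sizes and the 28x28 grayscale input shape are illustrative assumptions, not the exact code from the blog:

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(28, 28, 1))  # grayscale image x

# Encoder f(x) -> z: convolutions + downsampling
h = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
h = layers.MaxPooling2D(2)(h)
h = layers.Conv2D(8, 3, activation="relu", padding="same")(h)
z = layers.MaxPooling2D(2)(h)  # convolutional bottleneck: a 7x7x8 tensor

# Decoder g(z) -> x: convolutions + upsampling back to the input size
h = layers.Conv2D(8, 3, activation="relu", padding="same")(z)
h = layers.UpSampling2D(2)(h)
h = layers.Conv2D(16, 3, activation="relu", padding="same")(h)
h = layers.UpSampling2D(2)(h)
outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(h)

cae = models.Model(inputs, outputs)
cae.compile(optimizer="adam", loss="binary_crossentropy")

# Denoising setup: corrupted (pixelated) images in, clean images as targets.
# cae.fit(x_pixelated, x_clean, epochs=..., batch_size=...)
```

Note that z keeps the three image-like dimensions (height, width, channels), which is what makes this a convolutional bottleneck rather than a flat latent vector.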

huangapple
  • Posted on May 21, 2023, 00:00:16
  • When reposting, please retain the link to this article: https://go.coder-hub.com/76296117.html