2023-02-24 06:31:49

Reconstructing a 2D image from a 1D pixel sequence

Question

Imagine there's an image of size n by m. Then it's flattened into a list of pixels of length n*m. How can the original image be restored if n and m are unknown?

The idea is not to find all pairs of divisors of the length, but to find some metric that is highest when the sampling width is closest to the original width of the image. Similar to adjusting the horizontal hold on an old analog TV.

I'm sure there is an algorithm which is supposed to do this exact thing but I couldn't find it.

Answer 1

Score: 2

Brute-forcing it appears to work.

I try all combinations (h, w) where n is the number of pixels, and h and w are divisors such that n == h * w.

I return the pair of divisors that minimizes the average difference between the value of a pixel on one row and the corresponding pixel on the next row.

Note that the value of a pixel is a triplet representing the three colors. I did not even try to be smart about what "difference between two colors" might mean. I just subtracted red from red, green from green, blue from blue, and averaged it all with .mean() because I was too lazy to do a sum of squares or anything smart.

On an example picture of a bear, it works.

from imageio.v3 import imread
from sympy import divisors
import numpy as np

# Load the image and flatten it to an array of shape (n_pixels, n_channels).
img = np.array(imread('Downloads/bear.png'))
height, width, n_channels = img.shape  # (90, 136, 3)
flatimg = img.reshape((height*width, n_channels))  # (12240, 3)

def find_dims(flatimg):
    n_pixels, n_channels = flatimg.shape  # (12240, 3)
    # All divisors of n_pixels except the trivial ones (1 and n_pixels).
    divs = divisors(n_pixels)[1:-1]
    # Pair each divisor with its cofactor and keep the (height, width) pair
    # whose row-to-row difference has the smallest mean.
    height, width = min(
        ((height, width) for height, width in zip(divs, reversed(divs))),
        key=lambda dim: np.diff(flatimg.reshape((*dim, n_channels)), axis=0).mean()
    )
    return height, width

print(find_dims(flatimg))
# (90, 136)

[Figure: the flattened bear image reshaped at candidate dimensions 80x153, 85x144, 90x136, 102x120, 120x102 and 136x90]
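One caveat when adapting the snippet above: if the pixel array has dtype uint8 (typical for 8-bit images, though that is an assumption about the input), np.diff subtracts in uint8 and negative differences wrap around modulo 256. The following is only a sketch of one possible variant, not the answer's own method: it swaps in the mean absolute difference, computed after casting to a signed type. The helper name row_diff_score is hypothetical.

```python
import numpy as np

def row_diff_score(flatimg, height, width):
    """Mean absolute difference between vertically adjacent pixels.

    flatimg is assumed to have shape (n_pixels, n_channels).
    """
    n_channels = flatimg.shape[1]
    # Cast to a signed type first so that uint8 subtraction cannot wrap around.
    rows = flatimg.reshape((height, width, n_channels)).astype(np.int32)
    return np.abs(np.diff(rows, axis=0)).mean()

# Demonstration of the wrap-around: 5 - 10 in uint8 becomes 251, not -5.
a = np.array([10, 5], dtype=np.uint8)
print(np.diff(a))                    # [251]
print(np.diff(a.astype(np.int32)))   # [-5]
```

Such a score could be passed as the key to min() in place of the lambda, for example key=lambda dim: row_diff_score(flatimg, *dim).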


Answer 2

Score: 2

Try the divisors of the image area, with reasonable ratios (if the image wasn't cropped, the ratio should not exceed 16:9).

For the different combinations, compute the sum of the vertical gradient magnitude across the whole image. It should be minimum for the right choice.
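A minimal sketch of how this could look, under a few assumptions of mine: the "vertical gradient magnitude" is approximated by the absolute difference between vertically adjacent pixels, both landscape and portrait orientations are tried, and the names candidate_dims and guess_dims are made up for illustration. The demo uses a synthetic image rather than a real photo.

```python
import math
import numpy as np

def candidate_dims(n_pixels, max_ratio=16 / 9):
    """Yield (height, width) factor pairs whose aspect ratio is at most max_ratio."""
    for h in range(1, math.isqrt(n_pixels) + 1):
        if n_pixels % h == 0:
            w = n_pixels // h
            if w / h <= max_ratio:
                yield h, w        # landscape (or square)
                if w != h:
                    yield w, h    # portrait

def guess_dims(flat, max_ratio=16 / 9):
    """Return the (height, width) with the smallest summed vertical gradient.

    flat may be 1-D (grayscale) or of shape (n_pixels, n_channels).
    Raises ValueError if no factor pair satisfies the ratio limit.
    """
    flat = np.asarray(flat, dtype=np.float64)  # signed type, so differences cannot wrap
    flat = flat.reshape(flat.shape[0], -1)     # normalise to (n_pixels, n_channels)

    def vertical_gradient_sum(dims):
        h, w = dims
        rows = flat.reshape(h, w, -1)
        return np.abs(np.diff(rows, axis=0)).sum()

    return min(candidate_dims(flat.shape[0], max_ratio), key=vertical_gradient_sum)

# Synthetic 90x136 test image: a smooth diagonal ramp, then flattened.
yy, xx = np.mgrid[0:90, 0:136]
print(guess_dims((xx + yy).ravel()))
# (90, 136)
```

Restricting the ratio keeps the candidate list short; for 12240 pixels only six (height, width) pairs survive the 16:9 limit.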

Answer 3

Score: 1

If you look at raw2tiff.c in the tools directory of libtiff, it seeks to around the middle of a raw image file, guesses different scanline widths, and reads pairs of lines until the correlation between the two is maximised; it then uses that as the width. I assume the theory is that most scanlines in a photo are pretty similar, i.e. highly correlated with each other.
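The idea described above can be sketched roughly as follows. This is not a port of raw2tiff.c; it is a loose illustration that assumes single-channel data already loaded into memory, a caller-supplied list of candidate widths, and the Pearson correlation from np.corrcoef as the similarity measure. The function name guess_width_by_correlation is hypothetical.

```python
import numpy as np

def guess_width_by_correlation(flat, widths):
    """Return the candidate width whose two middle 'scanlines' correlate best.

    flat is assumed to be single-channel pixel data; widths is an iterable of
    candidate scanline widths. Returns None if no candidate could be scored.
    """
    flat = np.asarray(flat, dtype=np.float64).ravel()
    best_width, best_corr = None, -np.inf
    for w in widths:
        if 2 * w > flat.size:
            continue  # not enough data for two lines of this width
        # Take two consecutive runs of w pixels around the middle of the data.
        start = flat.size // 2 - w
        line_a = flat[start:start + w]
        line_b = flat[start + w:start + 2 * w]
        if line_a.std() == 0 or line_b.std() == 0:
            continue  # correlation is undefined for a constant line
        corr = np.corrcoef(line_a, line_b)[0, 1]
        if corr > best_corr:
            best_width, best_corr = w, corr
    return best_width

# Synthetic 90x136 image: every row is the same random profile plus mild noise.
rng = np.random.default_rng(0)
profile = rng.uniform(0, 255, 136)
img = profile[None, :] + rng.normal(0, 2, (90, 136))
print(guess_width_by_correlation(img.ravel(), [85, 90, 102, 120, 136, 144, 153]))
```

Note that pixels separated by a multiple of the true width are also vertically aligned, so with this sketch it is probably wise to limit the candidates to a plausible range of widths.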
