Reconstructing a 2D image from a 1D pixel sequence

Question

Imagine there's an image of size n by m that has been flattened into a list of pixels of length n*m. How can the original image be restored if n and m are unknown?

The idea is not just to list all pairs of divisors of the length, but to find some metric that is highest when the sampling width is closest to the original width of the image. It is similar to adjusting the horizontal hold on an old analog TV.

I'm sure there is an algorithm that does exactly this, but I couldn't find it.

Answer 1

Score: 2

Brute-forcing it appears to work.

I try all combinations (h, w) where n is the number of pixels and h and w are divisors such that n == h * w.

I return the pair of divisors that minimises the average difference between the value of a pixel on one row and the corresponding pixel on the next row.

Note that the value of a pixel is a triplet representing the three colours. I did not even try to be smart about what "difference between two colours" might mean. I just subtracted red from red, green from green, blue from blue, and averaged it all with .mean() because I was too lazy to do a sum of squares or anything smart.

On an example picture of a bear, it works.

from imageio.v3 import imread
from sympy import divisors
import numpy as np

img = np.array(imread('Downloads/bear.png'))
height, width, n_channels = img.shape  # (90, 136, 3)
flatimg = img.reshape((height * width, n_channels))  # (12240, 3)

def find_dims(flatimg):
    n_pixels, n_channels = flatimg.shape  # (12240, 3)
    # Drop the trivial divisors 1 and n_pixels
    divs = divisors(n_pixels)[1:-1]
    # Pair each divisor h with w = n_pixels // h and keep the pair with the lowest score
    height, width = min(
        zip(divs, reversed(divs)),
        key=lambda dim: np.diff(flatimg.reshape((*dim, n_channels)), axis=0).mean()
    )
    return height, width

print(find_dims(flatimg))
# (90, 136)

[Images: the bear picture reshaped to 80x153, 85x144, 90x136, 102x120, 120x102 and 136x90]
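
One caveat with the snippet above: if the pixels are stored as uint8 (typical for PNG data), np.diff wraps around modulo 256, and without an absolute value positive and negative differences can cancel in the mean. Below is a minimal sketch of a variant that casts to a signed type and uses the mean absolute difference instead. The names divisor_pairs and find_dims_abs are made up for this illustration, and the divisors are computed in plain Python rather than with sympy. It has not been run on the bear picture.

```python
import numpy as np

def divisor_pairs(n):
    """All (height, width) pairs with height * width == n, excluding 1 x n and n x 1."""
    return [(h, n // h) for h in range(2, n) if n % h == 0]

def find_dims_abs(flatimg):
    """Guess (height, width) by minimising the mean absolute row-to-row difference.

    flatimg is assumed to have shape (n_pixels, n_channels).
    """
    n_pixels, n_channels = flatimg.shape
    # Cast to a signed type first: differences of uint8 values wrap around modulo 256.
    signed = flatimg.astype(np.int32)

    def score(dim):
        rows = signed.reshape((*dim, n_channels))
        # Absolute difference between each pixel and the pixel directly below it
        return np.abs(np.diff(rows, axis=0)).mean()

    # Raises ValueError if n_pixels is prime (no non-trivial divisor pairs)
    return min(divisor_pairs(n_pixels), key=score)
```

Note that widths that are exact multiples of the true width also give small differences, since they compare rows two or more lines apart, so the scores of those candidates can be close to the best one on very smooth images.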

Answer 2

Score: 2

Try the divisors of the image area that give reasonable aspect ratios (if the image wasn't cropped, the ratio should not exceed 16:9).

For each combination, compute the sum of the vertical gradient magnitude across the whole image. It should be at its minimum for the right choice.
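
A rough sketch of how this could look with NumPy, under a few assumptions: the vertical gradient is approximated by the difference between vertically adjacent pixels, the 16:9 limit is applied in both orientations, and the function names (candidate_dims, vertical_gradient_sum, guess_dims) are invented for the example.

```python
import numpy as np

def candidate_dims(n_pixels, max_ratio=16 / 9):
    """List (height, width) divisor pairs whose aspect ratio is at most max_ratio."""
    dims = []
    for h in range(1, n_pixels + 1):
        if n_pixels % h == 0:
            w = n_pixels // h
            if max(h, w) / min(h, w) <= max_ratio:
                dims.append((h, w))
    return dims

def vertical_gradient_sum(flat, height, width):
    """Sum of absolute differences between vertically adjacent pixels."""
    # Works for shape (n_pixels,) or (n_pixels, n_channels)
    img = np.asarray(flat, dtype=np.float64).reshape(height, width, -1)
    return np.abs(np.diff(img, axis=0)).sum()

def guess_dims(flat, max_ratio=16 / 9):
    """Return the candidate (height, width) with the smallest vertical gradient sum."""
    n_pixels = np.asarray(flat).shape[0]
    return min(candidate_dims(n_pixels, max_ratio),
               key=lambda hw: vertical_gradient_sum(flat, *hw))
```

Restricting the candidates by aspect ratio also removes most widths that are multiples or fractions of the true width, which would otherwise compete with the right answer.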

Answer 3

Score: 1

If you look at raw2tiff.c in the tools directory of libtiff, it seeks to around the middle of a raw image file, guesses different scanline widths, and reads pairs of lines until the correlation between the two is maximised; it then uses that as the width. I assume the theory is that most scanlines in a photo are pretty similar, i.e. highly correlated with each other.
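
The idea can be sketched in a few lines of NumPy. This is only an illustration of the approach as described above, not a port of raw2tiff.c: the function name guess_width_by_correlation and its candidate_widths parameter are hypothetical, and the input is assumed to be a 1D sequence of grayscale values (for colour data, one could average the channels first).

```python
import numpy as np

def guess_width_by_correlation(samples, candidate_widths):
    """Pick the width whose two adjacent lines near the middle correlate best."""
    samples = np.asarray(samples, dtype=np.float64)
    middle = samples.size // 2
    best_width, best_corr = None, -np.inf
    for width in candidate_widths:
        # Take two consecutive lines starting near the middle of the data
        start = min(middle, samples.size - 2 * width)
        if start < 0:
            continue  # not enough data for two full lines of this width
        line_a = samples[start:start + width]
        line_b = samples[start + width:start + 2 * width]
        if line_a.std() == 0 or line_b.std() == 0:
            continue  # correlation is undefined for a constant line
        corr = np.corrcoef(line_a, line_b)[0, 1]
        if corr > best_corr:
            best_width, best_corr = width, corr
    return best_width
```

The candidate list matters here: very small widths give meaningless correlations from a handful of samples, and multiples of the true width can score as well as the true width, so it helps to pass only widths with plausible aspect ratios.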

huangapple
  • Posted on 2023-02-24 06:31:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75550990.html