Why is the size of a JPG file bigger than expected?

Question

I generate a grayscale image and save it in jpg format.

import os

import cv2
import numpy as np
from PIL import Image

SCENE_WIDTH = 28
SCENE_HEIGHT = 28

# draw random noise: each pixel is 0 or 255 with probability p
p, n = 0.5, SCENE_WIDTH*SCENE_HEIGHT
scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255
scene_noise = scene_noise.astype(np.uint8)

# size of the raw pixel data in memory
print('%d bytes' % (scene_noise.size * scene_noise.itemsize)) # 784 bytes

# save with OpenCV
cv2.imwrite('scene_noise.jpg', scene_noise)
print('noise: ', os.path.getsize("scene_noise.jpg")) # 1549 bytes

# save with Pillow
im = Image.fromarray(scene_noise)
im.save('scene_noise2.jpg')
print('noise2: ', os.path.getsize("scene_noise2.jpg")) # 1017 bytes

When I change from:

scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255

to:

scene_noise = np.random.binomial(255, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))

The file size drops by almost half, to about 775 bytes.

Can you please explain why the JPG file is bigger than the raw version, and why the size decreases when I change colors from black and white to the full grayscale spectrum?

cv2.__version__.split(".") # ['4', '1', '2']

Answer 1

Score: 1

Two things here:

  • can you explain why the JPEG file is bigger than the raw version?

> The sizes differ because you are not comparing like with like. The first number is the size of the pixel data held in a NumPy array (28 × 28 pixels × 1 byte = 784 bytes); the second is the size of a JPEG file on disk. Besides the compressed pixel data, a JPEG file carries fixed overhead such as headers, quantization tables and Huffman tables, which a NumPy array neither stores nor needs. For an image this small, that overhead makes up a large share of the file.
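To get a feel for how much of a small JPEG is overhead, here is a minimal sketch that encodes a flat 28×28 image in memory with Pillow and counts the bytes up to and including the start-of-scan header. The helper names `jpeg_bytes` and `header_size` are made up for this illustration, and the quality value of 75 is set explicitly rather than taken from the question. The parser assumes a plain baseline file with no padding bytes between segments.

```python
import io

import numpy as np
from PIL import Image


def jpeg_bytes(array, quality=75):
    """Encode a 2-D uint8 array as a grayscale JPEG in memory (hypothetical helper)."""
    buf = io.BytesIO()
    Image.fromarray(array).save(buf, format="JPEG", quality=quality)
    return buf.getvalue()


def header_size(data):
    """Return the number of bytes up to and including the start-of-scan header."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    pos = 2
    while True:
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + length
        if marker == 0xDA:  # SOS: the compressed pixel data starts here
            return pos


flat = np.full((28, 28), 128, dtype=np.uint8)  # a featureless gray image
data = jpeg_bytes(flat)
raw_size = flat.size * flat.itemsize
overhead = header_size(data)

print("raw pixel data:", raw_size, "bytes")
print("whole JPEG file:", len(data), "bytes")
print("headers and tables:", overhead, "bytes")
```

Everything counted in `overhead` is written regardless of the image content, so it weighs heavily on a 784-pixel image and becomes negligible on a photograph of several megapixels.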

  • can you explain why the size decreases when I change colours from black and white to a full grayscale spectrum?

> This is due to how JPEG encoding works. I will not go into much detail (I am in no way a specialist in this topic), but if you really want to understand what happens, I highly suggest reading up on it; the Wikipedia article on JPEG documents it well. The general idea is that the more contrast there is between neighbouring pixels, the bigger the file. A pure black-and-white noise image jumps between 0 and 255 from pixel to pixel, whereas np.random.binomial(255, 0.5, n) produces values clustered around 127 (standard deviation of about 8), so adjacent pixels differ far less.
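The contrast argument can be checked without writing any files. The sketch below is a simplified stand-in for the transform step of JPEG, not a full encoder: it applies an 8×8 DCT to each complete block, divides by the standard luminance quantization table from Annex K of the JPEG specification (used unscaled, which as far as I know matches quality 50 in libjpeg), and counts how many coefficients survive rounding. The function names and the fixed seed are assumptions for the example, and the partial blocks at the edge of the 28×28 image are simply ignored.

```python
import numpy as np

# Standard JPEG luminance quantization table (Annex K), unscaled.
Q = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
], dtype=np.float64)


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # the DC row has a different scale
    return c


def nonzero_coefficients(image):
    """Count quantized DCT coefficients that stay non-zero over all full 8x8 blocks."""
    c = dct_matrix()
    h, w = image.shape
    count = 0
    for y in range(0, h - h % 8, 8):
        for x in range(0, w - w % 8, 8):
            # Level shift by 128, as JPEG does before the transform.
            block = image[y:y + 8, x:x + 8].astype(np.float64) - 128.0
            coeffs = c @ block @ c.T
            count += int(np.count_nonzero(np.round(coeffs / Q)))
    return count


rng = np.random.default_rng(0)  # fixed seed so runs are repeatable
shape = (28, 28)
binary_noise = (rng.binomial(1, 0.5, shape) * 255).astype(np.uint8)
gray_noise = rng.binomial(255, 0.5, shape).astype(np.uint8)

binary_nonzero = nonzero_coefficients(binary_noise)
gray_nonzero = nonzero_coefficients(gray_noise)

print("binary noise: std %.1f, non-zero coefficients %d" % (binary_noise.std(), binary_nonzero))
print("gray noise:   std %.1f, non-zero coefficients %d" % (gray_noise.std(), gray_nonzero))
```

Each non-zero coefficient has to be entropy-coded, while runs of zeros are stored very cheaply, so the image with fewer surviving coefficients should end up as the smaller file.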

huangapple
  • Published on January 7, 2020, 02:18:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59617006.html