2020-01-07 02:18:44

Why is the size of a JPG file bigger than expected?

Question

I generate a grayscale image and save it in jpg format.

import os

import cv2
import numpy as np

SCENE_WIDTH = 28
SCENE_HEIGHT = 28

# draw random noise
p, n = 0.5, SCENE_WIDTH*SCENE_HEIGHT
scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255
scene_noise = scene_noise.astype(np.uint8)

n = scene_noise
print('%d bytes' % (n.size * n.itemsize)) # 784 bytes

cv2.imwrite('scene_noise.jpg', scene_noise)
print('noise: ', os.path.getsize("scene_noise.jpg")) # 1549 bytes

from PIL import Image
im = Image.fromarray(scene_noise)
im.save('scene_noise2.jpg')
print('noise2: ', os.path.getsize("scene_noise2.jpg")) # 1017 bytes

When I change from:

scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255

to:

scene_noise = np.random.binomial(255, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))

The file size is almost halved, to about 775 bytes.

Can you please explain why the JPG file is bigger than the raw version, and why the size decreases when I change colors from black and white to the full grayscale spectrum?

cv2.__version__.split(".") # ['4', '1', '2']

Answer 1

Score: 1

Two things here:

Can you explain why the JPEG file is bigger than the raw version?

> The sizes differ because you are not comparing the same things. The first object is a NumPy array, and the second is a JPEG file. The 784 bytes you computed count only the pixel values, while the JPEG file also carries overhead (headers, quantization tables, Huffman tables) that a NumPy array neither stores nor needs. For a 28x28 image that fixed overhead is large relative to the pixel data, and random noise gives the encoder almost nothing to compress.
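
To get a feel for that fixed overhead, here is a minimal sketch that encodes a single gray pixel with Pillow and compares it with the one byte of raw data. The helper name `jpeg_bytes` is made up for this illustration, and the exact byte count depends on the encoder and its settings, so no particular number should be expected.

```python
import io

import numpy as np
from PIL import Image


def jpeg_bytes(pixels: np.ndarray) -> bytes:
    """Encode a 2-D uint8 array as a grayscale JPEG in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    Image.fromarray(pixels).save(buffer, format="JPEG")
    return buffer.getvalue()


# A single gray pixel: exactly 1 byte of raw data.
one_pixel = np.full((1, 1), 128, dtype=np.uint8)
encoded = jpeg_bytes(one_pixel)

raw_size = one_pixel.size * one_pixel.itemsize
print("raw bytes: ", raw_size)
print("JPEG bytes:", len(encoded))

# Every JPEG stream starts with the SOI marker and ends with the EOI marker;
# the tables and headers in between are what make even a 1-pixel file large.
print("starts with SOI:", encoded[:2] == b"\xff\xd8")
print("ends with EOI:  ", encoded[-2:] == b"\xff\xd9")
```

Whatever the 1-pixel file weighs is roughly what any small grayscale JPEG from the same encoder pays before a single pixel is stored.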

Can you explain why the size decreases when I change colours from black and white to a full grayscale spectrum?

> This is due to how JPEG encoding works. I will not go into much detail (I am in no way a specialist on this topic), but it is well documented in the Wikipedia JPEG article, which is worth reading if you want to understand everything that happens. The general idea is that the more contrast there is between neighbouring pixels, the bigger the file. A picture that is only black and white jumps between 0 and 255 all the time, whereas a grayscale picture does not usually change that much between adjacent pixels. That is especially true here: np.random.binomial(255, 0.5, n) does not cover the full grayscale range but clusters around 127.5 with a standard deviation of about 8 (the square root of 255 * 0.5 * 0.5), so the second image is low-contrast noise.
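
The contrast argument can be checked with a short sketch like the one below. It assumes Pillow with its default JPEG settings rather than the OpenCV call from the question, uses an arbitrary fixed seed, and the helper `jpeg_size` is a name invented for this example, so the byte counts will not match the ones reported above.

```python
import io

import numpy as np
from PIL import Image

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
shape = (28, 28)

# Two-level noise: every pixel is either 0 or 255.
binary_noise = (rng.binomial(1, 0.5, shape) * 255).astype(np.uint8)
# binomial(255, 0.5): values cluster around 127.5 instead of spanning 0..255.
narrow_noise = rng.binomial(255, 0.5, shape).astype(np.uint8)


def jpeg_size(pixels: np.ndarray) -> int:
    """Return the size in bytes of the array encoded as an in-memory JPEG."""
    buffer = io.BytesIO()
    Image.fromarray(pixels).save(buffer, format="JPEG")
    return len(buffer.getvalue())


binary_size = jpeg_size(binary_noise)
narrow_size = jpeg_size(narrow_noise)

print("std of 0/255 noise:     %.1f" % binary_noise.std())
print("std of binomial(255):   %.1f" % narrow_noise.std())
print("JPEG bytes, 0/255 noise:    ", binary_size)
print("JPEG bytes, binomial(255):  ", narrow_size)
```

The spread of pixel values is the thing to watch: the smaller it is, the more frequency coefficients are rounded to zero during quantization, and the shorter the encoded stream becomes.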
