March 4, 2023 09:13:37

Why does the image size differ when vertical vs. horizontal?

Question

I tried to create a random image with PIL, as in this example:

import numpy
from PIL import Image
a = numpy.random.rand(48,84)
img = Image.fromarray(a.astype('uint8')).convert('1')
print(len(img.tobytes()))

This particular code outputs 528.
When we swap the dimensions of the numpy array:

a = numpy.random.rand(84,48)

The output we get is 504.
Why is that?

I expected the byte count to be the same, since both numpy arrays have the same number of elements.

Answer 1

Score: 5

When you call tobytes() on the boolean array*, the data is encoded row by row, eight pixels per byte, with each row padded to a whole number of bytes. In your second example, there are 48 booleans in each row of img, so each row fits in exactly 6 bytes (48 bits): 6 bytes * 84 rows = 504 bytes. In your first example, there are 84 pixels per row, which is not divisible by 8. The encoder therefore represents each row with 11 bytes (88 bits), leaving 4 bits of padding at the end of each row, and the total size is 11 bytes * 48 rows = 528 bytes.
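
The arithmetic above can be sketched in a few lines of plain Python. The helper name packed_size is hypothetical (it is not part of PIL), and the sketch assumes exactly what this answer describes: one bit per pixel, with each row rounded up to a whole number of bytes.

```python
def packed_size(width: int, height: int) -> int:
    """Bytes needed when each row of `width` 1-bit pixels is padded to whole bytes.

    Hypothetical helper for illustration; not a PIL function.
    """
    bytes_per_row = (width + 7) // 8  # ceiling division: round each row up
    return bytes_per_row * height


# A numpy array of shape (48, 84) has 48 rows and 84 columns,
# so the image is 84 pixels wide and 48 pixels tall.
print(packed_size(84, 48))  # 11 bytes per row * 48 rows = 528
print(packed_size(48, 84))  # 6 bytes per row * 84 rows = 504
```

Note that the numpy shape is (rows, columns), so swapping the two numbers swaps the row length, which is the only value that gets rounded up.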

If you try a range of input shapes for a 2D boolean array, you will find that when the number of elements per row is divisible by 8, the encoding takes exactly width * height / 8 bytes. When the row length is not divisible by 8, the encoding contains more bytes, because each row has to be padded with between 1 and 7 bits.
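
One way to run that experiment is sketched below. It assumes Pillow is installed, builds blank mode "1" images directly with Image.new instead of going through numpy, and compares len(img.tobytes()) against the padded-row formula. The helper expected_size and the list mismatches are names made up for this sketch.

```python
from PIL import Image


def expected_size(width: int, height: int) -> int:
    # Hypothetical helper: each row of 1-bit pixels rounded up to whole bytes.
    return ((width + 7) // 8) * height


mismatches = []
for width in range(1, 33):
    for height in (1, 5, 48):
        img = Image.new("1", (width, height))  # blank bilevel image
        actual = len(img.tobytes())
        if actual != expected_size(width, height):
            mismatches.append((width, height, actual))

# An empty list means every tested shape matched the padded-row formula.
print(mismatches)
```

The pixel values do not matter for this check; only the image dimensions determine the length of the encoded data.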

In summary: ideally, we would store eight boolean values per byte, but the row length isn't always divisible by 8, and the encoder serializes the array row by row, so every row has to start on a byte boundary.

Edit for clarification: *the PIL.Image object in mode "1" (binary or "bilevel" image) effectively represents a boolean array. Converting to mode "1" turns the original image (in this case, the numpy array a) into a binary image. By default Pillow does this with Floyd-Steinberg dithering; simple thresholding is used only when dithering is disabled.
