Adding images in raw PDF

Question

I am trying to make PDFs manually. I have the basics, but the one thing I cannot figure out is images.

What I am trying right now, as a start, is adding a simple image in the form of binary code. The binary code has a length of 9 and should be able to represent a 3x3 black and white image. The code is 111000111 (this should just be a black horizontal line through the middle). Of course this is oversimplified, not compressed and not usable for more complex images, but I am desperate and just want to display SOMETHING :).

I hope someone can help and teach me more about this topic.

My new PDF (after johnwhitington's comment, except for point b):

%PDF-1.7

1 0 obj
  << 

     /Pages 2 0 R
     /Type /Catalog  >>
endobj

2 0 obj
  << 

     /Type /Pages
     /Count 1
     /Kids [3 0 R]  >>
endobj

3 0 obj
  << 

    /Type /Page
    /Parent 2 0 R  
    /Contents [4 0 R]
    /MediaBox [500 500]

    /Resources 
    <<
      /XObject 
      <<
        /Im1 5 0 R
      >>
    >>
  >>
endobj


4 0 obj 
  <<

  >>
    stream
      q
        1 0 0 1 100 100 cm
        /Im1 Do
      Q
    endstream
endobj


5 0 obj 
  <<
    /Type /XObject
    /Subtype /Image
    /Height 3
    /Width 3
    /BitsPerComponent 1
    /Length 9
    /ColorSpace /DeviceGray
  >>
    stream111000111endstream
endobj


trailer
  << /Root 1 0 R
   /Size 7
  >>
%%EOF

My old PDF:

%PDF-1.7

1 0 obj
  << 

     /Pages 2 0 R
     /Type /Catalog  >>
endobj

2 0 obj
  << 

     /Type /Pages
     /Count 1
     /Kids [3 0 R]  >>
endobj

3 0 obj
  << 

    /Type /Page
    /Parent 2 0 R  
    /Contents [4 0 R]

    /Resources 
    <<
      /ProcSet [/PDF /ImageB]    
      /XObject 
      <<
        /Im1 5 0 R
      >>
    >>
  >>
endobj


4 0 obj 
  <<

  >>
    stream
      q
        1 0 0 1 0 0 cm
        /Im1 DO
      Q
    endstream
endobj


5 0 obj 
  <<
    /Type /XObject
    /Subtype /Image
    /Height 3
    /Width 3
    /BitsPerComponent 1
    /Length 9
    /ColorSpace /DeviceGray
  >>
    stream111000111endstream
endobj


trailer
  << /Root 1 0 R
   /Size 7
  >>
%%EOF

Answer 1

Score: 2

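@johnwhitington has covered the basics of image programming in a PDF.

An image is usually placed and scaled inside a q ... Q block in the page's content stream. It is often written over four lines, but this working single line does the same:

q 192 0 0 192 100 100 cm /Img0 Do Q

In the cm operands, the two 192 values are the horizontal and vertical scale (dx and dy), the two 0 values are the "skew" terms, and 100 100 is the x y position. /Img0 is the image's resource name and Do is the operator that paints it. None of this describes the real pixel size of Img0: an image always fills a 1 x 1 unit square, and the cm matrix stretches that square to the size you want on the page.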
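As a rough illustration of what that matrix does (a sketch, not part of the original answer): the six cm operands a b c d e f map a point (x, y) to (a*x + c*y + e, b*x + d*y + f), and the image is painted into the unit square. The helper name apply_cm is made up for this example.

```python
def apply_cm(matrix, point):
    """Map a point through a PDF 'a b c d e f cm' matrix."""
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)

# Corners of the 1 x 1 unit square that an image XObject is painted into.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# The matrix from the answer: scale by 192 and move to (100, 100).
scaled = [apply_cm((192, 0, 0, 192, 100, 100), p) for p in unit_square]

# The matrix from the question: move to (100, 100) without scaling.
moved_only = [apply_cm((1, 0, 0, 1, 100, 100), p) for p in unit_square]

print(scaled)      # a 192 x 192 point square starting at (100, 100)
print(moved_only)  # still only 1 x 1 point, just moved
```

With 1 0 0 1 100 100 cm the whole image covers a single point of the page, which is why nothing seems to appear.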

To see a JPEG inserted using a hybrid text approach (after the text is prepared), see https://stackoverflow.com/a/75710613/10802527

So you need to inject the image pixels using any method that works in your editor. For RGB images, importing a JPG is simplest. HOWEVER, JPG is not an ideal format for hand-writing into a PDF, as it is pure binary and most text editors cannot handle pure binary input. So MP4 video and JPEG images need to be converted into a text-safe format such as textual hex (00 01 02 03 and so on), so that all 256 byte values can be written in an ANSI editor such as Notepad.
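A minimal sketch of that conversion, assuming the hex text is meant for a stream that declares /Filter /ASCIIHexDecode. The function name to_ascii_hex and the 64-character line width are choices made for this example only.

```python
import binascii


def to_ascii_hex(data, line_width=64):
    """Return data as hex text for a /Filter /ASCIIHexDecode stream."""
    hex_text = binascii.hexlify(data).decode("ascii").upper()
    lines = [hex_text[i:i + line_width] for i in range(0, len(hex_text), line_width)]
    # '>' is the end-of-data marker of the ASCIIHexDecode filter.
    return "\n".join(lines) + ">"


print(to_ascii_hex(bytes([0x00, 0x01, 0x02, 0x03])))  # 00010203>
```

The /Length of such a stream then counts the hex characters (including the closing >), not the decoded bytes.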

In byte terms, an 8-bit gray pixel is 00 for black and FF for white; for RGB that is 000000 and FFFFFF.

Right, so how do you write those pixels? First, set a pointer to an object in the page resources, with an entry like <</XObject <</Img0 6 0 R>>>>.

Then 6 0 obj needs declarations such as the number of pixels, the colors and the encoding type:

6 0 obj <</Type/XObject/Subtype/Image/ColorSpace/Device...

In your example we can see:

16 0 obj 
  <<
    /Type /XObject
    /Subtype /Image
    /Height 3
    /Width 3
    /BitsPerComponent 1
    /Length 9
    /ColorSpace /DeviceGray
  >>
stream
 ­ 
endstream

Note that, with nothing visible between stream and endstream, this still produces an odd image. Why? The result looks like it should be

001101101

[Image of the rendered result]

The answer is that this is what is really there as a binary stream; it is just not visible in an ANSI editor like Notepad.

The characters are hex 20 AD A0, where 20 is a blank space, AD is binary 10101101 (a soft hyphen) and A0 is binary 10100000 (a non-breaking space).

So those bytes look like

00100000
10101101
10100000

Each row of the image starts on a new byte, so a 3-pixel-wide image uses only the first 3 bits of each byte: 001, 101 and 101. Let's test that by widening the image to 6 pixels. As expected, we now get this:

[Image of the rendered result]
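The decoding described above can be sketched as follows, assuming 1 bit per component and rows that each start on a byte boundary. The function name rows_from_bytes is made up for this example.

```python
def rows_from_bytes(data, width, height):
    """Decode 1-bit-per-pixel sample data into one '0'/'1' string per row."""
    bytes_per_row = (width + 7) // 8  # each row starts on a byte boundary
    rows = []
    for r in range(height):
        row_bytes = data[r * bytes_per_row:(r + 1) * bytes_per_row]
        bits = "".join(format(b, "08b") for b in row_bytes)
        rows.append(bits[:width])  # padding bits at the end of a row are ignored
    return rows


data = bytes([0x20, 0xAD, 0xA0])  # space, soft hyphen, non-breaking space
print(rows_from_bytes(data, 3, 3))  # ['001', '101', '101']
print(rows_from_bytes(data, 6, 3))  # ['001000', '101011', '101000']
```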

So the core issue is that, in a text format, typed 0 and 1 characters are taken as literal bytes, NOT as bits, and the real bits are not visible, which is not convenient for handling imagery. At this level we need to start using an encoding such as ASCII hex (/ASCIIHexDecode).

Answer

Image data in a PDF is a bit stream stored as a byte stream, and you want 111000111. As rows that is

111
000
111

Each row is padded to a whole byte, thus it is

11100000
00000000
11100000
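The padding step can be sketched like this. The function name pack_rows is hypothetical; it takes rows written as '0'/'1' text and returns the bytes that belong in the stream.

```python
def pack_rows(rows):
    """Pack rows of '0'/'1' characters into 1-bit samples, padding each row to a byte."""
    out = bytearray()
    for row in rows:
        padded = row + "0" * (-len(row) % 8)  # fill the last byte of the row with zeros
        for i in range(0, len(padded), 8):
            out.append(int(padded[i:i + 8], 2))
    return bytes(out)


data = pack_rows(["111", "000", "111"])
print(data.hex().upper())  # E000E0
print(len(data))           # 3
```

So the stream holds 3 bytes, and with /DeviceGray and the default decoding a 1 bit is white and a 0 bit is black, which gives the black line through the middle.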

The easy way is to use this or similar:

stream
àà
endstream

where there is an invisible 00 (NUL) character between the two à characters (à is byte E0).

The result will be pixel perfect, but it shows that black 00000000 characters are not easy to write in ANSI text; they ideally need hex coding. One trick for RGB is to use a space for black, as [space] = \x20, which is as dark as a "safe" ANSI text character can be in its byte form. So a white pixel as 8-bit RGB would be ÿÿÿ and a blackish pixel would be three spaces.

[Image of the rendered result]

Hence, for a similar BUT blackish result as above, we could use

    /Height 3
    /Width 3
    /BitsPerComponent 8
    /Length 27
    /ColorSpace /DeviceRGB
  >>
stream
ÿÿÿÿÿÿÿÿÿ         ÿÿÿÿÿÿÿÿÿ
endstream
endobj
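To check the byte count of that RGB stream, here is a small sketch. The names WHITE and BLACKISH are made up for this example, and it assumes the file is saved in a single-byte ANSI encoding such as Latin-1.

```python
WHITE = b"\xff\xff\xff"     # shows as three y-with-diaeresis characters in an ANSI editor
BLACKISH = b"\x20\x20\x20"  # three spaces: a dark gray, not true black

# Three rows of three RGB pixels: white, blackish, white.
samples = WHITE * 3 + BLACKISH * 3 + WHITE * 3

print(len(samples))  # 27, matching /Length 27
print(samples.hex())

# The same text saved as UTF-8 would be longer, because each FF character becomes two bytes.
as_utf8 = samples.decode("latin-1").encode("utf-8")
print(len(as_utf8))  # 45
```

This is why the trick only works when the editor writes one byte per character.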

Answer 2

Score: 1

A few hints:

a) You need Do, not DO.

b) If it's one bit per component, and nine pixels in three rows of three, there will be three bytes in the image (each row is padded to a whole byte), not nine bytes. '1' and '0' are 8-bit characters, not bits.

c) You don't need ProcSet these days.

d) To see your image on the screen, you'll want something like 100 0 0 100 100 100 cm to scale it up so it's visible. 1 0 0 1 100 100 cm only moves it and leaves it one unit in size.

e) Your page needs a /MediaBox with four numbers, for example [0 0 500 500].
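The hints above can be pulled together in a short script. This is only a sketch under a few assumptions that are not in the question: the function name build_pdf is made up, the image is scaled to 192 points, and a cross-reference table with startxref is added because many viewers expect one.

```python
def build_pdf():
    """Assemble a one-page PDF that shows a 3x3, 1-bit image scaled to 192 points."""
    image_data = bytes([0xE0, 0x00, 0xE0])  # rows 111 / 000 / 111, each padded to a byte
    content = b"q 192 0 0 192 100 100 cm /Im1 Do Q"

    # Object bodies in order; object numbers are 1..5.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Count 1 /Kids [3 0 R] >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 500 500] "
        b"/Contents 4 0 R /Resources << /XObject << /Im1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
        b"<< /Type /XObject /Subtype /Image /Width 3 /Height 3 "
        b"/BitsPerComponent 1 /ColorSpace /DeviceGray /Length %d >>\n"
        b"stream\n%s\nendstream" % (len(image_data), image_data),
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj", needed by the xref table
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"          # entry for the free object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off     # every xref entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


pdf = build_pdf()
print(len(pdf), "bytes")
```

Writing the returned bytes to a file opened in binary mode ("wb") gives something to try in a viewer. The image stream contains a 00 byte, so the file cannot be round-tripped through a plain text editor.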

huangapple
  • Posted on May 6, 2023, 20:38:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76188934.html