Why is an image (NumPy array) converted to a string before being encoded into a TFRecord file?

Question

Recently I've been working on encoding images (say, in a bitmap format) into a TFRecord file.

I'm wondering about the reason behind one step: why do we need to convert the NumPy array data to a string type before it is written to the TFRecord file?

For example:

from PIL import Image
...
npimg = np.array(Image.open(img_path))
# My question:
# why do we need to convert the numpy array img to a string?
img_raw = npimg.tostring()
...
# later on, write img_raw to tf.train.Example

Here's the full code example that I found in the blog post "Tfrecords Guide".

from PIL import Image
import numpy as np
import skimage.io as io
import tensorflow as tf


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

tfrecords_filename = 'pascal_voc_segmentation.tfrecords'

# TF 1.x API; in TF 2.x the equivalent is tf.io.TFRecordWriter
writer = tf.python_io.TFRecordWriter(tfrecords_filename)

original_images = []
filename_pairs = [
     ('/path/to/example1.jpg',
      '/path/to/example2.jpg'),
     ...,
     ('/path/to/exampleN.jpg',
      '/path/to/exampleM.jpg'),
]

for img_path, annotation_path in filename_pairs:
    
    # read data into numpy array
    img = np.array(Image.open(img_path))
    annotation = np.array(Image.open(annotation_path))
    
    height = img.shape[0]
    width = img.shape[1]
    
    original_images.append((img, annotation))

    # My question:
    # why do we need to convert the numpy array img to a string?
    img_raw = img.tostring()
    annotation_raw = annotation.tostring()
    
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(height),
        'width': _int64_feature(width),
        'image_raw': _bytes_feature(img_raw),
        'mask_raw': _bytes_feature(annotation_raw)}))
    
    writer.write(example.SerializeToString())

writer.close()

Any hints would be appreciated. Thanks in advance.

Answer 1

Score: 2

To read data efficiently, it can be helpful to serialize your data and store it in a set of files (100-200 MB each) that can each be read linearly. This is especially true if the data is being streamed over a network. Serialized files can also be useful for caching any data preprocessing.
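
As a rough illustration of what "serializing" the image means in the question's code, here is a minimal sketch using only NumPy. The array below is a made-up stand-in for a decoded image, and `tobytes()` is used in place of `tostring()`, which NumPy has deprecated in favor of the identically behaving `tobytes()`. The sketch also keeps the channel count, which is an assumption beyond the original code (it stores only height and width).

```python
import numpy as np

# Hypothetical stand-in for a decoded image: 2 rows x 3 columns x 3 channels, uint8.
img = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# Serialize the pixel buffer to a flat Python bytes object.
# tobytes() is the current name for the deprecated tostring().
img_raw = img.tobytes()

# The byte string carries no shape or dtype information,
# so reading it back yields a flat 1-D array.
flat = np.frombuffer(img_raw, dtype=np.uint8)

# Restoring the image requires the dimensions that were saved alongside the bytes.
height, width, channels = img.shape
restored = flat.reshape(height, width, channels)

print(type(img_raw).__name__, len(img_raw), restored.shape)
```

This is why the question's code writes 'height' and 'width' next to 'image_raw': a `tf.train.Feature` holds only a bytes list, a float list, or an int64 list, so a multi-dimensional array has to be flattened into one of those forms, and the shape must be stored separately to undo the flattening.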

Edit:
This comes in handy when you are transferring the image to a server (tensorflow-server). There you have to send the data as a serialized string, because some media are made for streaming text. You never know: some protocols may interpret your binary data as control characters (like a modem), or your binary data could be corrupted because the underlying protocol might think that you've entered a special character combination (like how FTP translates line endings).

So to get around this, people encode the binary data into characters. Base64 is one of these types of encodings.

Why 64?
Because you can generally rely on the same 64 characters being present in many character sets, and you can be reasonably confident that your data's going to end up on the other side of the wire uncorrupted.
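
To make the Base64 idea concrete, here is a small sketch using Python's standard `base64` module. The four input bytes are an arbitrary example chosen to include control characters and a non-ASCII value. Note that this is a separate step from what the question's code does: `tostring()`/`tobytes()` returns the raw bytes unchanged, and a protobuf bytes field such as `BytesList` can hold arbitrary bytes, so Base64 would only come into play if the data later passes through a text-only channel.

```python
import base64

# Arbitrary example bytes: NUL, line feed, carriage return, and 0xFF.
# A text-oriented channel could mangle any of these.
raw = bytes([0, 10, 13, 255])

# Base64 maps every 3 input bytes to 4 characters from a 64-character alphabet.
encoded = base64.b64encode(raw)

# Decoding recovers the original bytes exactly.
decoded = base64.b64decode(encoded)

print(encoded, decoded == raw)
```

The cost of this safety is size: the encoded form is roughly a third larger than the raw bytes.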

Posted by huangapple on January 4, 2020, 00:35:58
Source: https://go.coder-hub.com/59582111.html