Fast read/unpacking of float32 from int16 in Python
Question
Say I have a Python script which reads some binary data, packed as int16. I want to convert this data to float32 as fast as possible.
Currently I am doing this for each file:
data = np.fromfile(fid, 'int16').astype('float32')
Unfortunately, the astype call takes about as long as the fromfile call (several seconds each in my case). Is there a faster way of doing this?
Maybe by initializing a zero array and then using np.frombuffer to populate it two bytes at a time?
Please advise, thanks.
Answer 1
Score: 1
You can try an alternative approach: read and convert the data in smaller chunks, writing each chunk into a preallocated float32 array.
Here's an example:
import os
import numpy as np

chunk_size = 1000  # Number of int16 elements to read per chunk; larger values mean fewer reads
file_size = os.path.getsize(file)  # `file` is the path to the binary file
float32_array = np.empty(file_size // 2, dtype=np.float32)  # Divide by 2 since int16 takes 2 bytes
pos = 0  # Number of elements converted so far
with open(file, 'rb') as f:  # Keep the file open so each read continues where the previous one ended
    while pos < float32_array.size:
        chunk = np.fromfile(f, dtype=np.int16, count=chunk_size)
        if chunk.size == 0:  # Unexpected end of file
            break
        float32_array[pos:pos + chunk.size] = chunk  # Slice assignment casts int16 to float32
        pos += chunk.size
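For comparison, the np.frombuffer idea raised in the question can be combined with chunked reading. The following is only a minimal sketch under assumptions: the function name load_int16_as_float32 and its chunk_elems parameter are made up for illustration, the file is assumed to hold raw int16 values in native byte order with no header, and whether this is actually faster than fromfile followed by astype depends on the disk and the machine, so it is worth timing on real data.

```python
import os
import numpy as np

def load_int16_as_float32(path, chunk_elems=1 << 20):
    """Read a raw int16 file into a preallocated float32 array, chunk by chunk.

    Hypothetical helper for illustration; assumes native byte order and no header.
    """
    n_total = os.path.getsize(path) // 2           # int16 takes 2 bytes
    out = np.empty(n_total, dtype=np.float32)      # final array, allocated once
    pos = 0                                        # elements written so far
    with open(path, "rb") as f:
        while pos < n_total:
            # Read the raw bytes for up to chunk_elems elements.
            raw = f.read(2 * min(chunk_elems, n_total - pos))
            if len(raw) < 2:
                break                              # unexpected end of file
            # View the bytes as int16 without copying them.
            ints = np.frombuffer(raw, dtype=np.int16, count=len(raw) // 2)
            # Slice assignment converts int16 to float32 directly into `out`.
            out[pos:pos + ints.size] = ints
            pos += ints.size
    return out[:pos]
```

Calling load_int16_as_float32(path) should give the same values as np.fromfile(path, 'int16').astype('float32'), but the only int16 data held in memory at any time is the current chunk rather than the whole file.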