How do I read doubles from a binary file in a loop?
Question
Simple question, but I couldn't find a helpful answer on the web. I have a file created with C++, where I first output a std::size_t k and then write 2 * k doubles.
I need to first read the std::size_t k in Python and then iterate in a loop from 0 to k - 1, reading two doubles x, y in each iteration and doing something with them:
    with open('file', 'r') as f:
        fig, ax = pyplot.subplots()
        k = numpy.fromfile(f, numpy.uint64)[0] # does not work
        for j in range(0, k):
            # get double x and y somehow
            x = numpy.fromfile(f, numpy.double)[0]
            y = numpy.fromfile(f, numpy.double)[0]
            ax.scatter(x = x, y = y, c = 0)
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)
The value I read in k is 3832614067495317556, but it should be 4096. And at the point where I read x, I immediately get an index out of range exception.
Answer 1
Score: 1
Your C++ code is wrong. The standard << operator performs formatted text output, so it writes a value's decimal digits rather than its raw bytes.
Use the following:
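    #include <cstddef>
    #include <fstream>

    int main()
    {
        // Open in binary mode so no newline translation is applied.
        std::ofstream out("file.dat", std::ios::binary);
        std::size_t k = 4096;
        // write() copies the raw bytes of k; << would write the text "4096".
        out.write(reinterpret_cast<char*>(&k), sizeof k);
        double a = 1.1;
        for (int i = 0; i < 8; ++i) {
            auto b = a*i;
            out.write(reinterpret_cast<char*>(&b), sizeof b);
        }
    }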
See the example at the bottom of https://en.cppreference.com/w/cpp/io/basic_ofstream, where I grabbed this from.
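The diagnosis can be checked against the number reported in the question. The sketch below reinterprets that value as its eight raw bytes, assuming a little-endian 64-bit layout (typical for x86-64, but an assumption about the asker's machine):

```python
import struct

# The value the question reports for k.
bogus_k = 3832614067495317556

# '<Q' = little-endian unsigned 64-bit integer (assumed byte order).
raw = struct.pack('<Q', bogus_k)
print(raw)  # b'40960.05'
```

The first four bytes are the characters "4096", and the remaining bytes look like the start of a double printed as text. That is consistent with the file having been written with << rather than write().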
(The rest of this answer covers improvements to and mistakes in the Python code, even though they are not the essential problem.)
Open the file in binary mode (necessary on Windows), read just 1 item for the size, and read all the remaining doubles in one go using the offset parameter. Also verify that std::size_t is the same size as uint64 (8 bytes) on the relevant machine(s).
    import numpy as np
    from matplotlib import pyplot

    with open('file.dat', 'rb') as f:
        fig, ax = pyplot.subplots()
        # count in items, offset in bytes
        k = np.fromfile(f, np.uint64, count=1)[0]
        # offset is measured from the current file position,
        # so rewind before skipping the 8-byte size field
        f.seek(0)
        xy = np.fromfile(f, np.double, offset=8)
        for x, y in zip(xy[::2], xy[1::2]):
            ax.scatter(x = x, y = y, c = 0)
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)
If you're not doing anything special with the scatter plots, and just want to plot all data in one scatter plot, you don't need a for-loop:
    import numpy as np
    from matplotlib import pyplot

    with open('file.dat', 'rb') as f:
        fig, ax = pyplot.subplots()
        # count in items, offset in bytes
        k = np.fromfile(f, np.uint64, count=1)[0]
        # offset is measured from the current file position,
        # so rewind before skipping the 8-byte size field
        f.seek(0)
        xy = np.fromfile(f, np.double, offset=8)
        # len(xy) // 2 keeps the length an integer in Python 3
        ax.scatter(x = xy[::2], y = xy[1::2], c = np.zeros(len(xy) // 2))
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)
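A cheap way to catch a mis-written file early is to compare its size with what the header implies. The following is a minimal sketch, not part of the original answer: read_points is a hypothetical helper, and it assumes an 8-byte std::size_t, 8-byte doubles, and native byte order. The demo file is written from Python only so that the example is self-contained.

```python
import os
import tempfile

import numpy as np


def read_points(path):
    """Read a uint64 count k followed by k (x, y) pairs of float64."""
    with open(path, 'rb') as f:
        k = int(np.fromfile(f, dtype=np.uint64, count=1)[0])
        # 8 bytes for k, then 2 doubles of 8 bytes each per point.
        expected = 8 + 16 * k
        actual = os.path.getsize(path)
        if actual != expected:
            raise ValueError(
                f"expected {expected} bytes for k={k}, found {actual}")
        # The file position is already past the size field here.
        xy = np.fromfile(f, dtype=np.float64, count=2 * k)
    return xy.reshape(k, 2)


# Demo: write 3 points in the same layout, then read them back.
path = os.path.join(tempfile.mkdtemp(), 'file.dat')
with open(path, 'wb') as f:
    np.array([3], dtype=np.uint64).tofile(f)
    np.arange(6, dtype=np.float64).tofile(f)

points = read_points(path)
print(points.shape)  # (3, 2)
```

With a file written as text via <<, the decoded k would be enormous and the size check would raise instead of failing later with an index error. Reshaping to (k, 2) also gives the x and y columns as points[:, 0] and points[:, 1].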