May 11, 2023, 19:56:41

How do I read doubles from a binary file in a loop?

Question

Simple question, but I can't find a helpful answer on the web. I have a file created with C++, where I first output a std::size_t k and then write 2 * k doubles.

I need to first read the std::size_t k in Python and then iterate in a loop from 0 to k - 1, reading two doubles x, y in each iteration and doing something with them:

with open('file', 'r') as f:
    fig, ax = pyplot.subplots()
    k = numpy.fromfile(f, numpy.uint64)[0] # does not work
    for j in range(0, k):
        # get double x and y somehow
        x = numpy.fromfile(f, numpy.double)[0]
        y = numpy.fromfile(f, numpy.double)[0]
        ax.scatter(x = x, y = y, c = 0)
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)

The value I read in k is 3832614067495317556, but it should be 4096. And at the point where I read x, I immediately get an index out of range exception.

Answer 1

Score: 1

Your C++ code is wrong. The standard << operator performs formatted text output, so it is not suitable for writing binary data.
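
As a quick check of that diagnosis, the bogus value from the question can be turned back into the eight bytes it was read from. This is only a minimal sketch: it assumes a little-endian machine (as on x86), and `bogus_k` is simply the number quoted in the question.

```python
# The value the question reports for k, instead of the expected 4096.
bogus_k = 3832614067495317556

# Reinterpret the integer as the 8 raw bytes that were consumed from the file
# (assumption: little-endian byte order).
raw = bogus_k.to_bytes(8, "little")
print(raw)  # b'40960.05'
```

The bytes are the characters "4096" followed directly by what looks like the start of a double written as text, which is consistent with the file having been produced by formatted output without separators.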

Write the file with unformatted binary output instead:

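#include <cstddef>
#include <fstream>

int main() {
    // Open in binary mode so no newline translation happens on Windows
    std::ofstream out("file.dat", std::ios::binary);
    std::size_t k = 4096;
    // write() copies the raw bytes of k, unlike the formatted << operator
    out.write(reinterpret_cast<const char*>(&k), sizeof k);
    double a = 1.1;
    // Demo only: this writes 8 doubles rather than 2 * k
    for (int i = 0; i < 8; ++i) {
        auto b = a * i;
        out.write(reinterpret_cast<const char*>(&b), sizeof b);
    }
}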

See the example at the bottom of https://en.cppreference.com/w/cpp/io/basic_ofstream, where I grabbed this from.

(The rest of this answer covers mistakes and possible improvements in the Python code, even though they are not the root problem.)

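Open the file in binary mode (necessary on Windows), read just 1 item for the size, and read all the remaining doubles in one go using the offset parameter. Without count, np.fromfile reads to the end of the file, which is why the later reads in the question return empty arrays and indexing them fails. Also verify that std::size_t is 64 bits wide, and so matches uint64, on the relevant machine(s).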

import numpy as np
from matplotlib import pyplot

with open('file.dat', 'rb') as f:
    fig, ax = pyplot.subplots()
    # count is in items, offset is in bytes
    k = np.fromfile(f, np.uint64, count=1)[0]
    # Rewind so that offset=8 skips exactly the 8-byte size header
    f.seek(0)
    xy = np.fromfile(f, np.double, offset=8)
    for x, y in zip(xy[::2], xy[1::2]):
        ax.scatter(x=x, y=y, c=0)
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)
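
The advice to verify the width of std::size_t can be checked from Python as well. The following is a rough sketch using the standard-library `ctypes` module; it reports the width for the machine running the script, so it only helps under the assumption that the C++ program was built for the same platform. The names `size_t_bytes` and `header_dtype_name` are made up for illustration.

```python
import ctypes

# Width in bytes of the C size_t type for the Python interpreter's platform.
# Assumption: the C++ program that wrote the file used the same ABI.
size_t_bytes = ctypes.sizeof(ctypes.c_size_t)

# Map the width to the name of the matching unsigned NumPy dtype.
header_dtype_name = {4: "uint32", 8: "uint64"}[size_t_bytes]
print(size_t_bytes, header_dtype_name)
```

The resulting name can be passed as the dtype when reading k, and the number of header bytes to skip would then be `size_t_bytes` rather than a hard-coded 8.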

If you're not doing anything special with the scatter plots, and just want to plot all data in one scatter plot, you don't need a for-loop:

import numpy as np
from matplotlib import pyplot

with open('file.dat', 'rb') as f:
    fig, ax = pyplot.subplots()
    # count is in items, offset is in bytes
    k = np.fromfile(f, np.uint64, count=1)[0]
    # Rewind so that offset=8 skips exactly the 8-byte size header
    f.seek(0)
    xy = np.fromfile(f, np.double, offset=8)
    # Integer division: np.zeros needs an int, and len(xy) / 2 is a float
    ax.scatter(x=xy[::2], y=xy[1::2], c=np.zeros(len(xy) // 2))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
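
If the points really do need to be handled one pair at a time, as the question describes, the loop can also be written with the standard-library `struct` module instead of NumPy. The sketch below is not part of the original answer. It assumes a little-endian file with an 8-byte count followed by 2 * k doubles, and the sample `points`, the temporary path, and the `read_back` list are invented for the example.

```python
import os
import struct
import tempfile

# Hypothetical sample data: k points stored as (x, y) pairs.
points = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]

# Write a file in the assumed layout: one uint64 count, then 2 * k doubles.
path = os.path.join(tempfile.mkdtemp(), "file.dat")
with open(path, "wb") as f:
    f.write(struct.pack("<Q", len(points)))
    for x, y in points:
        f.write(struct.pack("<dd", x, y))

# Read it back: the header first, then one (x, y) pair per loop iteration.
read_back = []
with open(path, "rb") as f:
    (k,) = struct.unpack("<Q", f.read(8))
    for _ in range(k):
        chunk = f.read(16)  # two 8-byte doubles
        if len(chunk) != 16:
            raise ValueError("file ended before k pairs were read")
        x, y = struct.unpack("<dd", chunk)
        read_back.append((x, y))

print(k, read_back)
```

Checking the length of each chunk makes a truncated or mis-sized file fail with a clear error instead of an index error partway through the loop.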
