September 24, 2015 02:59:18 · go

Storing matrices in Golang in compressed binary format

Question

I am comparing Go and Python, particularly for mathematical computation. I noticed that Go has a matrix package, mat64.

For anyone who uses both Go and Python: is there a function or tool for Go's matrices that is equivalent to NumPy's savez_compressed, which stores data in the npz format (i.e. "compressed" binary, multiple matrices per file)?
Also, can Go's matrices handle string types the way NumPy does?

Answer 1

Score: 2

.npz is a NumPy-specific format. It is unlikely that Go itself would ever support it in the standard library. I also don't know of any third-party library that exists today, and a (10-second) search didn't turn one up. If you need npz specifically, go with Python + NumPy.

If you just want something similar from Go, you can use any format. Binary options in the standard library include encoding/binary and encoding/gob. Depending on what you're trying to do, you could even use a non-binary format like JSON and compress it yourself.
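As a rough sketch of the gob route, the following stores several named matrices in one gzip-compressed stream, which is loosely what savez_compressed does on the Python side. The Matrix type and the saveCompressed, loadCompressed, and roundTrip helpers are names made up for this example; they are not part of mat64 or any other library, and the resulting file is not readable by NumPy.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/gob"
	"fmt"
	"io"
)

// Matrix is a hypothetical dense, row-major matrix type used only in this sketch.
type Matrix struct {
	Rows, Cols int
	Data       []float64
}

// saveCompressed gob-encodes the named matrices and gzips the result.
func saveCompressed(w io.Writer, mats map[string]Matrix) error {
	zw := gzip.NewWriter(w)
	if err := gob.NewEncoder(zw).Encode(mats); err != nil {
		zw.Close()
		return err
	}
	// Close flushes the remaining compressed data.
	return zw.Close()
}

// loadCompressed reverses saveCompressed.
func loadCompressed(r io.Reader) (map[string]Matrix, error) {
	zr, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var mats map[string]Matrix
	if err := gob.NewDecoder(zr).Decode(&mats); err != nil {
		return nil, err
	}
	return mats, nil
}

// roundTrip saves to an in-memory buffer and loads the result back.
func roundTrip(mats map[string]Matrix) (map[string]Matrix, error) {
	var buf bytes.Buffer
	if err := saveCompressed(&buf, mats); err != nil {
		return nil, err
	}
	return loadCompressed(&buf)
}

func main() {
	in := map[string]Matrix{
		"a": {Rows: 2, Cols: 2, Data: []float64{1, 2, 3, 4}},
		"b": {Rows: 1, Cols: 3, Data: []float64{5, 6, 7}},
	}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out), out["a"].Data, out["b"].Data)
}
```

To write to disk instead of a buffer, pass the *os.File returned by os.Create as the io.Writer.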

Go doesn't have built-in matrices. The library you found is third-party, and it only handles float64 values.

However, if you just need to store strings in matrix (n-dimensional) form, you can use an n-dimensional slice. For two dimensions it looks like this: var myStringMatrix [][]string.
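A declared [][]string is nil until its rows are allocated, so a small helper is handy. This is only a sketch; newStringMatrix is a name invented for the example.

```go
package main

import "fmt"

// newStringMatrix allocates a rows x cols matrix of empty strings.
// (Hypothetical helper, not from any library.)
func newStringMatrix(rows, cols int) [][]string {
	m := make([][]string, rows)
	for i := range m {
		m[i] = make([]string, cols)
	}
	return m
}

func main() {
	m := newStringMatrix(2, 3)
	m[1][2] = "hello"
	fmt.Println(len(m), len(m[0]), m[1][2])
}
```

Each row is its own slice, so rows may end up with different lengths if you append to them; nothing enforces a rectangular shape.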

Answer 2

Score: 1

npz files are zip archives. Archiving and (optional) compression are handled by Python's zipfile module. The npz contains one npy file for each variable that you save. Any OS-level archiving tool can decompress and extract the component .npy files.

So the remaining question is: can you simulate the npy format? It isn't trivial, but it isn't difficult either. It consists of a header block that contains the shape, dtype, and order (C or Fortran) information, followed by a data block which is, effectively, a byte image of the array's data buffer.

So the header information and the data are closely linked to the NumPy array's contents. And if the variable isn't a normal array, save falls back on Python's pickle mechanism.
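To make the layout described above concrete, here is a minimal sketch of writing it from Go. It assumes version 1.0 of the npy format, a 2-D little-endian float64 array in C (row-major) order, and the usual header dictionary with the keys descr, fortran_order, and shape. The matrix type and the writeNpy, writeNpz, and npyBytes functions are made up for this example. In principle np.load should accept the output, but treat that as something to check against your NumPy version rather than a guarantee.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"strings"
)

// matrix is a hypothetical row-major float64 matrix for this sketch.
type matrix struct {
	rows, cols int
	data       []float64
}

// writeNpy writes a rows x cols row-major matrix in npy version 1.0 layout.
func writeNpy(w io.Writer, rows, cols int, data []float64) error {
	if len(data) != rows*cols {
		return fmt.Errorf("got %d values, want %d", len(data), rows*cols)
	}
	header := fmt.Sprintf("{'descr': '<f8', 'fortran_order': False, 'shape': (%d, %d), }", rows, cols)
	// Pad with spaces so that magic (6 bytes) + version (2) + length field (2)
	// + header, including its final newline, is a multiple of 64 bytes.
	const prefix = 10
	pad := (64 - (prefix+len(header)+1)%64) % 64
	header += strings.Repeat(" ", pad) + "\n"

	if _, err := w.Write([]byte("\x93NUMPY\x01\x00")); err != nil {
		return err
	}
	if err := binary.Write(w, binary.LittleEndian, uint16(len(header))); err != nil {
		return err
	}
	if _, err := io.WriteString(w, header); err != nil {
		return err
	}
	// The data block is the raw little-endian float64 values.
	return binary.Write(w, binary.LittleEndian, data)
}

// writeNpz stores each matrix as "<name>.npy" inside a zip archive.
func writeNpz(w io.Writer, mats map[string]matrix) error {
	zw := zip.NewWriter(w)
	for name, m := range mats {
		f, err := zw.Create(name + ".npy") // Create compresses with Deflate
		if err != nil {
			return err
		}
		if err := writeNpy(f, m.rows, m.cols, m.data); err != nil {
			return err
		}
	}
	return zw.Close()
}

// npyBytes returns the npy encoding as a byte slice.
func npyBytes(rows, cols int, data []float64) ([]byte, error) {
	var buf bytes.Buffer
	if err := writeNpy(&buf, rows, cols, data); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	var buf bytes.Buffer
	mats := map[string]matrix{"a": {2, 2, []float64{1, 2, 3, 4}}}
	if err := writeNpz(&buf, mats); err != nil {
		panic(err)
	}
	fmt.Println("npz size in bytes:", buf.Len())
}
```

Other dtypes, higher dimensions, and Fortran order would need a different descr string, shape tuple, and data layout; string and object arrays are considerably more involved.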

To start with, I'd suggest the CSV format. It's not binary and not fast, but everyone and his brother can generate and read it. We constantly get SO questions about reading such files with np.loadtxt or np.genfromtxt. Look at the code for np.savetxt to see how NumPy produces such files. It's pretty simple.
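On the Go side, producing such a file takes only the standard library. A minimal sketch, with matrixToCSV as a made-up helper name and a [][]float64 standing in for whatever matrix type you actually use:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// matrixToCSV renders a matrix as comma-separated text, one row per line.
// (Hypothetical helper for illustration.)
func matrixToCSV(m [][]float64) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	for _, row := range m {
		rec := make([]string, len(row))
		for j, v := range row {
			// 'g' with precision -1 uses the shortest representation
			// that parses back to the same float64.
			rec[j] = strconv.FormatFloat(v, 'g', -1, 64)
		}
		if err := w.Write(rec); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	s, err := matrixToCSV([][]float64{{1, 2.5}, {3, 4}})
	if err != nil {
		panic(err)
	}
	fmt.Print(s)
}
```

A file written this way should be readable in Python with np.loadtxt(path, delimiter=",").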

Another general-purpose choice would be JSON, using the tolist form of an array. That comes to mind because Go is Google's homegrown alternative to Python for web applications. JSON is a cross-language format based on simplified JavaScript syntax.
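A nested Go slice marshals to the same nested-list shape that tolist produces, so the two sides can exchange it directly. A small sketch, where matrixToJSON and matrixFromJSON are names invented for the example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// matrixToJSON encodes a matrix as a JSON list of lists.
// (Hypothetical helper for illustration.)
func matrixToJSON(m [][]float64) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

// matrixFromJSON decodes a JSON list of lists back into a matrix.
func matrixFromJSON(s string) ([][]float64, error) {
	var m [][]float64
	err := json.Unmarshal([]byte(s), &m)
	return m, err
}

func main() {
	s, err := matrixToJSON([][]float64{{1, 2.5}, {3, 4}})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)

	m, err := matrixFromJSON(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(m), len(m[0]), m[0][1])
}
```

One caveat: encoding/json refuses to marshal NaN and infinite values, so data containing them needs another representation.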
