2022-10-01 06:30:36 · go

Why does Base64 buffer sizing make it larger than the length of the underlying text?

Question

I am trying to encode a byte array as Base64 and am running into two issues. I can size the destination buffer with base64.StdEncoding.EncodedLen(len(text)), but I'm worried that's costly, so I wanted to see if I could do it with just len(text). Here is the code (the functions are named "Marshal" because I'm using them as a field converter during JSON marshaling):

package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

func main() {
	b := make([]byte, 60)
	_, _ = rand.Read(b)

	// Marshal Create Dst Buffer
	MarshalTextBuffer(b)

	// Marshal Convert to String
	MarshalTextStringWithBufferLen(b)

	// Marshal Convert to String
	MarshalTextStringWithDecodedLen(b)
}

func MarshalTextBuffer(text []byte) error {
	ba := base64.StdEncoding.EncodeToString(text)
	fmt.Println(ba)
	return nil
}

func MarshalTextStringWithBufferLen(text []byte) error {
	ba := make([]byte, len(text)+30) // Why does len(text) not suffice? Temporarily using '30' for now, just so it doesn't overrun.
	base64.StdEncoding.Encode(ba, text)
	fmt.Println(ba)
	return nil
}

func MarshalTextStringWithDecodedLen(text []byte) error {
	ba := make([]byte, base64.StdEncoding.EncodedLen(len(text)))
	base64.StdEncoding.Encode(ba, text)
	fmt.Println(ba)
	return nil
}

Here's the output:

IL5CW8T9WSgwU5Hyi9JsLLkU/EcydY6pG2fgLQJsMaXgxhSh74RTagzr6b9yDeZ8CP4Azc8xqq5/+Cgk
[73 76 53 67 87 56 84 57 87 83 103 119 85 53 72 121 105 57 74 115 76 76 107 85 47 69 99 121 100 89 54 112 71 50 102 103 76 81 74 115 77 97 88 103 120 104 83 104 55 52 82 84 97 103 122 114 54 98 57 121 68 101 90 56 67 80 52 65 122 99 56 120 113 113 53 47 43 67 103 107 0 0 0 0 0 0 0 0 0 0]
[73 76 53 67 87 56 84 57 87 83 103 119 85 53 72 121 105 57 74 115 76 76 107 85 47 69 99 121 100 89 54 112 71 50 102 103 76 81 74 115 77 97 88 103 120 104 83 104 55 52 82 84 97 103 122 114 54 98 57 121 68 101 90 56 67 80 52 65 122 99 56 120 113 113 53 47 43 67 103 107]

Why does the middle one, MarshalTextStringWithBufferLen, require extra padding?

Is base64.StdEncoding.EncodedLen a costly function? I can solve the problem with the bottom function, but I worry about the cost.

Answer 1

Score: 1

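Base64 encoding stores binary data (8 bits per byte) as text that carries only 6 bits per output byte, so every 3 input bytes are encoded as 4 output bytes (3x8 = 4x6). The len(text) + 30 in your code is therefore wrong: the required size is len(text)*4/3 when len(text) is divisible by 3, and otherwise, with the padding that StdEncoding adds, it is rounded up to the next multiple of 4. For your 60 input bytes that is 80, so the 90-byte buffer leaves 10 unused zero bytes at the end, which is what your second output line shows. For readability and to avoid bugs, you should use base64.StdEncoding.EncodedLen() to get the length.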

If you look at the code for base64.StdEncoding.EncodedLen, you will see that it is a small integer calculation, so it is as fast as doing the calculation yourself, especially as it is likely to be inlined.