Use base64.StdEncoding or base64.RawStdEncoding to decode base64 string in Go
Question
As we know, Go offers two encodings for decoding a base64 string: base64.StdEncoding and base64.RawStdEncoding. How do I pick the right one to decode a given base64 string?
If the wrong encoding is used, for example RawStdEncoding to decode a StdEncoding string, the error illegal base64 data at input byte xxx is returned.
Per the documentation:
const (
	StdPadding rune = '=' // Standard padding character
	NoPadding  rune = -1  // No padding
)
> RawStdEncoding is the standard raw, unpadded base64 encoding, as defined in RFC 4648 section 3.2. This is the same as StdEncoding but omits padding characters.
Should we distinguish them by checking whether the string ends with StdPadding? Code snippet:
lastByte := s[len(s)-1:]
if lastByte == string(base64.StdPadding) {
	base64.StdEncoding.DecodeString(s)
} else {
	base64.RawStdEncoding.DecodeString(s)
}
Is that an elegant way to do it, or am I missing something? What is an elegant way to decode a base64 string?
Update:
Maybe one crude way to do it is through error checking, as below:
rawByte, err := base64.StdEncoding.DecodeString(s)
if err != nil {
	rawByte, err = base64.RawStdEncoding.DecodeString(s)
}
Answer 1
Score: 4
> As we know, there are two methods to decode base64 string in go
> base64.StdEncoding or base64.RawStdEncoding.
There's also base64.URLEncoding, which uses the characters - and _ as substitutes for the URL-unsafe base64 characters + and /.
> Should we distinguish them by checking the end of padding is
> StdPadding or not? code snippet
This won't work. There is a 1 in 3 chance that a base64 encoding will have no visible padding:
b := []byte("abc123") // len(b) % 3 == 0 - no padding
fmt.Println(base64.StdEncoding.EncodeToString(b)) // YWJjMTIz
fmt.Println(base64.RawStdEncoding.EncodeToString(b)) // YWJjMTIz
https://play.golang.org/p/LMtIHlyXdn7
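To see where the 1-in-3 figure comes from, here is a minimal sketch that counts the padding characters StdEncoding emits for inputs of different lengths. The helper name paddingCount is made up for this illustration, and the figure assumes input lengths are spread evenly modulo 3.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// paddingCount is a hypothetical helper (not part of encoding/base64).
// It returns how many '=' characters StdEncoding appends when encoding
// n zero bytes. Zero bytes encode to 'A', so every '=' is padding.
func paddingCount(n int) int {
	e := base64.StdEncoding.EncodeToString(make([]byte, n))
	return strings.Count(e, "=")
}

func main() {
	// Padding depends only on the input length modulo 3.
	for n := 1; n <= 6; n++ {
		fmt.Printf("len=%d padding=%d\n", n, paddingCount(n))
	}
}
```

Whenever the count is zero, the StdEncoding and RawStdEncoding outputs are identical, so a trailing-character check cannot tell them apart.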
So how do you tell them apart and determine which encoding was used?
Yes, you can double-decode as in your updated question:
rawByte, err := base64.StdEncoding.DecodeString(s)
if err != nil {
	rawByte, err = base64.RawStdEncoding.DecodeString(s)
}
There are some tricks you can employ to make some educated guesses. For example:
e := base64.StdEncoding.EncodeToString(b) // always produces a length that is a multiple of 4
if len(e)%4 != 0 {
	// cannot be base64.StdEncoding - so try base64.RawStdEncoding?
}
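If the input really can arrive in either form, one alternative to decoding twice is to strip any trailing padding and always decode with RawStdEncoding. The sketch below is only an illustration under that assumption: decodeStdEither is a hypothetical helper, not something provided by encoding/base64, and it is deliberately lenient (it also accepts strings with the wrong number of = characters).

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeStdEither is a hypothetical helper (not part of encoding/base64).
// It accepts standard-alphabet base64 with or without '=' padding by
// trimming trailing '=' characters and decoding with RawStdEncoding.
func decodeStdEither(s string) ([]byte, error) {
	return base64.RawStdEncoding.DecodeString(strings.TrimRight(s, "="))
}

func main() {
	// Same payloads, padded and unpadded.
	for _, s := range []string{"YWJjMTIz", "YWJjMQ==", "YWJjMQ"} {
		b, err := decodeStdEither(s)
		fmt.Printf("%q -> %q, err=%v\n", s, b, err)
	}
}
```

This only covers the padding question; input in the URL alphabet would still need base64.RawURLEncoding.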
Answer 2
Score: 1
If you get illegal base64 data at input byte ... then:
- you either used the wrong base64 decoder, or
- there's more data after the base64 string that must be stripped before invoking the decoder, or
- the input is not base64.
> Should we distinguish them by checking the end of padding is StdPadding or not?
No. Just as you know that the data is base64-encoded at all, you should also know exactly how it is encoded and use either base64.StdEncoding or base64.RawStdEncoding, not both. You don't guess these things; you simply use the decoder that corresponds to the encoding used by the sender.
Base64 encoding can differ by:
- padded/unpadded (no =s at the end)
- standard (+, /) or URL (-, _) alphabet
- with/without newlines (e.g. MIME splits lines at 76 characters, PEM at 64)
You can visually inspect the encoded string to guess the encoding scheme. But note that padding is not always present: it depends on whether the length of the source data is a multiple of 3, since each group of 3 bytes is encoded as four 6-bit characters.