Is base64 encoding required when sending an email with a PDF attachment?
Question
I want to send an email with a PDF attachment, and I found a perfect example here:
https://zetcode.com/golang/email-smtp/
It works, but I didn't see why base64 encoding was necessary, so I omitted it and modified the headers. The new BuildMail function looks like this:
func BuildMail(mail Mail) []byte {
	var buf bytes.Buffer
	buf.WriteString(fmt.Sprintf("From: %s\r\n", mail.Sender))
	buf.WriteString(fmt.Sprintf("To: %s\r\n", strings.Join(mail.To, ";")))
	buf.WriteString(fmt.Sprintf("Subject: %s\r\n", mail.Subject))
	boundary := "my-boundary-779"
	buf.WriteString("MIME-Version: 1.0\r\n")
	buf.WriteString(fmt.Sprintf("Content-Type: multipart/mixed; boundary=%s\n",
		boundary))
	buf.WriteString(fmt.Sprintf("\r\n--%s\r\n", boundary))
	buf.WriteString("Content-Type: text/plain; charset=\"utf-8\"\r\n")
	buf.WriteString(fmt.Sprintf("\r\n%s", mail.Body))
	buf.WriteString(fmt.Sprintf("\r\n--%s\r\n", boundary))
	buf.WriteString("Content-Type: application/pdf\r\n")
	buf.WriteString("Content-Disposition: attachment; filename=words.pdf\r\n")
	buf.WriteString("Content-ID: <words.pdf>\r\n\r\n")
	data := readFile("words.pdf")
	buf.Write(data)
	buf.WriteString(fmt.Sprintf("\r\n--%s", boundary))
	buf.WriteString("--")
	return buf.Bytes()
}
But running this code results in an empty PDF attachment being delivered. So must an attachment be base64-encoded when it is sent via SMTP? Why?
Answer 1
Score: 3
No, base-64 encoding specifically is not required. Some encoding (out of a limited set) is required, though, because SMTP is a "7-bit-clean" protocol: it is specified to handle bytes whose values lie in the US-ASCII character set, which only defines codes in the range [0..127]. See the spec, which gives the allowed range as [1..127].
PDF is a binary format, so a PDF document may contain any byte value in [0..255], including values outside the range [1..127]. Precisely because of that, any payload carried in an SMTP message (not only "attachments" but also plain human-readable text) must be encoded in a way that produces output composed only of bytes in the range [1..127], or a narrower one.
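As a rough illustration of the 7-bit constraint, here is a minimal Go sketch that checks whether a payload stays within [1..127]. The helper name isSevenBitClean and the sample bytes are made up for this example; they are not part of the question's code or of any library.

```go
package main

import "fmt"

// isSevenBitClean reports whether every byte in data lies in [1..127],
// the range cited above for message content. (Hypothetical helper.)
func isSevenBitClean(data []byte) bool {
	for _, b := range data {
		if b == 0 || b > 127 {
			return false
		}
	}
	return true
}

func main() {
	text := []byte("Hello, world!\r\n")
	// Illustrative bytes resembling the start of a PDF file: an ASCII header
	// line followed by a comment line containing high-bit bytes.
	pdfLike := []byte{'%', 'P', 'D', 'F', '-', '1', '.', '7', '\n', '%', 0xE2, 0xE3, 0xCF, 0xD3, '\n'}

	fmt.Println(isSevenBitClean(text))    // true
	fmt.Println(isSevenBitClean(pdfLike)) // false
}
```

A raw PDF written straight into the message body, as in the question, would generally fail such a check.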
Base-64 has this property, but so do quoted-printable encoding, base36, UTF-7 and basically any other encoding, already invented or one you come up with yourself, as long as it has that key property.
Of course, there is then the question of using an encoding that the intended recipients understand (note that mail transport agents are blissfully unaware of how you encode your payloads, as long as the end result is 7-bit clean).
So it basically boils down to what you're after.
If you're generating mail messages intended to be read by humans using their MUAs (mail user agents), then base-64 and quoted-printable are universally supported, and it is sensible to use one of them. If the recipient is a service (a program) you control, you can use absolutely any encoding with the necessary properties.
The choice between, say, base-64 and quoted-printable is more subtle when it comes to the overhead the encoding adds (the ratio of the size of the encoded representation to the size of the original raw blob). For plain English text with minuscule bits of non-ASCII stuff, like the "proper" spelling of the word "naïve", QP easily wins over base-64, since it encodes only those "funky" characters and leaves the regular characters as is. For "mostly-binary" stuff such as PDFs and ZIP archives (documents produced by popular contemporary "office" suites are all ZIP archives in disguise), base-64 wins.