Parsing multi-part emails from maildir
Question
I need to parse multi-part email files read from a Unix maildir.
Can you please suggest an appropriate library (or libraries) to do this?
The emails are pulled in via IMAP and dumped to a maildir.
I need to parse those email files and extract all the parts, including headers, base64 attachments, HTML parts and plain-text parts.
Thanks
EDIT
I know I can search for libraries by keyword, but I would also like some opinions on quality and experience if possible.
I can deal with the actual maildir and with picking up the mail files. My concern is parsing the multipart emails (which are fed in as strings) and extracting the individual parts.
Answer 1
Score: 5
I've had some luck doing this with the github.com/jhillyerd/enmime package. Given an io.Reader r:
// Assumes the imports "fmt", "log" and "github.com/jhillyerd/enmime".
// Parse the message into an Envelope.
env, err := enmime.ReadEnvelope(r)
if err != nil {
	log.Fatal(err)
}
// Headers can be retrieved via Envelope.GetHeader(name).
fmt.Printf("From: %v\n", env.GetHeader("From"))
// Address-type headers can be parsed into a list of decoded mail.Address structs.
alist, err := env.AddressList("To")
if err != nil {
	log.Printf("parsing To header: %v", err)
}
for _, addr := range alist {
	fmt.Printf("To: %s <%s>\n", addr.Name, addr.Address)
}
fmt.Printf("Subject: %v\n", env.GetHeader("Subject"))
// The plain text body is available as Envelope.Text.
fmt.Printf("Text Body: %v chars\n", len(env.Text))
// The HTML body is stored in Envelope.HTML.
fmt.Printf("HTML Body: %v chars\n", len(env.HTML))
// Envelope.Inlines is a slice of inlined attachments.
fmt.Printf("Inlines: %v\n", len(env.Inlines))
// Envelope.Attachments contains the non-inline attachments.
fmt.Printf("Attachments: %v\n", len(env.Attachments))
Answer 2
Score: 0
Here is my example. The missing part is extracting the attachments. Please let me know if you figure that part out; I have been scratching my head over extracting attachments for weeks now.
import (
	"fmt"
	"io"
	"log"
	"net/mail"
)

// extractEmail prints the common headers and the raw body of a parsed message.
// The body is printed as-is, so multipart boundaries and base64 data stay encoded.
func extractEmail(msg *mail.Message) {
	header := msg.Header
	fmt.Println(header.Get("Date"))
	fmt.Println(header.Get("From"))
	fmt.Println(header.Get("To"))
	fmt.Println(header.Get("Cc"))
	fmt.Println(header.Get("Bcc"))
	fmt.Println(header.Get("Subject"))
	body, err := io.ReadAll(msg.Body)
	if err != nil {
		log.Printf("reading body: %v", err)
		return
	}
	// Convert to a string; printing the []byte directly would print byte values.
	fmt.Println(string(body))
}
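For the attachment step that this answer leaves open, one possible approach using only the standard library is sketched here. It is an illustration rather than a complete solution: the Part type, the extractParts function and the sampleMessage constant are hypothetical names made up for this example, and the sketch assumes a flat (non-nested) multipart message whose attachments use base64.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"log"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// Part is a hypothetical container for one decoded MIME part.
type Part struct {
	ContentType string // raw Content-Type header of the part
	Filename    string // empty when the part carries no filename
	Data        []byte // body after transfer decoding
}

// sampleMessage is a made-up two-part message used only for illustration.
const sampleMessage = "From: a@example.com\r\n" +
	"To: b@example.com\r\n" +
	"Subject: demo\r\n" +
	"MIME-Version: 1.0\r\n" +
	"Content-Type: multipart/mixed; boundary=\"XYZ\"\r\n" +
	"\r\n" +
	"--XYZ\r\n" +
	"Content-Type: text/plain; charset=utf-8\r\n" +
	"\r\n" +
	"Hello body\r\n" +
	"--XYZ\r\n" +
	"Content-Type: application/octet-stream\r\n" +
	"Content-Disposition: attachment; filename=\"note.txt\"\r\n" +
	"Content-Transfer-Encoding: base64\r\n" +
	"\r\n" +
	"aGVsbG8gYXR0YWNobWVudA==\r\n" +
	"--XYZ--\r\n"

// extractParts parses a raw message string and returns its top-level parts.
func extractParts(raw string) ([]Part, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return nil, err
	}
	mediaType, params, err := mime.ParseMediaType(msg.Header.Get("Content-Type"))
	if err != nil {
		return nil, err
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		return nil, fmt.Errorf("not a multipart message: %s", mediaType)
	}

	var parts []Part
	mr := multipart.NewReader(msg.Body, params["boundary"])
	for {
		p, err := mr.NextPart()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		// NextPart already decodes quoted-printable bodies transparently,
		// so only base64 needs an explicit decoder here.
		var r io.Reader = p
		if strings.EqualFold(p.Header.Get("Content-Transfer-Encoding"), "base64") {
			r = base64.NewDecoder(base64.StdEncoding, p)
		}
		data, err := io.ReadAll(r)
		if err != nil {
			return nil, err
		}
		parts = append(parts, Part{
			ContentType: p.Header.Get("Content-Type"),
			Filename:    p.FileName(),
			Data:        data,
		})
	}
	return parts, nil
}

func main() {
	parts, err := extractParts(sampleMessage)
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range parts {
		fmt.Printf("%s filename=%q size=%d\n", p.ContentType, p.Filename, len(p.Data))
	}
}
```

A part with a non-empty Filename can then be written to disk with os.WriteFile. Character-set conversion of text parts is left out of this sketch.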
Answer 3
Score: 0
There is an example in the standard library:
https://pkg.go.dev/mime/multipart@go1.16.6#example-NewReader
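Real-world mail often nests one multipart container inside another, for example multipart/alternative inside multipart/mixed. A minimal sketch of a recursive walk with mime/multipart follows. The names walk, leafTypes and nestedMessage are hypothetical and exist only for this illustration, and the leaf bodies are passed through without base64 decoding.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// nestedMessage is a made-up message with a multipart/alternative
// container nested inside a multipart/mixed container.
const nestedMessage = "Subject: nested\r\n" +
	"Content-Type: multipart/mixed; boundary=outer\r\n" +
	"\r\n" +
	"--outer\r\n" +
	"Content-Type: multipart/alternative; boundary=inner\r\n" +
	"\r\n" +
	"--inner\r\n" +
	"Content-Type: text/plain\r\n" +
	"\r\n" +
	"plain text\r\n" +
	"--inner\r\n" +
	"Content-Type: text/html\r\n" +
	"\r\n" +
	"<p>html</p>\r\n" +
	"--inner--\r\n" +
	"--outer--\r\n"

// walk calls visit for every leaf (non-multipart) part and descends
// into nested multipart containers.
func walk(contentType string, body io.Reader, visit func(mediaType string, data []byte)) error {
	if contentType == "" {
		// RFC 2045 default when a part has no Content-Type header.
		contentType = "text/plain"
	}
	mediaType, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return err
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		data, err := io.ReadAll(body)
		if err != nil {
			return err
		}
		visit(mediaType, data)
		return nil
	}
	mr := multipart.NewReader(body, params["boundary"])
	for {
		p, err := mr.NextPart()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if err := walk(p.Header.Get("Content-Type"), p, visit); err != nil {
			return err
		}
	}
}

// leafTypes returns the media types of all leaf parts, in document order.
func leafTypes(raw string) ([]string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return nil, err
	}
	var types []string
	err = walk(msg.Header.Get("Content-Type"), msg.Body, func(mediaType string, data []byte) {
		types = append(types, mediaType)
	})
	return types, err
}

func main() {
	types, err := leafTypes(nestedMessage)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(types)
}
```

The visit callback keeps the traversal separate from whatever is done with each part, so the same walk could feed a function that saves attachments or collects text bodies.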