What is the best way to get text from an .eml file?
Question
I am trying to get the sender, recipient, subject, and message body from several .eml files on my local drive. I tried Apache Commons Email, but sometimes it loops without reporting any errors. Here is my code, which is supposed to extract the text from an .eml file and save it as a .txt file:
MimeMessage mimeMessage = MimeMessageUtils.createMimeMessage(null, file);
// Parse once and reuse the parser; every parse() call walks the whole message again
MimeMessageParser parser = new MimeMessageParser(mimeMessage).parse();
if (parser.hasPlainContent()) {
    // Trying to get the text of the message
    try (FileWriter writer = new FileWriter(txtName)) {
        writeHeaders(writer, parser);
        writer.write(parser.getPlainContent());
    } catch (Exception e) { // writeHeaders is declared with "throws Exception"
        e.printStackTrace();
    }
} else if (parser.hasHtmlContent()) {
    try (FileWriter writer = new FileWriter(txtName)) {
        writeHeaders(writer, parser);
        // Strip the markup and keep only the visible text
        String text = Jsoup.parse(parser.getHtmlContent()).text();
        writer.write(text);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Also, here is the writeHeaders method:
private void writeHeaders(FileWriter writer, MimeMessageParser parser) throws Exception {
    writer.write("From: " + parser.getFrom() + "\n");
    writer.write("To: " + parser.getTo() + "\n");
    writer.write("Subject: " + parser.getSubject() + "\n");
    writer.write("Message:" + "\n" + "\n");
}
And here is the method to get attachments:
// parser has already been parsed by the code above
if (parser.hasAttachments()) {
    // Getting and saving attachments from the eml
    List<DataSource> attachments = parser.getAttachmentList();
    for (DataSource attachment : attachments) {
        if (attachment.getName() != null && !attachment.getName().isEmpty()) {
            try (InputStream is = attachment.getInputStream()) {
                File save = new File(saveDir + File.separator + attachment.getName());
                // The inner try-with-resources closes the file even if the copy fails
                try (FileOutputStream fos = new FileOutputStream(save)) {
                    byte[] buf = new byte[4096];
                    int bytesRead;
                    while ((bytesRead = is.read(buf)) != -1) {
                        fos.write(buf, 0, bytesRead);
                    }
                }
                // Attached .eml files are parsed recursively
                if (save.getName().endsWith("eml")) {
                    parseEml(save, count);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
So, are there any easier ways to get the text and attachments?
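A side note on the attachment code above: the attachment name comes from the email itself, so it may contain path separators or ".." segments. The following is a minimal standard-library sketch of saving an attachment stream under a fixed directory. The class name `AttachmentSaver` and both of its methods are hypothetical helpers invented for illustration; they are not part of Apache Commons Email or any other mail library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any mail library.
final class AttachmentSaver {

    private AttachmentSaver() {}

    /** Keeps only the last path segment, so "../../x.eml" becomes "x.eml". */
    static String safeFileName(String rawName) {
        if (rawName == null) {
            return "attachment";
        }
        // Treat both kinds of separator the same way
        String name = rawName.replace('\\', '/');
        name = name.substring(name.lastIndexOf('/') + 1).trim();
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return "attachment";
        }
        return name;
    }

    /** Copies the stream into saveDir, replacing any existing file of that name. */
    static Path save(InputStream in, Path saveDir, String rawName) {
        Path target = saveDir.resolve(safeFileName(rawName));
        try {
            Files.createDirectories(saveDir);
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

In the loop from the question, a call such as `AttachmentSaver.save(is, Paths.get(saveDir), attachment.getName())` could stand in for the manual buffer copy, assuming `saveDir` is a directory path string. The sketch does not filter characters that a particular file system rejects.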
Answer 1
Score: 3
Yes, much easier. Simple Java Mail (GitHub) can read .eml files and makes the content very accessible. If you run into something like a looping error there too (unlikely), I'll be happy to help (I actively maintain Simple Java Mail):
Email email = EmailConverter.emlToEmail(emlFile);
email.getFromRecipient();
email.getSubject();
email.getPlainText();
email.getHTMLText();
email.getAttachments();
email.getEmbeddedImages();
email.getHeaders();
// etc. etc.
It also supports S/MIME-encrypted emails (if you have the certificates required to decrypt them).
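Once the fields are available as plain strings (for example from getters like the ones listed above), writing the .txt file needs nothing beyond the standard library. What follows is only a sketch under assumptions: the class name `EmlTextExport`, its methods, and the header labels are hypothetical and not part of Simple Java Mail, and the sender, recipient, subject, and body are assumed to have been converted to strings already.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any mail library.
final class EmlTextExport {

    private EmlTextExport() {}

    /** Builds the layout used in the question: three header lines, then the body. */
    static String format(String from, String to, String subject, String body) {
        StringBuilder sb = new StringBuilder();
        sb.append("From: ").append(orEmpty(from)).append('\n');
        sb.append("To: ").append(orEmpty(to)).append('\n');
        sb.append("Subject: ").append(orEmpty(subject)).append('\n');
        sb.append("Message:\n\n");
        sb.append(orEmpty(body));
        return sb.toString();
    }

    /** Writes the formatted message as UTF-8, replacing any existing file. */
    static void write(Path target, String from, String to, String subject, String body) {
        byte[] bytes = format(from, to, subject, body).getBytes(StandardCharsets.UTF_8);
        try {
            Files.write(target, bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Missing fields are written as empty strings instead of the text "null". */
    private static String orEmpty(String s) {
        return s == null ? "" : s;
    }
}
```

Writing explicit UTF-8 bytes avoids depending on the platform default charset, which `FileWriter` uses on older Java versions.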