Format subject email

Question

I want to retrieve the decoded value of an email's Subject header:

import imaplib
import os
import email

email_user = 'xxxxxxx@xxxxxxx'
email_pass = 'xxxxxxxx'

M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login(email_user, email_pass)
M.select('INBOX')

typ, message_numbers = M.search(None, 'ALL')

num = b'2420'
typ, data = M.fetch(num, '(RFC822)')

raw_email = data[0][1].decode('utf-8')
email_message = email.message_from_string(raw_email)

print(email_message['Subject'])

The value I get is:

=?UTF-8?Q?=5BNAS=5FLEBARS=5D_Active_Backup_for_Business_=2D_La_t=C3=A2che_?=
=?UTF-8?Q?de_sauvegarde_DBS_=2D_SIDEXIS_sur_NAS=5FLEBARS_est_termin=C3=A9e?=

But I want it decoded like this:

[NAS_LEBARS] Active Backup for Business - La tâche de sauvegarde DBS - SIDEXIS sur NAS_LEBARS est terminée

Thanks

Answer 1

Score: -1

This does roughly what you need. It:

  • takes each encoded line
  • extracts the part that needs to be decoded
  • converts that,
    • replacing each '_' with a space
    • replacing each '=XX' with the byte with that hex code
    • leaving all other characters as is
  • decodes the resulting bytes as UTF-8

import re

subject = [
    '=?UTF-8?Q?=5BNAS=5FLEBARS=5D_Active_Backup_for_Business_=2D_La_t=C3=A2che_?=',
    '=?UTF-8?Q?de_sauvegarde_DBS_=2D_SIDEXIS_sur_NAS=5FLEBARS_est_termin=C3=A9e?='
]


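def convert_content(content):
    # Walk the Q-encoded payload one character at a time, yielding bytes.
    iter_content = iter(content)
    try:
        while True:
            ch = next(iter_content)
            if ch == '_':
                # In Q encoding, an underscore stands for a space.
                yield b' '
            elif ch == '=':
                # '=XX' is a single byte written as two hex digits.
                yield bytearray.fromhex(next(iter_content)+next(iter_content))
            else:
                yield ch.encode('utf-8')
    except StopIteration:
        # End of the payload (or a truncated '=XX' escape): stop quietly.
        pass


def process(data):
    for line in data:
        # Capture the payload between '=?UTF-8?Q?' and the closing '?='.
        # Only UTF-8, Q-encoded words are handled; any other line fails to match.
        m = re.match(r'=\?(?:utf|UTF)-8\?(?:q|Q)\?(.*)\?=', line)
        yield b''.join(convert_content(m.group(1))).decode('utf-8')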


print(''.join(process(subject)))

Output:

[NAS_LEBARS] Active Backup for Business - La tâche de sauvegarde DBS - SIDEXIS sur NAS_LEBARS est terminée
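
As an alternative to decoding by hand, the standard library already understands this header encoding (RFC 2047 "encoded words"). The sketch below is not part of the answer above; it swaps the manual decoder for `email.header.decode_header` and `email.header.make_header`, and also shows parsing a message with `policy=email.policy.default`, which is expected to hand back headers already decoded. The names `RAW_SUBJECT`, `decode_subject`, `raw_message` and `parsed` are made up for illustration, and the one-line message body is a stand-in for what `M.fetch` would return.

```python
from email import message_from_bytes, policy
from email.header import decode_header, make_header

# The folded header value from the question (illustrative constant).
RAW_SUBJECT = (
    "=?UTF-8?Q?=5BNAS=5FLEBARS=5D_Active_Backup_for_Business_=2D_La_t=C3=A2che_?=\n"
    " =?UTF-8?Q?de_sauvegarde_DBS_=2D_SIDEXIS_sur_NAS=5FLEBARS_est_termin=C3=A9e?="
)


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded header value into a plain str."""
    # decode_header returns (bytes_or_str, charset) pairs;
    # make_header combines them into a Header that str() renders as text.
    return str(make_header(decode_header(raw_subject)))


print(decode_subject(RAW_SUBJECT))

# Alternative: let the parser decode headers while parsing.
# raw_message is a stand-in for the bytes in data[0][1] from M.fetch.
raw_message = ("Subject: " + RAW_SUBJECT + "\n\nbody\n").encode("ascii")
parsed = message_from_bytes(raw_message, policy=policy.default)
print(parsed["Subject"])
```

With the code from the question, this would mean either calling `decode_subject(email_message['Subject'])`, or passing the fetched bytes straight to `email.message_from_bytes(data[0][1], policy=email.policy.default)` instead of decoding them to a string first. Unlike the manual decoder, this route is not limited to UTF-8 with Q encoding.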

huangapple
  • Posted on January 6, 2020 at 14:43:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59607753.html