How to iterate over two files in Python and grep only the line before a match

Question

This is names.txt:

David
Mary
Rose
Saeed

This is emails.txt:

  - address1@gmail.com
  - Mary
  - address2@gmail.com
  - Rose
  - address3@hotmail.com
  - David
  - address4@yahoo.com
  - Saeed
  - address5@gmail.com
  - Jones

emails.txt contains more entries than names.txt has names.

For each name in names.txt, I want to find that name in emails.txt and print only the line directly before it.

For example, for the name David I want to find the email address3@hotmail.com.

This is my Python code so far (it only prints the first line of names.txt):

import re

with open('names.txt', 'r') as file:
    with open('emails.txt', 'r') as emails:
        for name in file:
            for email in emails:
                if re.search(name, email):
                    print(email)

I'll update the question when I find something new to get closer to the answer.
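
A note on why the code above stops after the first name: a file object is an iterator, so the inner loop reads emails.txt to the end while handling the first name, and every later name finds nothing left to read. Below is a minimal sketch of that behaviour. It uses io.StringIO as a stand-in for the open file, and the sample text is made up for illustration.

```python
import io

# io.StringIO stands in for an open text file here (illustrative data only).
emails = io.StringIO("- a@example.com\n- Mary\n")

first_pass = [line.strip() for line in emails]
second_pass = [line.strip() for line in emails]  # the iterator is already exhausted

print(first_pass)   # ['- a@example.com', '- Mary']
print(second_pass)  # []

# Rewinding with seek(0), or reading the lines into a list once,
# makes a second pass over the same data possible.
emails.seek(0)
third_pass = [line.strip() for line in emails]
print(third_pass == first_pass)  # True
```

Reading both files into lists before looping avoids the problem entirely.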

Answer 1

Score: 1

Is this what you want?

with open('names.txt') as names_file, open('emails.txt') as emails_file:
    names = [line.strip() for line in names_file if line.strip()]
    # Drop surrounding whitespace and the leading "-" list marker once, up front.
    entries = [line.strip().lstrip('-').strip() for line in emails_file]

for name in names:
    # Start at index 1 so entries[i - 1] never wraps around to the last line.
    for i in range(1, len(entries)):
        # Exact comparison instead of re.search, so "Rose" does not match "Rosemary".
        if entries[i] == name:
            print(f"Name: {name}, Email: {entries[i - 1]}")
            break

Result

Name: David, Email: address3@hotmail.com
Name: Mary, Email: address1@gmail.com
Name: Rose, Email: address2@gmail.com
Name: Saeed, Email: address4@yahoo.com
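
If emails.txt strictly alternates between an address line and a name line, as it does in the question, another option is to pair the lines once and look names up in a dictionary. This is only a sketch under that assumption. The helper names `clean` and `build_lookup` are hypothetical, and the sample data is inlined so the snippet runs without the files.

```python
def clean(line):
    """Strip whitespace and the leading '-' list marker from one line."""
    return line.strip().lstrip('-').strip()


def build_lookup(email_lines):
    """Map each name to the address on the line directly before it.

    Assumes the lines alternate strictly: address, name, address, name, ...
    """
    entries = [clean(line) for line in email_lines if line.strip()]
    # entries[0::2] are the addresses, entries[1::2] are the names that follow them.
    return dict(zip(entries[1::2], entries[0::2]))


# Sample data copied from the question; in practice read the lines from the files.
email_lines = [
    "  - address1@gmail.com\n",
    "  - Mary\n",
    "  - address2@gmail.com\n",
    "  - Rose\n",
    "  - address3@hotmail.com\n",
    "  - David\n",
    "  - address4@yahoo.com\n",
    "  - Saeed\n",
    "  - address5@gmail.com\n",
    "  - Jones\n",
]
names = ["David", "Mary", "Rose", "Saeed"]

lookup = build_lookup(email_lines)
for name in names:
    print(f"Name: {name}, Email: {lookup.get(name, 'not found')}")
```

The dictionary is built in one pass, so each name costs a single lookup instead of a scan of the whole file. If the file does not alternate cleanly, the pairing breaks and scanning line by line is the safer choice.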

Posted by huangapple on July 11, 2023 at 04:21:46
Source: https://go.coder-hub.com/76657081.html