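Matching emails with a regexp on a website in Go
Question
I'm trying to count the email addresses on websites in Go, using a file that contains URLs. For example, if I put "http://facebook.com" in the file, the program should find every email address on that page, but the result is always 0. I think I chose the wrong function, but the other functions I tried gave the same result. Here is the code: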
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"log"
	"net/http"
	"os"
	"regexp"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go emailWeb(os.Args[1], &wg)
	wg.Wait()
}

func emailWeb(name string, wg *sync.WaitGroup) {
	file, err := os.Open(name)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		str := scanner.Text()
		nb_arobase := numberEmail(str)
		fmt.Println("URL : ", str, " nb email: ", nb_arobase)
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
	(*wg).Done()
}

func numberEmail(url string) int {
	count := 0
	reg := regexp.MustCompile(`[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,4}`)
	response, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	} else {
		str := response.Body
		buf := new(bytes.Buffer)
		buf.ReadFrom(str)
		bodyStr := buf.String()
		for i := 0; i < len(bodyStr); i++ {
			if reg.MatchString(string(bodyStr[i])) {
				count += 1
			}
		}
	}
	return count
}
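Answer 1
Score: 0
You're matching the regexp against each individual character of the HTTP response body, and a single character can never match an email pattern, so the count stays at 0. To count the matches in the whole body, run the regexp over the entire body and count the matched index pairs.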
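// Replacement body for numberEmail; reg is the compiled regexp from the question.
// Needs "io" in the import list (Go 1.16+; use ioutil.ReadAll on older versions).
resp, err := http.Get(url)
if err != nil {
	log.Println(err)
	return 0
}
defer resp.Body.Close()
// Read the whole response body into memory.
body, err := io.ReadAll(resp.Body)
if err != nil {
	log.Println(err)
	return 0
}
// FindAllIndex takes a match limit as its second argument; -1 means "all matches".
return len(reg.FindAllIndex(body, -1))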
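To see the difference without making an HTTP request, here is a minimal sketch that runs both approaches on a hard-coded string. The helper names `countPerChar` and `countWholeBody` and the sample addresses are made up for illustration; the pattern is the one from the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailRe is the pattern from the question, unchanged.
var emailRe = regexp.MustCompile(`[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,4}`)

// countPerChar (hypothetical name) reproduces the question's loop:
// every one-byte string is tested on its own, so nothing can match.
func countPerChar(body string) int {
	count := 0
	for i := 0; i < len(body); i++ {
		if emailRe.MatchString(string(body[i])) {
			count++
		}
	}
	return count
}

// countWholeBody (hypothetical name) counts the non-overlapping matches
// over the full text; -1 asks for all matches.
func countWholeBody(body string) int {
	return len(emailRe.FindAllStringIndex(body, -1))
}

func main() {
	// Sample text standing in for a downloaded page.
	sample := `<p>Contact alice@example.com or bob@example.org.</p>`
	fmt.Println("per character:", countPerChar(sample))
	fmt.Println("whole body:   ", countWholeBody(sample))
}
```

Note that the pattern only contains lowercase character classes, so an address such as `Alice@Example.com` would be matched only partially or not at all; prefixing the pattern with `(?i)` would make it case-insensitive, if that is what you want.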