How to reuse HTTP request instance in Go

Question

I'm building an API that scrapes some data off a webpage.

To do so, I need to send a GET request to the home page, scrape a 'RequestVerificationToken' value from the HTML, then send a POST request to the same URL with a username, password, and the RequestVerificationToken.

I've been able to do this previously with Python:

import requests

session_requests = requests.session()
result = session_requests.get(LOGIN_URL)
# createBS4Parser is my own helper that wraps BeautifulSoup
parser = createBS4Parser(result.text)
token = parser.find('input', attrs={'name': '__RequestVerificationToken'})["value"]

pageDOM = session_requests.post(
    LOGIN_URL,
    data=requestPayload,  # RequestVerificationToken is in here
    headers=requestHeaders
)

It seems that when I reuse the session_requests variable in Python, the second request reuses the state of the first one.

However, when I try to do this in Go, I get an error due to an invalid token. I assume this is because Go uses a fresh instance for the POST request.

Is there any way I can get the same behavior from Go as I was with Python?

Answer 1

Score: 1

package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly"
	"github.com/gocolly/colly/proxy"
)

func main() {
	// Create the collector. AllowURLRevisit lets it request the login URL
	// twice: once with GET and once with POST.
	c := colly.NewCollector(colly.AllowURLRevisit())

	// Define the proxy chain.
	revpro, err := proxy.RoundRobinProxySwitcher("socks5://127.0.0.1:9050", "socks5://127.0.0.1:9050")
	if err != nil {
		log.Fatal(err)
	}
	c.SetProxyFunc(revpro)

	// Extract the CSRF token required for the login from the form's hidden input.
	c.OnHTML("form[role=form] input[type=hidden][name=CSRF_TOKEN]", func(e *colly.HTMLElement) {
		csrftok := e.Attr("value")
		fmt.Println(csrftok)
		// Post the CSRF token together with the credentials, using the same collector.
		err := c.Post("https://www.something.com/login.jsp", map[string]string{"CSRF_TOKEN": csrftok, "username": "username", "password": "password"})
		if err != nil {
			log.Fatal(err)
		}
	})

	// Visit the login page; this triggers the callback above.
	if err := c.Visit("https://www.something.com/login.jsp"); err != nil {
		log.Fatal(err)
	}

	// Clone copies the collector without its callbacks but shares the HTTP
	// backend and cookie jar, so d keeps the session without logging in again.
	d := c.Clone()
	d.OnHTML("a[href]", func(e *colly.HTMLElement) {
		link := e.Attr("href")
		fmt.Printf("Link found: %q -> %s\n", e.Text, link)
	})

	if err := d.Visit("https://skkskskskk.htm"); err != nil {
		log.Fatal(err)
	}
}
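
For comparison, here is a minimal sketch of the same idea using only the standard library. What Python's requests.session() carries between calls is mostly the cookies, and the invalid-token error is likely caused by the cookie from the GET not being sent with the POST. Giving a single http.Client a cookie jar and using it for both requests should behave much like the Python session. The form field names "username" and "password", the helper names, and the regular expression (which assumes the name attribute comes before value in the input tag) are assumptions for illustration, not taken from the question; a real scraper would be better served by an HTML parser.

```go
package main

import (
	"errors"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/url"
	"regexp"
	"time"
)

// tokenPattern matches the hidden input that carries the token.
// Assumption: the name attribute appears before the value attribute.
var tokenPattern = regexp.MustCompile(`name="__RequestVerificationToken"[^>]*value="([^"]*)"`)

// newSessionClient returns a client whose cookie jar keeps cookies across
// requests, roughly what requests.session() does in Python.
func newSessionClient() (*http.Client, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	return &http.Client{Jar: jar, Timeout: 30 * time.Second}, nil
}

// extractToken pulls the token value out of the login page HTML.
func extractToken(page string) (string, error) {
	m := tokenPattern.FindStringSubmatch(page)
	if m == nil {
		return "", errors.New("verification token not found")
	}
	return m[1], nil
}

// login fetches the form and then posts the credentials with the same client,
// so the cookies set by the GET response are sent along with the POST.
// The caller is responsible for closing the returned response body.
func login(client *http.Client, loginURL, username, password string) (*http.Response, error) {
	resp, err := client.Get(loginURL)
	if err != nil {
		return nil, err
	}
	body, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil {
		return nil, err
	}

	token, err := extractToken(string(body))
	if err != nil {
		return nil, err
	}

	form := url.Values{
		"__RequestVerificationToken": {token},
		"username":                   {username}, // hypothetical field name
		"password":                   {password}, // hypothetical field name
	}
	return client.PostForm(loginURL, form)
}
```

To use it, create the client once with newSessionClient() in main and pass that same client to login and to every later request. Creating a new http.Client (or using one without a Jar) for the POST would drop the cookies again.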

huangapple
  • Published on August 12, 2022 at 17:24:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73331966.html