Why do I get a timeout in PhpStorm HTTP Client and Laravel HTTP Client but not in Postman?
Question
When I send a plain GET request with Postman, it consistently takes about 228 ms and returns a JSON response. I disabled cookies and headers in the request so that every client sends exactly the same request, and the generated curl command seems to confirm this. Please take a close look at my first two Postman screenshots for proof (I would doubt me as well): no headers at all, and even the debug console in the second image shows that no headers are sent.
When I try the same thing with PhpStorm's HTTP client, I get a timeout (except for one time, when it worked after 30 seconds).
Same thing in code:
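    use Illuminate\Support\Facades\Http;

    // Plain GET with a 10-second timeout; no headers are set explicitly.
    $result = Http::timeout(10)
        ->get('https://mobileapi.jumbo.com/v17/products');
    dump($result->body());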
Making the request in curl is even weirder.
So we have 3 different responses for seemingly the same request:
- Postman: fast response (< 400ms) with expected body.
- Both PhpStorm and PHP code: timeout even with more than 10 seconds of allowed time.
- Curl request generated by Postman: Access Denied.
Update: Found a solution to my problem, but the reason why Postman works without it still baffles me: A User-Agent header containing exactly "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0" works. However, as you can see in the above screenshots, I don't send this header in Postman, and there it works without. This question as to why this happens still stands.
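The workaround from the update can be sketched in plain PHP as follows. This is only a minimal illustration, assuming the built-in HTTP stream wrapper is available; the helper names buildHttpOptions and fetchWithUserAgent are hypothetical and not part of Laravel or any library.

```php
<?php
// Hypothetical helper: builds stream-context options that carry an explicit User-Agent.
function buildHttpOptions(string $userAgent, float $timeoutSeconds = 10.0): array
{
    return [
        'http' => [
            'method'        => 'GET',
            'header'        => "User-Agent: {$userAgent}\r\n",
            'timeout'       => $timeoutSeconds, // read timeout in seconds
            'ignore_errors' => true,            // return the body even on 4xx/5xx responses
        ],
    ];
}

// Hypothetical helper: performs the GET and returns the body, or null on failure.
function fetchWithUserAgent(string $url, string $userAgent, float $timeoutSeconds = 10.0): ?string
{
    $context = stream_context_create(buildHttpOptions($userAgent, $timeoutSeconds));
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}
```

Calling fetchWithUserAgent('https://mobileapi.jumbo.com/v17/products', $ua) with the Firefox string quoted above as $ua would mirror the fix. In Laravel the same idea should be expressible as Http::withHeaders(['User-Agent' => $ua])->timeout(10)->get($url).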
Answer 1
Score: 1
Why do you get different behaviors using seemingly the same request?
The web server almost certainly applies request filtering rules based on HTTP headers.
Even if it looks as though all HTTP clients send the same request, in reality each one silently attaches slightly different headers. To check the differences between the clients, I created a RequestBin (https://public.requestbin.com).
curl --location 'https://...':
    Host: envyr05unq3bk.x.pipedream.net
    X-Amzn-Trace-Id: Root=1-64a95ba2-4633081a75af397c63ece198
    User-Agent: curl/7.71.1
    Accept: */*
Postman with Host header only:
    Host: envyr05unq3bk.x.pipedream.net
    X-Amzn-Trace-Id: Root=1-64a95bac-102cc1e826a026f328c6c583
PhpStorm HTTP Client:
    Host: envyr05unq3bk.x.pipedream.net
    X-Amzn-Trace-Id: Root=1-64a95cc5-40d9098a346629e74f52f4d9
    User-Agent: Apache-HttpClient/4.5.14 (Java/17.0.7)
    Accept-Encoding: br,deflate,gzip,x-gzip
Checking against your domain, it seems that setting the User-Agent to curl/7.71.1 or Apache-HttpClient/4.5.14 (Java/17.0.7) causes a timeout. As mentioned in other responses, there is probably a blacklist for user agents.
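The comparison of the captured requests can also be done mechanically. The sketch below lists which headers each client sends beyond the Postman baseline; the header names are copied from the RequestBin captures above, while the helper extraHeaders is a hypothetical name chosen for illustration.

```php
<?php
// Header names captured in the RequestBin for each client (values omitted).
$clients = [
    'curl'     => ['Host', 'X-Amzn-Trace-Id', 'User-Agent', 'Accept'],
    'Postman'  => ['Host', 'X-Amzn-Trace-Id'],
    'PhpStorm' => ['Host', 'X-Amzn-Trace-Id', 'User-Agent', 'Accept-Encoding'],
];

// Hypothetical helper: headers a client sends that the baseline client does not.
function extraHeaders(array $clientHeaders, array $baselineHeaders): array
{
    return array_values(array_diff($clientHeaders, $baselineHeaders));
}

// Print the extra headers of every client relative to Postman.
foreach ($clients as $name => $headers) {
    $extra = extraHeaders($headers, $clients['Postman']);
    echo $name, ': ', $extra === [] ? '(none)' : implode(', ', $extra), PHP_EOL;
}
```

Both curl and PhpStorm show User-Agent among their extra headers, which is what points to the User-Agent as the likely filtering criterion.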
Answer 2
Score: 0
It's often due to subtle differences in the way the requests are sent or received. Here are some possible reasons:
- Headers: Postman might automatically include headers that PhpStorm, Laravel, or curl don't.
- SSL/TLS configuration: if the server uses HTTPS, the SSL/TLS setup could cause differences.
- Proxies: differences in proxy settings across the clients could affect the results.
- Cookies: ensure cookies are truly disabled in all clients.
It's important to compare the actual requests sent by each client to identify any discrepancies.
Another possible reason is firewall settings.
Answer 3
Score: 0
Postman sends a default User-Agent with every request (as one of its temporary headers) if you don't specify a different one. This is likely why you're able to make the request in Postman without explicitly setting the User-Agent header. The API endpoint probably recognizes Postman's default User-Agent and allows the request.
You could try enabling the header in the Headers section but leaving the value field blank. This will override the temporary headers set by Postman.
PhpStorm's built-in HTTP client does not automatically add a User-Agent header to requests. If you want to include one, you need to add it to your request manually.
The API endpoint probably requires some headers to be set; otherwise it blocks the request.
Answer 4
Score: 0
Weird indeed, but the answer is quite simple: the site uses a blacklist to filter User-Agents, and quite possibly does so with a homemade solution.
First, look at their homepage's robots.txt, which may indicate that they've seen increased traffic due to ChatGPT bots or simply don't want to be scraped. It's a rather old-fashioned approach, and the disallow list in there looks manually put together.
Second, at least for the API, they seem to parse the User-Agent and have some checks in place. Setting the UA to \0, for instance, produces a 400 Bad Request reply.
Third, using curl and setting the User-Agent to Cool returns a proper result:
$ curl --header "User-Agent: Cool" https://mobileapi.jumbo.com/v17/products
Fourth, the letters php seem to be on that blacklist, which means any User-Agent that starts with them (PhpStorm, phpstorm, php2020) seems to lead to no reply. The User-Agents ThePhp or ThisIsA Php Bot work, though.
Finally, they also seem to use a whitelist. Set the header to PhpStorm and the request times out. Set it to Mozilla PhpStorm and a proper reply comes back, but not for PhpMozilla or Php Mozilla.
From that I can only assume they first check for known browser identifiers and reply if they find one, then check for blacklisted strings (but without properly going through the substrings), and if they find neither, they still reply (which seems to be the case for Postman and my absurd examples).
For instance, setting the UA to StackOverflow also yields a good reply. Ah, and guzzle seems to be on that blacklist as well (I guess that's your Laravel UA).
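The behaviour guessed at above can be written down as a small model. This is purely a sketch of the inferred rules, not the site's actual implementation: the two lists contain only the strings tested in this answer, prefix matching is an assumption made because Php Mozilla gets no reply, and the function name wouldServerReply is hypothetical.

```php
<?php
// Assumed lists, inferred only from the experiments described above; the real rules are unknown.
const BROWSER_PREFIXES = ['mozilla'];
const BLOCKED_PREFIXES = ['php', 'guzzle'];

// Hypothetical model: returns true when the server would answer the request.
function wouldServerReply(string $userAgent): bool
{
    $ua = strtolower($userAgent);

    // 1. A User-Agent that starts with a known browser identifier is answered.
    foreach (BROWSER_PREFIXES as $prefix) {
        if (strncmp($ua, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }

    // 2. A User-Agent that starts with a blacklisted string gets no reply.
    foreach (BLOCKED_PREFIXES as $prefix) {
        if (strncmp($ua, $prefix, strlen($prefix)) === 0) {
            return false;
        }
    }

    // 3. Everything else, including an empty User-Agent, is answered.
    return true;
}
```

Under this model PhpStorm, php2020, and PhpMozilla are rejected, while Mozilla PhpStorm, ThePhp, Cool, and StackOverflow pass, which matches the observations listed in this answer.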
TL;DR: Postman works with the headers you've set because every User-Agent works that is not on the (weirdly parsed) blacklist.
On a serious note, someone should tell them about rate limiting or some other way to protect against excessive traffic.