Trying to redirect country-specific users without affecting search engine crawlers
Question
I'm trying to prevent US visitors from accessing non-US pages by redirecting them with a Cloudflare Worker. However, I still want search engines to be able to crawl the entire site, so verified crawlers should bypass the redirect.
Cloudflare gives me the [visitor's country](https://developers.cloudflare.com/workers/runtime-apis/request/#incomingrequestcfproperties) and whether the visitor is a [verified bot](https://radar.cloudflare.com/traffic/verified-bots).
All US page URLs on the site are prefixed with `/us/`, so a basic regex can check which part of the site is being requested.
This is what I have implemented:
```js
export default {
  async fetch(request) {
    // Get the visitor's country code.
    // @link https://developers.cloudflare.com/workers/runtime-apis/request/
    const visitorCountry = request.cf?.country;

    // Get the bot management status.
    // @link https://developers.cloudflare.com/bots/reference/bot-management-variables/#workers-variables
    // @link https://radar.cloudflare.com/traffic/verified-bots
    const requestIsVerifiedBot = request?.cf?.botManagement?.verifiedBot;

    const requestUrl = new URL(request.url);
    const requestUrlIsUs = requestUrl.pathname.match(/^\/us\/?$|^\/us\/.*$/i)?.length;

    // If the visitor is from the US, is accessing a non-US page, and is not a verified bot.
    if (visitorCountry === 'US' && !requestUrlIsUs && !requestIsVerifiedBot) {
      return Response.redirect('https://example.com/us/', 301); // Go back to the US homepage.
    }

    // Continue through.
    return fetch(request);
  }
}
```
The redirect works as expected. However, the verified bot check always appears to fail, so crawlers are redirected as well. I'm not sure why this is happening; based on the Cloudflare documentation, this should work.
Any help would be appreciated!
Answer 1
Score: 0
You will need to enable Bot Management on your account. This is a paid Cloudflare feature.
Unfortunately, due to a historical accident, `request.cf.botManagement` shows up even if you have not subscribed to Bot Management, but in that case it contains dummy values: the content is the same regardless of the request. We (Cloudflare) would like to remove the property entirely when you don't have a subscription, but some Worker scripts in the wild accidentally depend on this field existing even though their owners haven't subscribed to the feature, so getting rid of it is complicated.
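To make the dependency on Bot Management easier to see, the redirect decision can be pulled out into a small pure function. The following is only a minimal sketch, not an official Cloudflare pattern: `shouldRedirectToUs` and `worker` are hypothetical names, and it assumes that the dummy `verifiedBot` value is falsy when there is no subscription, as the behaviour described in the question suggests.

```javascript
// Hypothetical helper: decides whether a request should be redirected to /us/.
// It takes plain values so the decision can be checked outside a Worker.
function shouldRedirectToUs({ country, pathname, verifiedBot }) {
  // Matches "/us", "/us/" and anything below "/us/", case-insensitively.
  const isUsPath = /^\/us(\/|$)/i.test(pathname);
  // Only an explicit `true` exempts the request from the redirect.
  return country === 'US' && !isUsPath && verifiedBot !== true;
}

// Sketch of a handler built on the helper. In a Worker module this object
// would be the default export (`export default worker;`).
const worker = {
  async fetch(request) {
    const { pathname } = new URL(request.url);
    const redirect = shouldRedirectToUs({
      country: request.cf?.country,
      pathname,
      // Only meaningful with a Bot Management subscription;
      // without one this field holds a dummy value.
      verifiedBot: request.cf?.botManagement?.verifiedBot,
    });
    return redirect
      ? Response.redirect('https://example.com/us/', 301)
      : fetch(request);
  },
};
```

If `verifiedBot` never becomes `true`, the last condition never exempts a request, which is consistent with crawlers being redirected along with everyone else. Once Bot Management is enabled, the same logic should let verified bots through.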