I don't understand Cloudflare's cache

Question

I have a managed web server hosting about 4,000 images (7.4 GB in total), and I've enabled the Cloudflare CDN for it.

I ran a curl script to download all the images. I expected them to be in the Cloudflare (free plan) cache afterwards, but when I download them again via curl, it is still very slow. I can also see in the web interface that most requests are not served from cache.

Then I ran curl with -I to check the headers, and ran it again a few minutes later. The results were completely different:

$ cat curl-output-headers.txt | grep 'cf-cache-status: REVALIDATED'|wc
    303     606    9090
$ cat curl-output-headers.txt | grep 'cf-cache-status: MISS'|wc
    549    1098   12627
$ cat curl-output-headers.txt | grep 'cf-cache-status: HIT'|wc
   3267    6534   71874

#a few minutes later

$ cat curl-output-headers2.txt | grep 'cf-cache-status: HIT'|wc
    457     914   10054
$ cat curl-output-headers2.txt | grep 'cf-cache-status: MISS'|wc
   2428    4856   55844
$ cat curl-output-headers2.txt | grep 'cf-cache-status: REVALIDATED'|wc
   1234    2468   37020
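
The per-status grep calls above can be folded into a single pass over the header dump. The following is a minimal Python sketch, assuming the dump has the same shape as curl-output-headers.txt (one `cf-cache-status: ...` line per response). The function names and the sample text are illustrative and not part of the original post.

```python
from collections import Counter


def tally_cache_status(header_text: str) -> Counter:
    """Count cf-cache-status values in a dump of HTTP response headers."""
    counts = Counter()
    for line in header_text.splitlines():
        name, sep, value = line.partition(":")
        # Header names are case-insensitive, so normalise before comparing.
        if sep and name.strip().lower() == "cf-cache-status":
            counts[value.strip().upper()] += 1
    return counts


def served_from_cache_ratio(counts: Counter) -> float:
    """Share of responses that Cloudflare answered from its cache.

    HIT and REVALIDATED are both served from the edge cache; REVALIDATED
    means a stale entry was re-checked against the origin first.
    """
    total = sum(counts.values())
    cached = counts["HIT"] + counts["REVALIDATED"]
    return cached / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample; in practice, read curl-output-headers.txt instead.
    sample = "HTTP/2 200\ncf-cache-status: HIT\n\nHTTP/2 200\ncf-cache-status: MISS\n"
    counts = tally_cache_status(sample)
    print(dict(counts), served_from_cache_ratio(counts))
```

Applied to the counts in the post (4,119 responses per run), this ratio works out to roughly 87% for the first run and roughly 41% for the second.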

When I download the files via GET, it takes ~15 minutes.

A short summary of what I'm trying to do: a bulk upload to Amazon Seller via an Excel file.
It doesn't work because Amazon tries to download too many images at the same time, even with only 100 rows.
In my first attempt I used image URLs from the manufacturer's web server. If I download them manually, it takes 7 minutes, but that is still too slow for Amazon (though I guess Amazon downloads them in parallel, not serially like I do with curl).
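
The difference between a serial curl loop and a client that fetches in parallel can be sketched roughly as follows. This is only an illustration: how Amazon actually schedules its image downloads is not known from the post, and the worker count of 16 and the helper names `fetch_status` and `fetch_all` are assumptions.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def fetch_status(url: str, timeout: float = 30.0) -> str:
    """GET one URL and return its cf-cache-status header ("" if absent).

    urlopen raises an exception on HTTP errors; this sketch lets it propagate.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # drain the body so the whole object is transferred
        return response.headers.get("cf-cache-status", "")


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_status,
    workers: int = 16,  # assumed value, tune to what the origin can handle
) -> dict[str, str]:
    """Fetch URLs concurrently and map each URL to the value fetch returned."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(url_list, pool.map(fetch, url_list)))
```

Passing a stub for `fetch` makes it possible to exercise the concurrency logic without touching the network.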

However, my idea was to use a CDN, but somehow it became even worse. I'm sure I'm doing something wrong.

I also have a VPS with a dedicated IP; maybe I should use that server instead?

Answer 1

Score: 3

Cloudflare doesn't guarantee that every response from your server will be cached, especially on a free account. With a free account, they may not cache a resource until many requests for it start coming in.

Also, by default each Cloudflare edge location keeps its own separate cache, so if your requests end up routed to different edge locations, they will see different cache contents.

You would need to enable Tiered Caching to fix the "separate cache per edge location" issue, and you would need to enable Cache Reserve to make sure your server's responses stay cached.
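
One way to see whether requests are landing on different edge locations is to group the cache statuses by the suffix of the `cf-ray` response header, which carries a short code for the data center that handled the request. The following is a minimal sketch, assuming the header dump from `curl -I` contains both `cf-ray` and `cf-cache-status` lines and that responses are separated by blank lines. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def parse_headers(block: str) -> dict[str, str]:
    """Turn one response's header lines into a lowercase-keyed dict."""
    headers = {}
    for line in block.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # status lines such as "HTTP/2 200" have no colon
            headers[name.strip().lower()] = value.strip()
    return headers


def status_by_edge(header_dump: str) -> dict[str, Counter]:
    """Group cf-cache-status counts by the data-center suffix of cf-ray."""
    per_edge = defaultdict(Counter)
    # curl -I ends each response's headers with a blank line.
    for block in header_dump.replace("\r\n", "\n").split("\n\n"):
        headers = parse_headers(block)
        ray = headers.get("cf-ray", "")
        status = headers.get("cf-cache-status", "")
        if not ray or not status:
            continue
        # cf-ray looks like "<id>-<code>"; the code names the edge location.
        edge = ray.rsplit("-", 1)[-1] if "-" in ray else "unknown"
        per_edge[edge][status.upper()] += 1
    return dict(per_edge)
```

If the result shows several edge codes, each with its own mix of HIT and MISS, that is consistent with the per-location caches described above.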

huangapple
  • Published on August 10, 2023 at 21:24:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76876160.html