Browser performance for advertising or tracking
Question
Is there data on how much the average website today busies a web browser with things like advertising or tracking/analytics? For example, what percentage of RAM use or HTTP traffic is down to tracking/ads?
Answer 1
Score: 2
Short answer: the difference between low and high usage is so big that an average or median makes no sense. You have to break sites into segments and analyze each segment. If you don't, you will find that marketing/analytics has virtually no impact on the average. The global average/median impact is so low that any statistics on it would be either obvious or misleading.
But even the "high usage" of analytics and marketing is often barely noticeable or significant. This is mostly because of how powerful our hardware is and how little marketing/analytics need to track things and pass the data over. You would think that the data implies size, but it doesn't. Even here, on SO an analytics call is about a thousand times smaller than the size of a page, not counting all the core JS and supplementing JS libraries. In vast majority of cases, marketing and analytics are wrong places to go for page speed improvements even when lighthouse frowns upon "unused JS" in your favourite tag management system.
But I got curious, so let's dive a bit deeper.
Base
After a quick search, it appears that:
- There are about 1 billion websites on the web.
- About 28 million websites use the most primitive and easy analytics tool: Google Analytics.
If you use the 1 billion as the base, then the amount of RAM/network the average site spends on this is minuscule. Let's not do that.
Let's take the 28 million GA websites plus about 10% on top as the base, because GA is at almost 90% market share (as pretty much the only free solution that hosts your data for you). Besides GA, there are other analytics systems. There's the open-source Matomo, which people mostly don't use because of how difficult it is to host your own analytics solution. There are things like Adobe Analytics, the next step up in analytics solutions, used mostly by Fortune 500 corporations. And there are less significant systems like Snowplow, Tealium, Ensighten and so on. Often GA is used in parallel with the others.
Basic Tracking
The vast majority of sites would just have the very basic tracking consisting of either analytics.js or gtag.js. Both are about 80 KB. The library is loaded once per site and is kept cached. The amount of memory they use is probably comparable to their size, since all they do is send very basic tracking info on every pageview in an async, non-blocking way. They would typically send one network request per page (or per history change, in the case of gtag.js), and that non-blocking request would be about 60 bytes. Not kilobytes. Bytes. Take SO as an example:
SO's tracking is not very elaborate, but it's definitely more than the default tracking, and the default is definitely the median out there, given how much harder it is to move from default to custom tracking.
Other analytics systems wouldn't be much different in these terms.
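For reference, here is a minimal sketch of what that default gtag.js setup amounts to, written out in TypeScript for a browser context. The measurement ID "G-XXXXXXX" is a placeholder, and the snippet illustrates the mechanism rather than reproducing the exact official code.

```typescript
// Minimal default Google Analytics (gtag.js) setup, as described above.
// "G-XXXXXXX" is a placeholder measurement ID.
declare global {
  interface Window { dataLayer: unknown[]; }
}

// Load the ~80 KB library once, async and non-blocking; the browser caches it afterwards.
const ga = document.createElement("script");
ga.async = true;
ga.src = "https://www.googletagmanager.com/gtag/js?id=G-XXXXXXX";
document.head.appendChild(ga);

// Command queue: gtag.js drains window.dataLayer once it has loaded.
window.dataLayer = window.dataLayer || [];
function gtag(..._args: unknown[]) {
  // The official snippet pushes the Arguments object itself, which gtag.js expects.
  window.dataLayer.push(arguments);
}

gtag("js", new Date());
gtag("config", "G-XXXXXXX"); // each pageview then becomes one small hit (~tens of bytes)

export {}; // makes this file a module so the global augmentation above compiles
```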
Basic tracking conclusion:
Out of a billion sites, 28 million spend the following on marketing/analytics:
- About 100 KB of memory usage
- About 80 KB of network usage on the initial hit
- About 60 bytes of network usage on subsequent pageviews
Low-effort tracking
Good, we're done with the basics. Now let's get to our next stop: Google Tag Manager (GTM), or junior-to-mid-level analytics. Again, Google's stats show that about 6.3 million sites use GTM. Most of them likely have just a few tags published, or are simply empty, or were created to play around with, or were created via the API during the configuration of a CMS and never touched again.
GTM's size varies, since it's a very dynamic library and depends heavily on the number of entities used in it. Every time you add tags, triggers, or variables, or change anything, you have to "Publish" the changes, which effectively means rebuilding the library, after which the library's endpoint starts serving the updated version. The size ranges from about 100 to 200 KB. However, GTM is capable of dynamically loading other libraries on demand from third-party endpoints.
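To make that concrete, here is a readable TypeScript rendering of what the standard GTM container snippet does. "GTM-XXXXXXX" is a placeholder container ID, and the final push is just an illustrative custom event, not something GTM requires.

```typescript
// A readable rendering of the standard GTM container snippet.
// "GTM-XXXXXXX" is a placeholder container ID.
declare global {
  interface Window { dataLayer: Record<string, unknown>[]; }
}

window.dataLayer = window.dataLayer || [];
window.dataLayer.push({ "gtm.start": new Date().getTime(), event: "gtm.js" });

// Loads the published container (~100-200 KB depending on how many entities it holds).
// The container may then inject further third-party libraries on demand.
const gtm = document.createElement("script");
gtm.async = true;
gtm.src = "https://www.googletagmanager.com/gtm.js?id=GTM-XXXXXXX";
document.head.appendChild(gtm);

// Later, the site pushes events; the published triggers decide which tags fire on them.
window.dataLayer.push({ event: "page_view", page_path: location.pathname });

export {}; // makes this file a module so the global augmentation above compiles
```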
GTM is a TMS (tag management system), and it has competitors. GTM's market share is similar to GA's, given how convenient and free it is to use. Its share is around 85%, and we can safely presume that if a site has a TMS but it's not GTM, it's an advanced setup.
How many GTM containers can be considered to be loading extra scripts? Most people would just have GTM load gtag.js or analytics.js, which we've already gone through in the basics. After gtag.js or analytics.js, the most popular analytics library to be loaded via GTM is most likely the Facebook pixel library.
It looks like Facebook is reluctant to share its pixel usage statistics, but estimates land at around 2 million sites. Sure, some sites would just use it directly from JS and ignore GTM, but it's hard to estimate how many do that, and it's definitely not good practice to delegate your marketing tracking to your front-end developers. Let's be conservative and err on the side of caution: presume all 2 million sites either use GTM in parallel or serve their FB pixels through GTM.
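For context, once GTM has injected the official Facebook pixel base code (which defines window.fbq and loads fbevents.js), the tracking itself is just small queued calls like the ones sketched below. The pixel ID and the AddToCart payload are placeholders, not values from the question.

```typescript
// Assuming the official Facebook pixel base code has already been injected by GTM,
// so window.fbq exists; the pixel ID below is a placeholder.
declare global {
  interface Window { fbq: (...args: unknown[]) => void; }
}

window.fbq("init", "0000000000000000"); // placeholder pixel ID
window.fbq("track", "PageView");        // one small hit per pageview
window.fbq("track", "AddToCart", { value: 19.99, currency: "USD" }); // a standard event

export {}; // makes this file a module so the global augmentation above compiles
```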
Medium tracking conclusion:
Out of 28 million sites that use GA, 6.4 million add the following usage:
- About 200 KB more of memory usage (due to GTM)
- About 100 KB of network usage on the initial hit (due to GTM)
- About 60 bytes of network usage on subsequent pageviews
Out of 6.4 million GTM sites, 2 million add the following usage:
- About 100 KB for fbevents.js and whatever it downloads, on the initial pageview only.
- About 50 bytes per pageview.
- At this point, such GTM setups would likely have additional tracking, so add 50 bytes for every scroll and every meaningful CTA click.
- Add 100 bytes per e-commerce action: add/remove from cart, checkout steps, PDP view, purchase (sketched below).
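The engagement and e-commerce hits from the list above typically originate from dataLayer pushes like those sketched here, which GTM triggers then relay to GA, the Facebook pixel and so on as small requests. The event and parameter names follow common GA4 e-commerce conventions and are only examples.

```typescript
// Sketch of how the engagement and e-commerce events listed above are fed to GTM.
// Event and parameter names follow common GA4 e-commerce conventions; adjust as needed.
declare global {
  interface Window { dataLayer: Record<string, unknown>[]; }
}
window.dataLayer = window.dataLayer || [];

// Engagement: a meaningful CTA click (roughly 50 bytes on the wire once a tag relays it).
window.dataLayer.push({ event: "cta_click", cta_id: "signup_hero" });

// E-commerce: add to cart (roughly 100 bytes once relayed as a hit).
window.dataLayer.push({
  event: "add_to_cart",
  ecommerce: {
    currency: "USD",
    value: 19.99,
    items: [{ item_id: "SKU-123", item_name: "Example product", quantity: 1 }],
  },
});

// E-commerce: purchase.
window.dataLayer.push({
  event: "purchase",
  ecommerce: { transaction_id: "T-1001", currency: "USD", value: 19.99 },
});

export {}; // makes this file a module so the global augmentation above compiles
```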
High-effort tracking
This is basically large-to-enterprise businesses that easily spend over $100k per month buying traffic. It's hard to estimate the number of such sites without internal stats from the ad platforms.
Because they would buy traffic from various vendors at the same time, for multiple marketing campaigns running in parallel, they end up using more libraries and pixels. It's common to see about five different vendors tracking in parallel in addition to what we already have, so:
Out of the 2 million Facebook pixel users, let's say 1% of them add the following usage:
- About 500 KB for various third-party marketing libraries, on the initial pageview only.
- About 250 bytes per pageview.
- About 250 bytes for each meaningful user conversion: mostly the starts of important funnels, then meaningful milestones within them, and completions.
Now, it may seem like loading a megabyte's worth of libraries on the initial page load is likely to delay the page. It's not. Very unlikely. These libraries are loaded asynchronously, and the loading usually starts after the initial rendering of the DOM is done. Also, even a megabyte is nothing in terms of modern internet speeds. You can test it: load sites that carry a bunch of pixels with the pixels' libraries blocked and compare that to normal loads. You'd be very unlikely to notice any significant difference.
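If you want to put numbers on this for a given page yourself, the browser's Resource Timing API is enough. Here is a rough sketch you could paste into the console; the host list is only an illustrative sample of common tracking endpoints, not an exhaustive one.

```typescript
// Rough measurement of how much of a page's network transfer went to tracking,
// using the Resource Timing API. The host list is only an illustrative sample.
const trackingHosts = [
  "google-analytics.com",
  "googletagmanager.com",
  "connect.facebook.net",
  "doubleclick.net",
];

const resources = performance.getEntriesByType("resource") as PerformanceResourceTiming[];

let trackingBytes = 0;
let totalBytes = 0;
for (const r of resources) {
  // transferSize is 0 for cached entries and for cross-origin responses
  // that don't send Timing-Allow-Origin, so treat the result as a lower bound.
  totalBytes += r.transferSize;
  if (trackingHosts.some((host) => r.name.includes(host))) {
    trackingBytes += r.transferSize;
  }
}

console.log(
  `tracking: ${(trackingBytes / 1024).toFixed(1)} KB of ${(totalBytes / 1024).toFixed(1)} KB transferred`
);
```

Run it once on a normal load and once with the tracking hosts blocked (for example via the browser devtools request-blocking feature) and compare the load metrics, as suggested above.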