英文:
How to count page Views and Exits in Google Big Query (linked to GA4)?
问题
在Google BigQuery中,我正在尝试通过查询来计算页面浏览次数和“退出”次数。
我编写了以下代码来按页面路径计算页面浏览次数,但仍在尝试计算“退出”次数,其中“退出”计算了会话的最后一个事件发生在特定屏幕上的次数,换句话说,结束在页面路径上的会话:
SELECT
event_params.value.string_value AS page_path,
COUNT(*) AS page_views
FROM
`MY_ga4_dataset.events_*`,
UNNEST(event_params) AS event_params
WHERE
_table_suffix BETWEEN '20230207' AND '20230207'
AND event_name = 'page_view'
AND event_params.key = 'page_location'
GROUP BY
page_path
ORDER BY
page_views DESC
能否提出解决方案?
这是我的模式:
(您提供的模式内容已经超出了一次性翻译的字符限制,请分开提问或缩减文本后重新提问)
英文:
In Google BigQuery linked to GA4, I am trying to count the number of views and "Exits" via query
I wrote the following code to count the views by page path, however still struggling to count the "Exits", where where “Exits” counts the number of times that the last event recorded for a session occurred on a particular screen, in other words sessions that ended on a page path
SELECT
event_params.value.string_value AS page_path,
COUNT(*) AS page_views
FROM
`MY_ga4_dataset.events_*`,
UNNEST(event_params) AS event_params
WHERE
_table_suffix BETWEEN '20230207' AND '20230207'
AND event_name = 'page_view'
AND event_params.key = 'page_location'
GROUP BY
page_path
ORDER BY
page_views DESC
Could you propose the solution?
This is my Schema:
fullname mode type description
event_date NULLABLE STRING
event_timestamp NULLABLE INTEGER
event_name NULLABLE STRING
event_params REPEATED RECORD
event_previous_timestamp NULLABLE INTEGER
event_value_in_usd NULLABLE FLOAT
event_bundle_sequence_id NULLABLE INTEGER
event_server_timestamp_offset NULLABLE INTEGER
user_id NULLABLE STRING
user_pseudo_id NULLABLE STRING
privacy_info NULLABLE RECORD
user_properties REPEATED RECORD
user_first_touch_timestamp NULLABLE INTEGER
user_ltv NULLABLE RECORD
device NULLABLE RECORD
geo NULLABLE RECORD
app_info NULLABLE RECORD
traffic_source NULLABLE RECORD
stream_id NULLABLE STRING
platform NULLABLE STRING
event_dimensions NULLABLE RECORD
ecommerce NULLABLE RECORD
items REPEATED RECORD
event_params.key NULLABLE STRING
event_params.value NULLABLE RECORD
event_params.value.string_value NULLABLE STRING
event_params.value.int_value NULLABLE INTEGER
event_params.value.float_value NULLABLE FLOAT
event_params.value.double_value NULLABLE FLOAT
privacy_info.analytics_storage NULLABLE STRING
privacy_info.ads_storage NULLABLE STRING
privacy_info.uses_transient_token NULLABLE STRING
user_properties.key NULLABLE STRING
user_properties.value NULLABLE RECORD
user_properties.value.string_value NULLABLE STRING
user_properties.value.int_value NULLABLE INTEGER
user_properties.value.float_value NULLABLE FLOAT
user_properties.value.double_value NULLABLE FLOAT
user_properties.value.set_timestamp_micros NULLABLE INTEGER
user_ltv.revenue NULLABLE FLOAT
user_ltv.currency NULLABLE STRING
device.category NULLABLE STRING
device.mobile_brand_name NULLABLE STRING
device.mobile_model_name NULLABLE STRING
device.mobile_marketing_name NULLABLE STRING
device.mobile_os_hardware_model NULLABLE STRING
device.operating_system NULLABLE STRING
device.operating_system_version NULLABLE STRING
device.vendor_id NULLABLE STRING
device.advertising_id NULLABLE STRING
device.language NULLABLE STRING
device.is_limited_ad_tracking NULLABLE STRING
device.time_zone_offset_seconds NULLABLE INTEGER
device.browser NULLABLE STRING
device.browser_version NULLABLE STRING
device.web_info NULLABLE RECORD
device.web_info.browser NULLABLE STRING
device.web_info.browser_version NULLABLE STRING
device.web_info.hostname NULLABLE STRING
geo.continent NULLABLE STRING
geo.country NULLABLE STRING
geo.region NULLABLE STRING
geo.city NULLABLE STRING
geo.sub_continent NULLABLE STRING
geo.metro NULLABLE STRING
app_info.id NULLABLE STRING
app_info.version NULLABLE STRING
app_info.install_store NULLABLE STRING
app_info.firebase_app_id NULLABLE STRING
app_info.install_source NULLABLE STRING
traffic_source.name NULLABLE STRING
traffic_source.medium NULLABLE STRING
traffic_source.source NULLABLE STRING
event_dimensions.hostname NULLABLE STRING
ecommerce.total_item_quantity NULLABLE INTEGER
ecommerce.purchase_revenue_in_usd NULLABLE FLOAT
ecommerce.purchase_revenue NULLABLE FLOAT
ecommerce.refund_value_in_usd NULLABLE FLOAT
ecommerce.refund_value NULLABLE FLOAT
ecommerce.shipping_value_in_usd NULLABLE FLOAT
ecommerce.shipping_value NULLABLE FLOAT
ecommerce.tax_value_in_usd NULLABLE FLOAT
ecommerce.tax_value NULLABLE FLOAT
ecommerce.unique_items NULLABLE INTEGER
ecommerce.transaction_id NULLABLE STRING
items.item_id NULLABLE STRING
items.item_name NULLABLE STRING
items.item_brand NULLABLE STRING
items.item_variant NULLABLE STRING
items.item_category NULLABLE STRING
items.item_category2 NULLABLE STRING
items.item_category3 NULLABLE STRING
items.item_category4 NULLABLE STRING
items.item_category5 NULLABLE STRING
items.price_in_usd NULLABLE FLOAT
items.price NULLABLE FLOAT
items.quantity NULLABLE INTEGER
items.item_revenue_in_usd NULLABLE FLOAT
items.item_revenue NULLABLE FLOAT
items.item_refund_in_usd NULLABLE FLOAT
items.item_refund NULLABLE FLOAT
items.coupon NULLABLE STRING
items.affiliation NULLABLE STRING
items.location_id NULLABLE STRING
items.item_list_id NULLABLE STRING
items.item_list_name NULLABLE STRING
items.item_list_index NULLABLE STRING
items.promotion_id NULLABLE STRING
items.promotion_name NULLABLE STRING
items.creative_name NULLABLE STRING
items.creative_slot NULLABLE STRING
答案1
得分: 0
以下是代码的翻译部分:
"BigQuery does not count exits so it has to guess based on last event, which is a different methodology to GA4 which means the number of exits will always be slightly different":
"BigQuery 不计算退出,因此它必须根据最后一个事件来猜测,这与 GA4 不同,这意味着退出的数量始终会略有不同"
"Here is a code to show the previous and next page visited showing entrances and exits":
"以下是一个代码,用于显示先前和后续访问的页面,显示入口和退出"
接下来是代码的详细翻译,但我只会提供代码中的注释部分的翻译:
-
"with events as (":代码部分开始,定义一个名为 "events" 的临时数据表。
-
"select user_pseudo_id,":选择用户伪 ID。
-
"concat(":将字符串连接起来。
-
"(case when(":开始一个条件语句。
-
"select value.string_value":选择值为字符串类型的数据。
-
"from unnest(event_params)":从事件参数中获取数据。
-
"where key = 'session_engaged' AND event_name = 'page_view'":条件:键为 'session_engaged' 且事件名称为 'page_view'。
-
")=1 then 1 else 0 end) as engaged_sessions_id,":如果条件满足,则 engaged_sessions_id 为 1,否则为 0。
-
"event_name,":选择事件名称。
-
"event_timestamp,":选择事件时间戳。
-
"regexp_replace(":使用正则表达式替换。
-
"regexp_replace(":再次使用正则表达式替换。
-
"select p.value.string_value":选择值为字符串类型的数据。
-
"from unnest(event_params) as p":从事件参数中获取数据并将其命名为 "p"。
-
"where p.key = 'page_location'":条件:键为 'page_location'。
-
"), r'^https?://[^/]+', ''":替换正则表达式匹配的部分为 ''。
-
"r'[?].*', ''":替换正则表达式匹配的部分为 ''。
-
") as page_path,":将结果命名为 "page_path"。
-
"event_date as date":选择事件日期并将其命名为 "date"。
-
"FROM
bigquery-3.analytics_2.events_*
":从指定的 BigQuery 表格中获取数据。 -
"where _table_suffix between '20221129' and '20221130')":筛选数据,使其在指定日期范围内。
-
"select":选择以下列。
-
"page_path,":选择页面路径。
-
"date,":选择日期。
-
"event_timestamp,":选择事件时间戳。
-
"if(":开始条件语句。
-
"event_name = 'page_view',":如果事件名称为 'page_view'。
-
"coalesce(":如果前一个条件满足,则返回第一个非 NULL 值。
-
"last_value(":获取最后一个值。
-
"if(event_name = 'page_view', page_path, null) ignore nulls":如果事件名称为 'page_view',则返回页面路径,否则返回 NULL。
-
"over(":在指定窗口内。
-
"partition by unique_session_id":按唯一会话 ID 进行分区。
-
"order by event_timestamp asc rows between unbounded preceding and 1 preceding":按事件时间戳升序排序,考虑未限定的前导行和前 1 行。
-
"), '(entrance)'":如果未满足前一个条件,则返回 '(entrance)'。
-
"null":否则返回 NULL。
-
"as previous_page,":将结果命名为 "previous_page"。
-
"if(":开始条件语句。
-
"event_name = 'page_view',":如果事件名称为 'page_view'。
-
"coalesce(":如果前一个条件满足,则返回第一个非 NULL 值。
-
"first_value(":获取第一个值。
-
"if(event_name = 'page_view', page_path, null) ignore nulls":如果事件名称为 'page_view',则返回页面路径,否则返回 NULL。
-
"over(":在指定窗口内。
-
"partition by unique_session_id":按唯一会话 ID 进行分区。
-
"order by event_timestamp asc rows between 1 following and unbounded following":按事件时间戳升序排序,考虑后 1 行和未限定的后续行。
-
"), '(exit)'":如果未满足前一个条件,则返回 '(exit)'。
-
"null":否则返回 NULL。
-
"as next_page":将结果命名为 "next_page"。
-
"from":从以下数据表获取数据。
-
"events":数据表名称。
-
"where event_name = 'page_view'":筛选事件名称为 'page_view' 的数据。
英文:
BigQuery does not count exits so it has to guess based on last event, which is a different methodology to GA4 which means the number of exits will always be slightly different
Here is a code to show the previous and next page visited showing entrances and exits
with events as (
select
user_pseudo_id,
--get seesion ID
concat(
user_pseudo_id,
(
select
value.int_value
from
unnest(event_params)
where
key = 'ga_session_id'
)
) as unique_session_id,
(case when(
select
value.string_value
from
unnest(event_params)
where
key = 'session_engaged' AND event_name = 'page_view'
) ="1" then 1 else 0 end) as engaged_sessions_id,
event_name,
event_timestamp,
regexp_replace(
regexp_replace(
(
select
p.value.string_value
from
unnest(event_params) as p
where
p.key = 'page_location'
),
r'^https?://[^/]+',
''
),
r'[\?].*',
''
) as page_path,
event_date as date
FROM
`bigquery-3.analytics_2.events_*`
where
_table_suffix between '20221129'and '20221130')
select
page_path,
date,
event_timestamp,
-- look for the previous page_path
if(
event_name = 'page_view',
coalesce(
last_value(
if(event_name = 'page_view', page_path, null) ignore nulls
) over(
partition by unique_session_id
order by
event_timestamp asc rows between unbounded preceding
and 1 preceding
),
'(entrance)'
),
null
) as previous_page,
-- look for the next page_path
if(
event_name = 'page_view',
coalesce(
first_value(
if(event_name = 'page_view', page_path, null) ignore nulls
) over(
partition by unique_session_id
order by
event_timestamp asc rows between 1 following
and unbounded following
),
'(exit)'
),
null
) as next_page
from
events
where
event_name = 'page_view'
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论