BigQuery SQL fails with "Correlated subqueries that reference other tables are not supported unless they can be de-correlated"
Question
BigQuery fails on my query with the following error, and I'm not sure how to mitigate this.
> Query error: Correlated subqueries that reference other tables are not
> supported unless they can be de-correlated, such as by transforming
> them into an efficient JOIN. at [31:1]
#### Standalone SQL Query (and temporary tables) to reproduce:
create temporary table records (
  ID int64,
  Events array<struct<
    Tag string,
    Info string,
    Citations array<struct<
      Tag string,
      SourceID int64>>>>);
insert into records values
  (1, [
    ('A', 'C', [('AA', 1000), ('AB', 1001)]),
    ('A', 'C', [('AA', 1000), ('AB', 1001)]),
    ('B', 'D', [('BA', 1000), ('BB', 1001)])]);
create temporary table sources (
  ID int64,
  Title string);
insert into sources values
  (1000, "ABCD"),
  (1001, "EFGH");
select
  Record.ID,
  array(
    select as struct
      Event.Tag,
      Event.Info,
      array(
        select as struct
          Citation.SourceID,
          Citation.Tag,
          Source.Title
        from unnest(Event.Citations) as Citation
        left join sources as Source on Citation.SourceID = Source.ID
      ) as Citations
    from unnest(Record.Events) as Event) as Events
from records as Record;
The records table looks like (in JSON):
[{
  "ID": "1",
  "Events": [{
    "Tag": "A",
    "Info": "C",
    "Citations": [{
      "Tag": "AA",
      "SourceID": "1000"
    }, {
      "Tag": "AB",
      "SourceID": "1001"
    }]
  }, {
    "Tag": "A",
    "Info": "C",
    "Citations": [{
      "Tag": "AA",
      "SourceID": "1000"
    }, {
      "Tag": "AB",
      "SourceID": "1001"
    }]
  }, {
    "Tag": "B",
    "Info": "D",
    "Citations": [{
      "Tag": "BA",
      "SourceID": "1000"
    }, {
      "Tag": "BB",
      "SourceID": "1001"
    }]
  }]
}]
The sources table looks like:
[{
  "ID": "1000",
  "Title": "ABCD"
}, {
  "ID": "1001",
  "Title": "EFGH"
}]
I'd like the output to look like:
[{
  "ID": "1",
  "Events": [{
    "Tag": "A",
    "Citations": [{
      "Tag": "AA",
      "SourceID": "1000",
      "Title": "ABCD"
    }, {
      "Tag": "AB",
      "SourceID": "1001",
      "Title": "EFGH"
    }]
  }, {
    "Tag": "B",
    "Citations": [{
      "Tag": "BA",
      "SourceID": "1000",
      "Title": "ABCD"
    }, {
      "Tag": "BB",
      "SourceID": "1001",
      "Title": "EFGH"
    }]
  }]
}]
Answer 1
Score: 1
You can try the query below instead.
SELECT
  ID,
  ARRAY_AGG((SELECT AS STRUCT Event.* EXCEPT(Citations), Citations)) AS Events
FROM (
  SELECT
    r.ID,
    ANY_VALUE(e) AS Event,  -- e.Citations, whatever it holds, is dropped by EXCEPT(Citations) in the outer query
    ARRAY_AGG((SELECT AS STRUCT c.*, s.Title)) AS Citations
  FROM records r
  CROSS JOIN UNNEST(r.Events) e WITH OFFSET
  CROSS JOIN UNNEST(e.Citations) c
  LEFT JOIN sources s ON s.ID = c.SourceID
  GROUP BY r.ID, offset
) GROUP BY ID;
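One thing to keep in mind: ARRAY_AGG without an ORDER BY clause does not guarantee element order, so events and citations may come back in a different order than they are stored. If order matters, a possible variant is sketched below. It is untested and assumes the same records and sources temporary tables from the question; event_offset and citation_offset are alias names chosen here for illustration.

```sql
-- Sketch only: same join-then-regroup approach, with explicit ordering.
-- event_offset / citation_offset are illustrative aliases.
SELECT
  ID,
  ARRAY_AGG(
    (SELECT AS STRUCT Event.* EXCEPT(Citations), Citations)
    ORDER BY event_offset            -- restore the stored order of events
  ) AS Events
FROM (
  SELECT
    r.ID,
    event_offset,
    ANY_VALUE(e) AS Event,           -- every row in a group shares the same e
    ARRAY_AGG(
      (SELECT AS STRUCT c.*, s.Title)
      ORDER BY citation_offset       -- restore the stored order of citations
    ) AS Citations
  FROM records AS r
  CROSS JOIN UNNEST(r.Events) AS e WITH OFFSET AS event_offset
  CROSS JOIN UNNEST(e.Citations) AS c WITH OFFSET AS citation_offset
  LEFT JOIN sources AS s ON s.ID = c.SourceID
  GROUP BY r.ID, event_offset
)
GROUP BY ID;
```

Like the query above, this still uses CROSS JOIN UNNEST(e.Citations), so an event whose Citations array is empty produces no rows and is left out of the result.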