Bigquery SQL失败,带有引用其他表的相关子查询,除非可以将其解开。

huangapple go评论59阅读模式
英文:

Bigquery SQL fails with Correlated subqueries that reference other tables are not supported unless they can be de-correlated

问题

以下是您要翻译的内容:

BigQuery fails on my query with the following error, and I'm not sure how to mitigate this.

> Query error: Correlated subqueries that reference other tables are not
> supported unless they can be de-correlated, such as by transforming
> them into an efficient JOIN. at [31:1]

#### Standalone SQL Query (and temporary tables) to reproduce:

create temporary table records (
  ID int64,
  Events array<struct<
    Tag string,
    Info string,
    Citations array<struct<
      Tag string,
      SourceID int64>>>>);

insert into records values
(1, [
    ('A', 'C', [('AA', 1000), ('AB', 1001)]),
    ('A', 'C', [('AA', 1000), ('AB', 1001)]),
    ('B', 'D', [('BA', 1000), ('BB', 1001)])]);

create temporary table sources (
  ID int64,
  Title string);

insert into sources values
(1000, "ABCD"),
(1001, "EFGH");

select 
  Record.ID,
  array(
    select as struct
      Event.Tag,
      Event.Info,
      array(
        select as struct
          Citation.SourceID,
          Citation.Tag,
          Source.Title
        from unnest(Event.Citations) as Citation
        left join sources as Source on Citation.SourceID = Source.ID
      ) as Citations
    from unnest(Record.Events) as Event) as Events
from records as Record;

The events table looks like (in json):

[{
"ID": "1",
"Events": [{
"Tag": "A",
"Info": "C",
"Citations": [{
"Tag": "AA",
"SourceID": "1000"
}, {
"Tag": "AB",
"SourceID": "1001"
}]
}, {
"Tag": "A",
"Info": "C",
"Citations": [{
"Tag": "AA",
"SourceID": "1000"
}, {
"Tag": "AB",
"SourceID": "1001"
}]
}, {
"Tag": "B",
"Info": "D",
"Citations": [{
"Tag": "BA",
"SourceID": "1000"
}, {
"Tag": "BB",
"SourceID": "1001"
}]
}]
}]

The sources table looks like:
[{
"ID": "1000",
"Title": "ABCD"
}, {
"ID": "1001",
"Title": "EFGH"
}]


I'd like the output to look like:

[{
  "ID": "1",
  "Events": [{
    "Tag": "A",
    "Citations": [{
      "Tag": "AA",
      "SourceID": "1000",
      "Title": "ABCD"
    }, {
      "Tag": "AB",
      "SourceID": "1001",
      "Title": "EFGH"
    }]
  }, {
    "Tag": "B",
    "Citations": [{
      "Tag": "BA",
      "SourceID": "1000",
      "Title": "ABCD"
    }, {
      "Tag": "BB",
      "SourceID": "1001",
      "Title": "EFGH"
    }]
}]
英文:

BigQuery fails on my query with the following error, and I'm not sure how to mitigate this.

> Query error: Correlated subqueries that reference other tables are not
> supported unless they can be de-correlated, such as by transforming
> them into an efficient JOIN. at [31:1]

Standalone SQL Query (and temporary tables) to reproduce:

create temporary table records (
  ID int64,
  Events array&lt;struct&lt;
    Tag string,
    Info string,
    Citations array&lt;struct&lt;
      Tag string,
      SourceID int64&gt;&gt;&gt;&gt;);

insert into records values
(1, [
    (&#39;A&#39;, &#39;C&#39;, [(&#39;AA&#39;, 1000), (&#39;AB&#39;, 1001)]),
    (&#39;A&#39;, &#39;C&#39;, [(&#39;AA&#39;, 1000), (&#39;AB&#39;, 1001)]),
    (&#39;B&#39;, &#39;D&#39;, [(&#39;BA&#39;, 1000), (&#39;BB&#39;, 1001)])]);

create temporary table sources (
  ID int64,
  Title string);

insert into sources values
(1000, &quot;ABCD&quot;),
(1001, &quot;EFGH&quot;);

select 
  Record.ID,
  array(
    select as struct
      Event.Tag,
      Event.Info,
      array(
        select as struct
          Citation.SourceID,
          Citation.Tag,
          Source.Title
        from unnest(Event.Citations) as Citation
        left join sources as Source on Citation.SourceID = Source.ID
      ) as Citations
    from unnest(Record.Events) as Event) as Events
from records as Record;

The events table looks like (in json):

[{
  &quot;ID&quot;: &quot;1&quot;,
  &quot;Events&quot;: [{
    &quot;Tag&quot;: &quot;A&quot;,
    &quot;Info&quot;: &quot;C&quot;, 
    &quot;Citations&quot;: [{
      &quot;Tag&quot;: &quot;AA&quot;,
      &quot;SourceID&quot;: &quot;1000&quot;
    }, {
      &quot;Tag&quot;: &quot;AB&quot;,
      &quot;SourceID&quot;: &quot;1001&quot;
    }]
  }, {
    &quot;Tag&quot;: &quot;A&quot;,
    &quot;Info&quot;: &quot;C&quot;, 
    &quot;Citations&quot;: [{
      &quot;Tag&quot;: &quot;AA&quot;,
      &quot;SourceID&quot;: &quot;1000&quot;
    }, {
      &quot;Tag&quot;: &quot;AB&quot;,
      &quot;SourceID&quot;: &quot;1001&quot;
    }]
  }, {
    &quot;Tag&quot;: &quot;B&quot;,
    &quot;Info&quot;: &quot;D&quot;,
    &quot;Citations&quot;: [{
      &quot;Tag&quot;: &quot;BA&quot;,
      &quot;SourceID&quot;: &quot;1000&quot;
    }, {
      &quot;Tag&quot;: &quot;BB&quot;,
      &quot;SourceID&quot;: &quot;1001&quot;
    }]
  }]
}]

The sources table looks like:

[{
  &quot;ID&quot;: &quot;1000&quot;,
  &quot;Title&quot;: &quot;ABCD&quot;
}, {
  &quot;ID&quot;: &quot;1001&quot;,
  &quot;Title&quot;: &quot;EFGH&quot;
}]

I'd like the output to look like:

[{
  &quot;ID&quot;: &quot;1&quot;,
  &quot;Events&quot;: [{
    &quot;Tag&quot;: &quot;A&quot;,
    &quot;Citations&quot;: [{
      &quot;Tag&quot;: &quot;AA&quot;,
      &quot;SourceID&quot;: &quot;1000&quot;,
      &quot;Title&quot;: &quot;ABCD&quot;
    }, {
      &quot;Tag&quot;: &quot;AB&quot;,
      &quot;SourceID&quot;: &quot;1001&quot;,
      &quot;Title&quot;: &quot;EFGH&quot;
    }]
  }, {
    &quot;Tag&quot;: &quot;B&quot;,
    &quot;Citations&quot;: [{
      &quot;Tag&quot;: &quot;BA&quot;,
      &quot;SourceID&quot;: &quot;1000&quot;,
      &quot;Title&quot;: &quot;ABCD&quot;
    }, {
      &quot;Tag&quot;: &quot;BB&quot;,
      &quot;SourceID&quot;: &quot;1001&quot;,
      &quot;Title&quot;: &quot;EFGH&quot;
    }]
  }]
}]

答案1

得分: 1

你可以尝试以下的代码:

SELECT
  ID,
  ARRAY_AGG((SELECT AS STRUCT Event.* EXCEPT(Citations), Citations)) AS Events
FROM (
  SELECT
    r.ID,
    ANY_VALUE(e) Event, --&gt; will discard *e.Citations* in outer query whatever it is
    ARRAY_AGG((SELECT AS STRUCT c.*, s.Title)) Citations
  FROM records r
  CROSS JOIN UNNEST(r.events) e WITH offset
  CROSS JOIN UNNEST(e.citations) c
  LEFT JOIN sources s ON s.ID = c.SourceID
  GROUP BY r.ID, offset
) GROUP BY ID;

查询结果

Bigquery SQL失败,带有引用其他表的相关子查询,除非可以将其解开。

英文:

You can try below instead.

SELECT
  ID,
  ARRAY_AGG((SELECT AS STRUCT Event.* EXCEPT(Citations), Citations)) AS Events
FROM (
  SELECT
    r.ID,
    ANY_VALUE(e) Event, --&gt; will discard *e.Citations* in outer query whatever it is
    ARRAY_AGG((SELECT AS STRUCT c.*, s.Title)) Citations
  FROM records r
  CROSS JOIN UNNEST(r.events) e WITH offset
  CROSS JOIN UNNEST(e.citations) c
  LEFT JOIN sources s ON s.ID = c.SourceID
  GROUP BY r.ID, offset
) GROUP BY ID;

Query results

Bigquery SQL失败,带有引用其他表的相关子查询,除非可以将其解开。

huangapple
  • 本文由 发表于 2023年3月1日 14:04:35
  • 转载请务必保留本文链接:https://go.coder-hub.com/75600081.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定