重复使用 BigQuery 查询作业作为基础查询,以供进一步操作使用。

huangapple go评论65阅读模式
英文:

Reuse bigquery queryJob as base query to use for further operation

问题

The translated content is as follows:

正如标题所说,我不知道是否可以重复使用通过执行查询获得的queryjob来执行其他SQL操作。以下是我所指的示例:

from google.cloud import bigquery

client = bigquery.Client()
myquery = "SELECT * FROM mytable"
qjob = client.query(myquery)

qjob包含了查询的结果。我想要使用这个结果来执行额外的筛选,在伪代码中 ...

...
mynewquery = "SELECT c1 FROM <qjob> WHERE C1=1"
qjob2 = client.query(mynewquery)

希望这样清楚了。这与以下构造有些类似

WITH <query name> AS (SELECT ...)
SELECT a FROM <query name> WHERE...
英文:

As the title says, I don't know if it is possible to reuse the queryjob obtained by the execution of a query to perform additional SQL operations. Below an example of what I mean

from google.cloud import bigquery

client = bigquery.Client()
myquery = &quot;SELECT * FROM mytable&quot;
qjob = client.query(myquery)

qjob contains the query results. I want to use this result to perform additional filtering, in pseudo code ...

...
mynewquery = &quot;SELECT c1 FROM &lt;qjob&gt; WHERE C1=1&quot;
qjob2 = client.query(mynewquery)

Hope it is clear. This is somehow similar to the construct

WITH &lt;query name&gt; AS (SELECT ...)
SELECT a FROM &lt;query name&gt; WHERE...

答案1

得分: 1

我相信使用pandas是处理你的查询的正确方法。你可以将一个queryjob对象转换成数据帧,一旦存储到数据帧中,你可以对其进行操作。

参考代码示例:

from google.cloud import bigquery
client = bigquery.Client()

sql = """
    SELECT name, SUM(number) as count
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    ORDER BY count DESC
    LIMIT 10
"""

df = client.query(sql).to_dataframe()

参考链接: https://cloud.google.com/bigquery/docs/samples/bigquery-query-results-dataframe

Pandas: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

英文:

I believe pandas is the way to go to your inquiry. You can convert a queryjob object into dataframes, once stored into a dataframe you can manipulate it.

See Code Sample:

from google.cloud import bigquery
client = bigquery.Client()

sql = &quot;&quot;&quot;
    SELECT name, SUM(number) as count
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    ORDER BY count DESC
    LIMIT 10
&quot;&quot;&quot;

df = client.query(sql).to_dataframe()

Reference: https://cloud.google.com/bigquery/docs/samples/bigquery-query-results-dataframe

Pandas: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

huangapple
  • 本文由 发表于 2023年5月24日 18:12:38
  • 转载请务必保留本文链接:https://go.coder-hub.com/76322400.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定