将BigQuery的输出从Python保存为JSON。

huangapple go评论78阅读模式
英文:

Save the output of bigquery in JSON from python

问题

以下是修改后的代码,以将结果保存为JSON格式:

from google.cloud import bigquery
import json

def query_stackoverflow(project_id="gwas-386212"):
    client = bigquery.Client()
    query = """
    WITH SNP_info AS (
      SELECT
        CONCAT(CAST(rs_id AS string)) AS identifier
      FROM
        `gwas-386212.gwas_dataset_1.SNPs_intergenic_vep_pha005199`
    )
    SELECT
      SNP_info.identifier AS identifier,
      variants.identifier AS identifier_1,
      variants.chr_id AS chr_id,
      variants.position AS position,
      variants.ref AS ref,
      variants.alt AS alt,
      variants.most_severe_consequence AS most_severe_consequence,
      variants.gene_id_any_distance AS gene_id_any_distance,
      variants.gene_id_any AS gene_id_any,
      variants.gene_id_prot_coding_distance AS gene_id_prot_coding_distance,
      variants.gene_id_prot_coding AS gene_id_prot_coding
    FROM SNP_info
    JOIN (
      SELECT
        CONCAT(CAST(rs_id AS string)) AS identifier,
        chr_id AS chr_id,
        position AS position,
        ref_allele AS ref,
        alt_allele AS alt,
        most_severe_consequence AS most_severe_consequence,
        gene_id_any_distance AS gene_id_any_distance,
        gene_id_any AS gene_id_any,
        gene_id_prot_coding_distance AS gene_id_prot_coding_distance,
        gene_id_prot_coding AS gene_id_prot_coding
      FROM
        `bigquery-public-data.open_targets_genetics.variants`
    ) variants
    ON SNP_info.identifier = variants.identifier
    """
    
    results = client.query(query)

    # Convert the results to a list of dictionaries
    result_list = [dict(row) for row in results]

    # Save the results as JSON
    with open('output.json', 'w') as json_file:
        json.dump(result_list, json_file)

# Call the function to execute the query and save the results as JSON
query_stackoverflow()

这段代码会执行查询并将结果保存为名为"output.json"的JSON文件。

英文:

How can I modify this script to be able to see/print some of the results and write the output in JSON :

from google.cloud import bigquery


def query_stackoverflow(project_id="gwas-386212"):
    client = bigquery.Client()
query_job = client.query(
        """
        WITH
  SNP_info AS (
  SELECT
    CONCAT(CAST(rs_id AS string)) AS identifier
  FROM
  `gwas-386212.gwas_dataset_1.SNPs_intergenic_vep_pha005199`)
SELECT
  *
FROM
  SNP_info
JOIN (
  SELECT
    CONCAT(CAST(rs_id AS string)) AS identifier,
chr_id AS chr_id,
position AS position, 
ref_allele AS ref,
alt_allele AS alt,
most_severe_consequence AS most_severe_consequence, 
gene_id_any_distance AS gene_id_any_distance,
gene_id_any AS gene_id_any,
gene_id_prot_coding_distance AS gene_id_prot_coding_distance, 
gene_id_prot_coding AS gene_id_prot_coding
 FROM
    `bigquery-public-data.open_targets_genetics.variants`) variants
ON
  SNP_info.identifier = variants.identifier""" 
  )
results = client.query(query)

for row in results:
    title = row['identifier']
    identifier = row['identifier']
    #print(f'{identifier}')

This is just printing a column the intentifier. i want to save the resulted table in JSON format. The JSOn from the google cloud platform should look something like this:

[{
  "identifier": "rs62063022",
  "identifier_1": "rs62063022",
  "chr_id": "17",
  "position": "51134537",
  "ref": "T",
  "alt": "G",
  "most_severe_consequence": "intergenic_variant",
  "gene_id_any_distance": "13669",
  "gene_id_any": "ENSG00000008294",
  "gene_id_prot_coding_distance": "13669",
  "gene_id_prot_coding": "ENSG00000008294"
}, {
  "identifier": "rs12944420",
  "identifier_1": "rs12944420",
  "chr_id": "17",
  "position": "42640692",
  "ref": "T",
  "alt": "C",
  "most_severe_consequence": "intergenic_variant",
  "gene_id_any_distance": "18592",
  "gene_id_any": "ENSG00000037042",
  "gene_id_prot_coding_distance": "18592",
  "gene_id_prot_coding": "ENSG00000037042"
},

答案1

得分: 0

查看 json 文档以获取更多信息。

records = [dict(row) for row in results]
out_file = open("bigquery_response.json", "w")
json.dump(records, out_file, indent=6)
out_file.close()
英文:

Check out json documentation for further information.

records = [dict(row) for row in results]
out_file = open("bigquery_response.json", "w")
json.dump(records , out_file, indent = 6)
out_file.close()

huangapple
  • 本文由 发表于 2023年6月1日 21:44:35
  • 转载请务必保留本文链接:https://go.coder-hub.com/76382547.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定