How to connect pyspark (in local mode) to bigquery?
Question
I am running PySpark in local mode and need to connect to BigQuery. I found this tutorial: https://cloud.google.com/dataproc/docs/tutorials/bigquery-connector-spark-example but it focuses on Dataproc, and my Spark is set up on a local machine.
Could someone please explain at a high level, point by point, exactly what I need to set up in order to connect and query the data into DataFrames?
Thank you
Answer 1
Score: 1
Posting this as a community wiki.
As per this SO post, you can connect PySpark to BigQuery without using Dataproc by running:
spark.read.format("bigquery").option("credentialsFile", "</path/to/key/file>").option("table", "<table>").load()
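The one-liner above only works if the spark-bigquery connector is on Spark's classpath, since that connector is what provides the "bigquery" format. Here is a minimal sketch of a local-mode setup, under a few assumptions that are not from the original answer: the Maven coordinates and version in CONNECTOR_PACKAGE (pick the artifact matching your Spark/Scala build), the helper names bigquery_read_options and read_bigquery_table, and the use of a service-account key file with read access to the table.

```python
from typing import Dict

# ASSUMPTION: Maven coordinates of the spark-bigquery connector. The Scala
# suffix (_2.12) and the version must match your local Spark build; check the
# connector's release notes before relying on this exact string.
CONNECTOR_PACKAGE = (
    "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.36.1"
)


def bigquery_read_options(key_file: str, table: str) -> Dict[str, str]:
    """Build the reader options used in the answer (hypothetical helper)."""
    return {
        "credentialsFile": key_file,  # path to a service-account JSON key
        "table": table,               # e.g. "project.dataset.table"
    }


def read_bigquery_table(key_file: str, table: str,
                        connector_package: str = CONNECTOR_PACKAGE):
    """Start a local SparkSession and load a BigQuery table as a DataFrame.

    Hypothetical helper: pyspark is imported lazily so that this module can be
    imported even where Spark is not installed.
    """
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")                # local mode, all available cores
        .appName("bigquery-local")
        # Ask Spark to download the connector and add it to the classpath.
        .config("spark.jars.packages", connector_package)
        .getOrCreate()
    )
    options = bigquery_read_options(key_file, table)
    return spark.read.format("bigquery").options(**options).load()
```

With a valid key file, calling read_bigquery_table("/path/to/key/file", "<table>") should return a DataFrame that you can inspect with show() or printSchema(). Note that spark.jars.packages only takes effect if it is set before the first SparkSession is created in the process.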