Question

I'm querying a Spark (Hive) database table using Spark 3.2.1 in Python 3.7 with the code below.

These tables are fully accessible and can be manipulated from other tools such as DBeaver, PowerBI and SSRS. Even a similar script in R returns the data correctly. But when I run this Python script, every row returned over JDBC contains the column names instead of the data.

This is the code:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

url = 'jdbc:hive2://1.1.1.1:10000/default;transportMode=http;httpPath=cliservice'
table = 'schema.table_name'
username = 'username'
password = '123456'

remote_table = spark.read\
                    .format("jdbc")\
                    .option("driver", "org.apache.hive.jdbc.HiveDriver")\
                    .option("url", url)\
                    .option("dbtable", table)\
                    .option("user", username)\
                    .option("password", password)\
                    .load()\
                    .limit(2)

remote_table.show()
spark.stop()

I expected PySpark to return the data from my Hive tables.
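One plausible cause, not confirmed anywhere in this thread: Spark's generic JDBC reader wraps column names in double quotes when it builds the SELECT statement, and Hive reads a double-quoted token as a string literal, so each row comes back holding the column name itself. The sketch below only builds the two query strings to make the difference visible. The helper `build_select` and the column names `id` and `name` are hypothetical, and nothing here connects to Hive or Spark.

```python
def build_select(columns, table, quote_char):
    """Build a SELECT statement with every column wrapped in quote_char."""
    quoted = ",".join(f"{quote_char}{col}{quote_char}" for col in columns)
    return f"SELECT {quoted} FROM {table}"


columns = ["id", "name"]  # hypothetical column names

# Double quotes: the style assumed for Spark's default JDBC dialect.
default_sql = build_select(columns, "schema.table_name", '"')

# Backticks: the identifier quoting that Hive expects.
hive_sql = build_select(columns, "schema.table_name", "`")

print(default_sql)
print(hive_sql)
```

If this is indeed what happens, Hive would evaluate the first statement as a list of constant strings for every row, which matches the symptom described above.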

Answer 1

Score: 0

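This code resolved the problem. Instead of Spark's JDBC data source, it opens the connection with jaydebeapi and the Cloudera Hive JDBC driver: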

# -*- coding: utf-8 -*-
"""
Created on Fri Jun  2 07:32:51 2023

@author: yfdantas
"""

import jaydebeapi


def spark_connect():
    """Open a JDBC connection to the Hive endpoint via jaydebeapi."""
    jdbc_url = 'jdbc:hive2://1.1.1.1:10000/default;transportMode=http;httpPath=cliservice'
    jdbc_driver_class = "com.cloudera.hive.jdbc.HS2Driver"
    jdbc_user = 'username'
    jdbc_password = '123456'
    # Local path to the JDBC driver jar that provides HS2Driver.
    jdbc_jar = "E:/scripts/libs/HiveJDBC42.jar"

    conn = jaydebeapi.connect(
        jclassname=jdbc_driver_class,
        url=jdbc_url,
        driver_args=[jdbc_user, jdbc_password],
        jars=jdbc_jar
    )

    return conn


def oracle_close(conn):
    """Close a connection returned by spark_connect()."""
    conn.close()
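The answer stops at opening and closing the connection. As a minimal sketch of the missing read step, the helper below runs a query on any DB-API connection and returns the first rows as dictionaries. `fetch_as_dicts` and the demo table are hypothetical names, and an in-memory sqlite3 database stands in for Hive so the example runs without a server or a JVM.

```python
import sqlite3


def fetch_as_dicts(conn, query, max_rows=2):
    """Run query on a DB-API connection and return up to max_rows rows as dicts."""
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        # The first item of each description entry is the column name.
        columns = [desc[0] for desc in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchmany(max_rows)]
    finally:
        cursor.close()


# Stand-in connection (assumption: used only so the sketch is runnable here).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE table_name (id INTEGER, name TEXT)")
demo_conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

rows = fetch_as_dicts(demo_conn, "SELECT id, name FROM table_name ORDER BY id")
print(rows)
demo_conn.close()
```

Since jaydebeapi connections follow the same DB-API interface, the connection returned by spark_connect() should be usable in place of demo_conn, for example with a query such as SELECT * FROM schema.table_name LIMIT 2, followed by a call to oracle_close(conn).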
