How to Connect Databricks to SFTP Server with PySpark
Question
Is it possible to connect to an SFTP server from Databricks? I have looked at previous questions and answers, and according to an SO question here, it would seem that it isn't possible to connect using Spark (at least it wasn't possible over a year ago, according to @AlexOtt).
Is this still the case?
Answer 1
Score: 1
First, install the paramiko package on your Databricks cluster (for example, with %pip install paramiko in a notebook cell), then follow the steps below.
Run the following code to connect to the SFTP server.
import paramiko

# Public Rebex test server; replace with your own host and credentials.
host = "test.rebex.net"
port = 22
username = "demo"
password = "password"

client = paramiko.SSHClient()
client.load_system_host_keys()
# AutoAddPolicy trusts unknown host keys automatically; convenient for a demo.
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host, port=port, username=username, password=password)
sftp = client.open_sftp()
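Hardcoding the password is fine for the public Rebex test server, but for a real server you would probably want to keep credentials out of the notebook. Here is a minimal sketch, assuming the credentials are stored in a Databricks secret scope. The function name load_sftp_credentials, the scope name "sftp", and the key names "host", "username", and "password" are all hypothetical; `secrets` is expected to be dbutils.secrets when run in a notebook.

```python
def load_sftp_credentials(secrets, scope="sftp"):
    """Read SFTP connection settings from a secret scope.

    `secrets` is any object with a get(scope=..., key=...) method;
    in a Databricks notebook, pass dbutils.secrets.
    The scope and key names below are hypothetical placeholders.
    """
    return {
        "hostname": secrets.get(scope=scope, key="host"),
        "port": 22,
        "username": secrets.get(scope=scope, key="username"),
        "password": secrets.get(scope=scope, key="password"),
    }


# Usage in a notebook (assumes the scope and keys already exist):
# creds = load_sftp_credentials(dbutils.secrets)
# client.connect(**creds)
```

The dictionary keys match the keyword arguments of paramiko's SSHClient.connect, so the result can be unpacked straight into the connect call.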
Then use the get function to download the files you want by specifying the remote and local paths, as below.
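# paramiko writes through the local file system, so use the /dbfs mount.
local_path = "/dbfs/FileStore/tables/rd.txt"
remote_path = "/pub/example/readme.txt"
sftp.get(remote_path, local_path)
# Spark reads the same file through DBFS, so the /dbfs prefix is dropped here.
spark.read.text("/FileStore/tables/rd.txt").show()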
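Make sure you specify the local path as above, with the /dbfs/ prefix. paramiko writes through the local file API, so don't use the Spark-style URI form, like this:
dbfs:/FileStore/tables/rd.txt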
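The two forms name the same file: Spark addresses it as dbfs:/..., while ordinary Python file APIs (which paramiko uses for the local side of get) see it under the /dbfs mount. A minimal sketch of a helper that converts one form to the other follows. The name to_local_path is made up for illustration, and it assumes the /dbfs mount is available on your cluster.

```python
def to_local_path(path):
    """Convert a Spark-style DBFS path to its /dbfs local-file equivalent.

    Hypothetical helper: "dbfs:/FileStore/x" -> "/dbfs/FileStore/x".
    Paths already under /dbfs/ are returned unchanged.
    """
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    if path.startswith("/dbfs/"):
        return path
    # Bare paths such as "/FileStore/x" are treated as DBFS paths,
    # the way the spark.read call in the answer uses them.
    return "/dbfs/" + path.lstrip("/")


print(to_local_path("dbfs:/FileStore/tables/rd.txt"))
# /dbfs/FileStore/tables/rd.txt
```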
Then close the connection.
sftp.close()
client.close()
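If the download raises an exception, the two close() calls are never reached. Here is a minimal sketch that wraps the same steps in a context manager so the connection is closed either way. The names sftp_session and _default_client are made up, and the client_factory parameter exists only so the helper can be exercised without a real server; by default it builds a paramiko client configured like the one in the answer.

```python
from contextlib import contextmanager


def _default_client():
    """Build a paramiko client configured like the one in the answer."""
    import paramiko  # imported lazily so the helper can be tested without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    return client


@contextmanager
def sftp_session(host, port, username, password, client_factory=_default_client):
    """Yield an open SFTP client and always close it afterwards."""
    client = client_factory()
    try:
        client.connect(host, port=port, username=username, password=password)
        sftp = client.open_sftp()
        try:
            yield sftp
        finally:
            sftp.close()
    finally:
        client.close()


# Usage with the same test server and paths as in the answer:
# with sftp_session("test.rebex.net", 22, "demo", "password") as sftp:
#     sftp.get("/pub/example/readme.txt", "/dbfs/FileStore/tables/rd.txt")
```

The nested try/finally closes the SFTP channel first and then the SSH client, matching the order of the explicit close() calls.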