Synapse Spark write to a different mount point [or] container


Question



I am trying to write a DataFrame from Synapse Spark in Delta format into a container.

Here is the example code:

sqlstmt = 'select * from tableblah limit 5'
df = spark.sql(sqlstmt)
df.write.format("delta").mode("overwrite").save("/bronze/account")

This writes into Synapse's primary ADLS Gen2 file system (for example, the container named 'synapsews'), so the files end up under /bronze/account in the synapsews container.

How can I mount a different container and write into it?

I tried mounting the 'devbronze' container as below:

mssparkutils.fs.mount(
    "abfss://devbronze@dlsdataxxxxdirdev001.dfs.core.windows.net",
    "/devbronze",
    { "linkedService": "LS_synapsews" }
)
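
For reference, one quick way to check that the mount was actually registered is to list the current mount points (a minimal sketch, assuming mssparkutils.fs.mounts() is available in the notebook session):

# Sketch: list the mounts registered in this Spark session; a /devbronze
# entry should appear here if the mount call above succeeded.
print(mssparkutils.fs.mounts())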

Now when I tried to write to this mount point as below:

df.write.format("delta").mode("overwrite").save("/devbronze/account")

it still writes to the primary ADLS Gen2 file system (synapsews), creating /devbronze/account folders there.

How do I change this so that the data is written to the 'devbronze' mount point?

Thanks

Answer 1

Score: 1

I tried the same approach in my environment and ended up with the same result. Here, synapsedata is the primary file system for my Synapse workspace.

(screenshot: files written to the primary file system)

To store files in a mount point other than the primary file system, use a path of the form synfs:/<jobid>/<mountpoint>/<path>.

The code below works for me and achieves your requirement.

# Get the current Spark job id and build the synfs path for the devbronze mount
jobid = mssparkutils.env.getJobId()
path = 'synfs:/' + jobid + '/devbronze/account'
print(path)

df1.write.format("delta").mode("overwrite").save(path)
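
To double-check that the write landed in the mounted container rather than the primary file system, the same synfs path can be read back (a small sketch, assuming the write above completed):

# Sketch: read the Delta table back from the mount path to verify the write.
check_df = spark.read.format("delta").load(path)
check_df.show(5)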

First get the job id from mssparkutils and build the path with your mount location. Here, devbronze is both my mount point and the container name, mounted using the linked service.

(screenshot: the devbronze mount point created via the linked service)

Code Execution:

(screenshot)

Result files in mount path:

(screenshot)
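
As an alternative to mounting, the target container can also be addressed directly through its abfss:// URI. The sketch below reuses the storage account and container names from the question and assumes the Spark pool's identity (or the linked service credential) has write access to that storage account:

# Hypothetical alternative: write straight to the container via its ABFS URI,
# assuming the workspace identity has sufficient access on the storage account.
direct_path = "abfss://devbronze@dlsdataxxxxdirdev001.dfs.core.windows.net/account"
df1.write.format("delta").mode("overwrite").save(direct_path)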

