Should you use the primary storage of Azure Synapse as your data lake?
Question
Azure Synapse Analytics requires an ADLSGEN2 account to create a workspace. The documentation says the following about the purpose of this storage:
> Your Azure Synapse workspace will use this storage account as the "primary" storage account and the container to store workspace data. The workspace stores data in Apache Spark tables. It stores Spark application logs under a folder called /synapse/workspacename.
Should you use this storage account to build your data lake? Or should you use an additional ADLSGEN2 account so as not to interfere with the Synapse workspace data?
Answer 1
Score: 1
The Synapse metadata (and the Spark database) are housed in the file system (container) you specified at workspace creation time. Our convention is to name this "synapseroot". You should not use this container for any other purpose; let the system manage its contents. But you can absolutely create other containers in the same account to work with your own data, so you should not need an additional ADLS account.
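To illustrate the separation, here is a minimal sketch of a helper that builds `abfss://` paths for your own containers and refuses to write into the workspace-managed one. The account name `mydatalake`, the container `raw`, and the helper `abfss_uri` are hypothetical names made up for this example; only "synapseroot" comes from the convention mentioned above.

```python
# Hypothetical names for illustration only; adjust to your environment.
ACCOUNT = "mydatalake"             # assumed ADLS Gen2 storage account name
MANAGED_CONTAINER = "synapseroot"  # container reserved for the Synapse workspace


def abfss_uri(container: str, path: str = "", account: str = ACCOUNT) -> str:
    """Build an abfss:// URI for a path inside an ADLS Gen2 container."""
    if container == MANAGED_CONTAINER:
        # Leave the workspace container to the system.
        raise ValueError(
            f"'{container}' is managed by Synapse; use a separate container"
        )
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Data lake files live in their own container within the same account.
print(abfss_uri("raw", "sales/2024/orders.parquet"))
```

The resulting URI can then be passed to whatever reads or writes the lake (for example a Spark read in a Synapse notebook), while the "synapseroot" container stays untouched.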