Escape characters in .csv for Azure Synapse seem to disappear?

Question


I have a .csv file that looks like this:

"ID", "Name", "Extra Info"
"1", "John", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}"
"2", "Adam", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}"

I'm trying to load this file using this code in Synapse:

SELECT
    TOP 2 *
FROM
    OPENROWSET(
        BULK 'https://[MY STORAGE ACCOUNT].dfs.core.windows.net/[FILE PATH]/[...]/*.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0'
    ) 
AS [result]

Expecting this result:

| ID | Name | Extra Info |
| -------- | -------------- | -------------- |
| 1 | John | {"Event": "Click", "Button Name": "Accept"} |
| 2 | Adam | {"Event": "Click", "Button Name": "Accept"} |

But I keep getting this error:

Error handling external file: 'Unexpected token 'Event\' at [byte: XXX]. 
Expecting tokens ',', ' ', or '"'. '. 
File/External table name: 'https://[MY STORAGE ACCOUNT].dfs.core.windows.net/[FILE PATH]/[...]/[SPECIFIC FILE NAME].csv'.

It looks like it's ignoring the first quote (") and the escape character in the Extra Info column, which leads it to think that Event\ is some special token?

I don't understand why this happens or what I can do to fix it.

Answer 1

Score: 0

I think I found the answer based on this post and some of the Azure documentation:

  • How FIELDQUOTE works: https://stackoverflow.com/questions/73404603/is-my-understanding-of-how-fieldquote-works-correct
  • Escaping quoting characters (Azure docs): https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/query-single-csv-file#escape-quoting-characters

It seems that the only valid way to escape a quote character inside a quoted field is to double it ("").
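To illustrate the doubled-quote convention outside Synapse, here is a minimal sketch using Python's standard csv module. It is only a stand-in for a parser that follows the same rule, not the Synapse parser itself, and the sample line is an assumption modelled on the file in the question.

```python
import csv
import io
import json

# One data row using doubled quotes ("") inside the quoted JSON field.
line = '"1", "John", "{""Event"": ""Click"", ""Button Name"": ""Accept""}"\n'

# skipinitialspace=True tolerates the space after each comma, as in the sample file.
row = next(csv.reader(io.StringIO(line), skipinitialspace=True))

# The third column now holds plain JSON with ordinary double quotes.
extra = json.loads(row[2])
print(extra["Button Name"])
```

Each `""` pair collapses to a single `"` when the field is read, so the third column comes out as valid JSON.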

This means my .csv should be formatted like this:

"ID", "Name", "Extra Info"
"1", "John", "{""Event"": ""Click"", ""Button Name"": ""Accept""}"
"2", "Adam", "{""Event"": ""Click"", ""Button Name"": ""Accept""}"

Instead of the original (which uses \"):

"ID", "Name", "Extra Info"
"1", "John", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}"
"2", "Adam", "{\"Event\": \"Click\", \"Button Name\": \"Accept\"}"

Unfortunately, I don't see a way around this other than bulk-editing all my .csv files...
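If bulk editing is the route taken, the rewrite could be scripted rather than done by hand. The following is a rough sketch in Python, assuming the files are small enough to process as text and that a backslash is only ever used as an escape character. The function name `backslash_to_doubled_quotes` is hypothetical, and the output drops the space after each comma and quotes every field.

```python
import csv
import io


def backslash_to_doubled_quotes(text: str) -> str:
    """Rewrite CSV text that escapes quotes as \\" so that it uses "" instead."""
    # Read the original dialect: backslash is the escape character,
    # and a space may follow each comma.
    reader = csv.reader(
        io.StringIO(text, newline=""),
        escapechar="\\",
        doublequote=False,
        skipinitialspace=True,
    )
    out = io.StringIO(newline="")
    # Write every field quoted; embedded quotes are doubled by default.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


sample = (
    '"ID", "Name", "Extra Info"\n'
    '"1", "John", "{\\"Event\\": \\"Click\\"}"\n'
)
converted = backslash_to_doubled_quotes(sample)
print(converted)
```

To apply it to real files, read each file's contents, pass them through the function, and write the result to a new location, so the originals stay available until the Synapse query has been checked.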

huangapple
  • Posted on February 9, 2023 at 00:55:51
  • Link to this article: https://go.coder-hub.com/75389144.html