AWS Athena is creating an extra empty row per result
Question
I have 4 JSON files stored in an S3 bucket, which I'm using to populate a table in Athena. However, when I query everything (select * from [table_name];), an extra empty row is created for each of these JSON files. I was wondering why this happens, and whether I can prevent it (beyond appending where [column_name] is not null; to all my queries :)).
NB: the use of individual JSON files is deliberate (though I'm new to Athena, so it may be ill-advised), as I am trying to get past some issues with concurrency.
See the attached images for reference.
I was wondering whether it was because of hidden metadata or similar files in S3, but I can't find any reason why this would be the case.
Answer 1
Score: 1
The alternating blank line suggests that your source file is in CRLF format (causing two 'new lines') rather than LF format.
You can test this by loading the source file into a modern text editor (e.g., Visual Studio Code) and changing the line-end format (often via a control in the bottom-right).
Windows tends to use CRLF (carriage return followed by line feed), while Unix-like systems such as Linux and macOS use just LF.
For some fun history, see: Carriage return - Wikipedia
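If opening each file in an editor is inconvenient, the same check can be scripted. The following is a minimal sketch in Python, assuming the JSON files have been downloaded locally or read as bytes; the function names and the sample data are hypothetical, and whether normalizing the line endings actually removes the empty rows in Athena is not confirmed here.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains one LF and one CR, so subtract those out.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF (and any stray bare CR) line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical sample: two JSON records saved with Windows line endings.
sample = b'{"id": 1}\r\n{"id": 2}\r\n'

print(count_line_endings(sample))
print(normalize_to_lf(sample))
```

Working on bytes rather than text avoids Python's own newline translation, which would otherwise hide the difference between the two formats. If the count shows CRLF endings, the normalized bytes can be written back out (in binary mode) and re-uploaded to S3 before querying again.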