July 7, 2023 03:10:52


Redshift Loading Data Points To Different Bucket

Question

This post was going to be about my issues loading some simple CSV data into a Redshift table, but halfway through writing it I realised that, for whatever reason, when selecting the bucket in the COPY command, Redshift was pointing to the wrong one!

Can someone explain why this is the case? For context, the bucket I selected was

s3://soccer-project/Player

but Redshift defaulted to

s3://soccer-project/Player_Attributes

which is another file in my bucket.

I'm new to Redshift. Can someone help me understand this?

Thanks

Answer 1

Score: 1

Only the top (leftmost) part of the S3 object path is the bucket. In both of your cases this is "soccer-project".

Now I expect that what you are showing is only part of the object name - "Player" and "Player_Attributes" in your question. These are not the full names of the objects. The full object names are these parts plus a slash and more text. Redshift is set up to take partial object names so that it can expand the set of files copied to include all object names that match the partial name. Correct me if I'm interpreting the question incorrectly.

To understand what is going on you need to understand that S3 is an object store and not a file system. This means that all files are stored "flat" under each bucket. Only two things identify an object: the bucket name and the object name. There is no real hierarchy in the storage. However, to make things a little more organized when we humans look at it, S3 organizes the objects by looking at slashes in the object name and makes things seem hierarchical. But in reality everything after the bucket name and the slash is the object name, including any slashes, "folder" names, or anything else you think has unique meaning. It is all the object name.
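To make the "flat" point concrete, here is a minimal Python sketch. The object keys are hypothetical (the question doesn't list the actual ones), and `top_level_folders` is just an illustrative helper, not an S3 API. It shows how a "folder" view can be derived purely from slashes in otherwise flat key strings.

```python
# Hypothetical object keys in the bucket "soccer-project" (assumed for illustration).
# Each key is a single flat string; the slashes are just characters in the name.
keys = [
    "Player/part-0000.csv",
    "Player/part-0001.csv",
    "Player_Attributes/part-0000.csv",
    "Team/part-0000.csv",
]


def top_level_folders(keys, delimiter="/"):
    """Derive the apparent top-level 'folders' from flat keys."""
    folders = set()
    for key in keys:
        head, sep, _rest = key.partition(delimiter)
        if sep:  # the key contains the delimiter, so it looks like it sits in a folder
            folders.add(head + delimiter)
    return sorted(folders)


print(top_level_folders(keys))
```

Nothing in the key list says "this is a directory"; the grouping only exists because of where the slashes happen to be.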

Now to your situation: You likely have object names in your bucket that start with "Player" or "Player_Attributes", with the next character in the name being a slash. This is all just the first part of the object name. I'd also guess that your COPY command has a FROM clause like "s3://soccer-project/Player*". (Providing your COPY command in the question would really help clear up what is going on.) The '*' is a wildcard that matches all following characters in the object name, so it will also match "Player_Attributes". If all of this is correct then you can fix this by changing the FROM clause to "s3://soccer-project/Player/" (slash added).
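Here is a rough Python sketch of why the trailing slash matters. It only simulates prefix matching on hypothetical key names; it does not call Redshift or S3. As far as I know, COPY treats the FROM path as a key prefix and loads every object whose key starts with it, so the same effect shows up even without an explicit wildcard.

```python
# Hypothetical object keys (assumed; the real keys aren't shown in the question).
keys = [
    "Player/part-0000.csv",
    "Player/part-0001.csv",
    "Player_Attributes/part-0000.csv",
]


def keys_matching_prefix(keys, prefix):
    """Return every key that starts with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


# "Player" is also a prefix of "Player_Attributes/...", so both sets match.
print(keys_matching_prefix(keys, "Player"))

# With the trailing slash, only keys under the "Player/" pseudo-folder match.
print(keys_matching_prefix(keys, "Player/"))
```

The first call returns all three keys, while the second returns only the two under "Player/", which is the behaviour you want from the corrected FROM clause.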

As I said above, this is a best guess based on the partial info provided. Please update the question if this is incorrect.
