Failed to connect to service endpoint when reading a file from S3 using Spark and Java

Question

I need to read a file from an S3 bucket into a Spark Dataset. I used the correct secretKey and accessKey, and I also tried the endpoint configuration, but I get this error:

[main] WARN com.amazonaws.internal.InstanceMetadataServiceResourceFetcher - Fail to retrieve token
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
 at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
 at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)

 ... 74 more

java.nio.file.AccessDeniedException: datalakedbr: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint:

 at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:187)
 at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
 at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:265)
 at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
 at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:261)
Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
 at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:159)

This is the code I used:

    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    SparkSession sparkSession = SparkSession.builder()
            .master("local").appName("readFile")
            .config("fs.s3a.awsAccessKeyId", "key")
            .config("fs.s3a.awsSecretAccessKey", "secretKey")
            .getOrCreate();
    JavaSparkContext sparkContext = new JavaSparkContext(sparkSession.sparkContext());
    String path = "s3a://bucket/path.json";
    Dataset<Row> file = sparkSession.sqlContext().read().load(path);

Can anyone help?

Answer 1

Score: 7

I believe the problem is with the names of the properties.

Check the Hadoop documentation here:
https://hadoop.apache.org/docs/r2.7.2/hadoop-aws/tools/hadoop-aws/index.html

It says that for S3A, the name of the property should be fs.s3a.access.key / fs.s3a.secret.key, and not fs.s3a.awsAccessKeyId / fs.s3a.awsSecretAccessKey.

Other options are fs.s3.awsAccessKeyId for S3, or fs.s3n.awsAccessKeyId for S3N.
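
Applied to the snippet from the question, a minimal sketch of the corrected setup might look like the following. The class name ReadFromS3 is a placeholder of mine, the bucket and key values are the question's placeholders, and the hadoop-aws S3A connector is assumed to be on the classpath:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class ReadFromS3 {
        public static void main(String[] args) {
            // Same setup as in the question, but with the property names
            // the S3A connector actually reads: fs.s3a.access.key and
            // fs.s3a.secret.key.
            SparkSession sparkSession = SparkSession.builder()
                    .master("local").appName("readFile")
                    .config("fs.s3a.access.key", "key")
                    .config("fs.s3a.secret.key", "secretKey")
                    .getOrCreate();

            // The path points to a JSON file, so read it as JSON explicitly;
            // a bare read().load(path) would try Spark's default format (Parquet).
            Dataset<Row> file = sparkSession.read().json("s3a://bucket/path.json");
            file.show();
        }
    }

If the builder configuration does not propagate in your setup, setting the same two keys directly on sparkSession.sparkContext().hadoopConfiguration() before the read is a common alternative.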
