Can't connect to Fargate task with ECS Execute Command even though all permissions are set
Question
I'm having trouble connecting to a Fargate container with the ECS Execute Command feature. The call fails with the following error:
> An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.
I've confirmed that I have the right permissions and setup by running ecs-checker.sh, and I'm connecting with the following command:
aws ecs execute-command --cluster {cluster-name} --task {task_id} --container {container name} --interactive --command "/bin/bash"
This error usually appears when the necessary permissions are missing, but as noted above, I've already checked with ecs-checker.sh. Here is its output:
-------------------------------------------------------------
Prerequisites for the AWS CLI to use ECS Exec
-------------------------------------------------------------
AWS CLI Version | OK (aws-cli/2.13.4 Python/3.11.4 Darwin/22.4.0 source/arm64 prompt/off)
Session Manager Plugin | OK (1.2.463.0)
-------------------------------------------------------------
Checks on ECS task and other resources
-------------------------------------------------------------
Region : eu-west-2
Cluster: cluster
Task : 47e51750712a4e1c832dd996c878f38a
-------------------------------------------------------------
Cluster Configuration | Audit Logging Not Configured
Can I ExecuteCommand? | arn:aws:iam::290319421751:role/aws-reserved/sso.amazonaws.com/eu-west-2/AWSReservedSSO_PowerUserAccess_01a9cfdb5ba4af7f
ecs:ExecuteCommand: allowed
ssm:StartSession denied?: allowed
Task Status | RUNNING
Launch Type | Fargate
Platform Version | 1.4.0
Exec Enabled for Task | OK
Container-Level Checks |
----------
Managed Agent Status
----------
1. RUNNING for "WebApp"
----------
Init Process Enabled (WebAppTaskDefinition:49)
----------
1. Enabled - "WebApp"
----------
Read-Only Root Filesystem (WebAppTaskDefinition:49)
----------
1. Disabled - "WebApp"
Task Role Permissions | arn:aws:iam::290319421751:role/task-role
ssmmessages:CreateControlChannel: allowed
ssmmessages:CreateDataChannel: allowed
ssmmessages:OpenControlChannel: allowed
ssmmessages:OpenDataChannel: allowed
VPC Endpoints |
Found existing endpoints for vpc-11122233444:
- com.amazonaws.eu-west-2.monitoring
- com.amazonaws.eu-west-2.ssmmessages
Environment Variables | (WebAppTaskDefinition:49)
1. container "WebApp"
- AWS_ACCESS_KEY: not defined
- AWS_ACCESS_KEY_ID: not defined
- AWS_SECRET_ACCESS_KEY: not defined
What is weird about this situation is that the service is deployed to 4 environments and Execute Command works in all of them except one. The resources are identical in every environment, since the clusters are created from the same CloudFormation template, and the same image is deployed in all 4.
Any ideas on what could cause this?
Answer 1
Score: 0
It seems a VPC endpoint for SSM had been set up in that environment. It was not needed in our case, since the tasks already had public network access.
Weirdly enough, when we removed the VPC endpoint, the problem went away. The endpoint's security groups may not have been set up correctly. If you are in a similar situation, I encourage you to check whether you have misconfigured VPC endpoints for SSM, and to remove or fix them depending on your use case.
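If you want to inspect the endpoints before removing anything, something like the following may help. It is only a rough sketch: the function names are made up for illustration, the VPC ID is the placeholder from the checker output, and the security group ID is hypothetical. It lists the ssmmessages interface endpoints in a VPC (state, private DNS flag, attached security groups) and then prints the inbound rules of one of those groups. My assumption, not something we verified, is that the endpoint's security group needs to allow inbound HTTPS (port 443) from the tasks.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, VPC ID and security group ID are illustrative.

# List ssmmessages interface endpoints in a VPC, with their state,
# private DNS setting and attached security group IDs.
list_ssm_endpoints() {
  local vpc_id="$1" region="$2"
  aws ec2 describe-vpc-endpoints \
    --region "$region" \
    --filters "Name=vpc-id,Values=${vpc_id}" \
              "Name=service-name,Values=com.amazonaws.${region}.ssmmessages" \
    --query 'VpcEndpoints[].{Id:VpcEndpointId,State:State,PrivateDns:PrivateDnsEnabled,SecurityGroups:Groups[].GroupId}' \
    --output json
}

# Print the inbound rules of one security group attached to an endpoint.
show_inbound_rules() {
  local group_id="$1" region="$2"
  aws ec2 describe-security-groups \
    --region "$region" \
    --group-ids "$group_id" \
    --query 'SecurityGroups[].IpPermissions[].{Proto:IpProtocol,From:FromPort,To:ToPort,Cidrs:IpRanges[].CidrIp,Groups:UserIdGroupPairs[].GroupId}' \
    --output json
}

# Example calls (uncomment and substitute your own IDs):
# list_ssm_endpoints vpc-11122233444 eu-west-2
# show_inbound_rules sg-0123456789abcdef0 eu-west-2   # hypothetical group ID
```

If no rule covers port 443 from the task's security group or subnet range, that would be the first thing I'd fix or rule out.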