EKS nodes fail when launched through a launch template (Terraform)
Question
When I launch the node normally, everything works fine, but when I launch it through a launch template, I get connection issues within the cluster.
More specifically, the aws-node pod fails with this error:
{"level":"info","caller":"/usr/local/go/src/runtime/proc.go:225","msg":"timeout: failed to connect service \":50051\" within 5s"}
Digging through other posts here, many people point to IAM role issues, but my IAM role is fine: I have used the same role to launch many other nodes, and they all launched successfully.
Here are my Terraform files:
resource "aws_eks_node_group" "eth-staking-nodes" {
  cluster_name    = aws_eks_cluster.staking.name
  node_group_name = "ethstaking-nodes-testnet"
  node_role_arn   = aws_iam_role.nodes.arn
  subnet_ids = [
    data.aws_subnet.private-1.id,
    data.aws_subnet.private-2.id
  ]
  scaling_config {
    desired_size = 1
    max_size     = 5
    min_size     = 0
  }
  update_config {
    max_unavailable = 1
  }
  labels = {
    role = "general"
  }
  launch_template {
    version = aws_launch_template.staking.latest_version
    id      = aws_launch_template.staking.id
  }
  depends_on = [
    aws_iam_role_policy_attachment.nodes-AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.nodes-AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.nodes-AmazonEC2ContainerRegistryReadOnly,
  ]
}
The launch template:
resource "aws_launch_template" "staking" {
  name          = "${var.stage}-staking-node-launch-template"
  instance_type = "m5.2xlarge"
  image_id      = "ami-08712c7468e314435"
  key_name      = "nivpem"
  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size = 450
      volume_type = "gp2"
    }
  }
  lifecycle {
    create_before_destroy = false
  }
  vpc_security_group_ids = [aws_security_group.eks-ec2-sg.id]
  user_data = base64encode(templatefile("${path.module}/staking_userdata.sh", {
    password = "********"
  }))
  tags = {
    "eks:cluster-name"   = aws_eks_cluster.staking.name
    "eks:nodegroup-name" = "ethstaking-nodes-testnet"
  }
  tag_specifications {
    resource_type = "instance"
    tags = {
      Name                 = "${var.stage}-staking-node"
      "eks:cluster-name"   = aws_eks_cluster.staking.name
      "eks:nodegroup-name" = "ethstaking-nodes-testnet"
    }
  }
}
The security group:
resource "aws_security_group" "eks-ec2-sg" {
  name   = "eks-ec2-sg-staking-testnet"
  vpc_id = data.aws_vpc.vpc.id
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  tags = {
    Name = "allow_tls"
  }
}
Answer 1
Score: 0
Consider setting endpoint_public_access to true in the vpc_config block of your aws_eks_cluster resource. That should make it work, since you're using private subnets.
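The question does not show the aws_eks_cluster resource, so here is a minimal sketch of what that suggestion could look like. It rests on assumptions: the cluster name and the aws_iam_role.cluster reference are hypothetical placeholders, and the subnet list simply reuses the two private subnets from the node group. The endpoint_private_access line is also an addition that the answer does not mention; it is included because nodes in private subnets reach the API server through either the public endpoint (which needs an outbound route such as a NAT gateway) or the private endpoint.

```hcl
# Sketch only: the question does not show this resource, so every value
# other than the resource name "staking" is an assumption.
resource "aws_eks_cluster" "staking" {
  name     = "staking"                # hypothetical cluster name
  role_arn = aws_iam_role.cluster.arn # hypothetical cluster IAM role

  vpc_config {
    # Reuses the private subnets referenced by the node group in the question.
    subnet_ids = [
      data.aws_subnet.private-1.id,
      data.aws_subnet.private-2.id,
    ]

    # What the answer suggests: keep the public API endpoint enabled.
    endpoint_public_access = true

    # Assumption, not part of the answer: also enable the private endpoint
    # so nodes can reach the API server from inside the VPC.
    endpoint_private_access = true
  }
}
```

After applying, something like aws eks describe-cluster --name staking --query "cluster.resourcesVpcConfig" (with the real cluster name substituted) should show which endpoints are enabled. Whether this clears the aws-node timeout still depends on how the private subnets are routed.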