EKS nodes fail when launched through a launch template (Terraform)

Question


When I launch the node normally, everything works fine, but when I try to launch it using a launch template, I get connection issues within the cluster.

More specifically, the aws-node pod fails with the error:

{"level":"info","caller":"/usr/local/go/src/runtime/proc.go:225","msg":"timeout: failed to connect service \":50051\" within 5s"}

Looking through other posts here, many people point to IAM role issues, but my IAM role is fine: I have been using the same role to launch many other nodes, and they all launched successfully.

Here are my Terraform files:

resource "aws_eks_node_group" "eth-staking-nodes" {
  cluster_name    = aws_eks_cluster.staking.name
  node_group_name = "ethstaking-nodes-testnet"
  node_role_arn   = aws_iam_role.nodes.arn

  subnet_ids = [
    data.aws_subnet.private-1.id,
    data.aws_subnet.private-2.id
  ]

  scaling_config {
    desired_size = 1
    max_size     = 5
    min_size     = 0
  }

  update_config {
    max_unavailable = 1
  }

  labels = {
    role = "general"
  }

  launch_template {
    version = aws_launch_template.staking.latest_version
    id      = aws_launch_template.staking.id
  }

  depends_on = [
    aws_iam_role_policy_attachment.nodes-AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.nodes-AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.nodes-AmazonEC2ContainerRegistryReadOnly,
  ]
}

The launch template:

resource "aws_launch_template" "staking" {
  name          = "${var.stage}-staking-node-launch-template"
  instance_type = "m5.2xlarge"
  image_id      = "ami-08712c7468e314435"

  key_name = "nivpem"
  
  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      volume_size = 450
      volume_type = "gp2"
    }
  }

  lifecycle {
    create_before_destroy = false
  }

  vpc_security_group_ids = [aws_security_group.eks-ec2-sg.id]
  user_data = base64encode(templatefile("${path.module}/staking_userdata.sh", {
    password = "********"
  }))

  tags = {
    "eks:cluster-name"   = aws_eks_cluster.staking.name
    "eks:nodegroup-name" = "ethstaking-nodes-testnet"
  }

  tag_specifications {
    resource_type = "instance"

    tags = {
      Name                 = "${var.stage}-staking-node"
      "eks:cluster-name"   = aws_eks_cluster.staking.name
      "eks:nodegroup-name" = "ethstaking-nodes-testnet"
    }
  }
}

Security group:

resource "aws_security_group" "eks-ec2-sg" {
  name        = "eks-ec2-sg-staking-testnet"
  vpc_id      = data.aws_vpc.vpc.id

  ingress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = {
    Name = "allow_tls"
  }
}

Answer 1

Score: 0

Consider adding a vpc_config block with endpoint_public_access set to true in your aws_eks_cluster resource. That should make it work, since you're using private subnets.
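
A minimal sketch of what this suggestion might look like, assuming the cluster resource is the aws_eks_cluster.staking referenced in the question. The cluster name and the IAM role reference (aws_iam_role.cluster) are hypothetical placeholders, since the question does not show the cluster definition. The endpoint_private_access line is also an assumption that the answer does not confirm:

```hcl
resource "aws_eks_cluster" "staking" {
  name     = "staking"                # hypothetical; not shown in the question
  role_arn = aws_iam_role.cluster.arn # hypothetical cluster role

  vpc_config {
    # Same private subnets the node group in the question uses.
    subnet_ids = [
      data.aws_subnet.private-1.id,
      data.aws_subnet.private-2.id
    ]

    # Setting suggested by the answer.
    endpoint_public_access = true

    # Assumption: also expose the private endpoint so that nodes in
    # private subnets can reach the API server from inside the VPC.
    endpoint_private_access = true
  }
}
```

Whether this resolves the aws-node timeout depends on how the private subnets are routed, so treat it as something to try rather than a confirmed fix.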


huangapple
  • Posted on April 10, 2023, 21:02:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75977380.html