React SPA shows a white screen on Fargate while task instances pass health checks
Question
I have a React SPA hosted on AWS Fargate. When I tested the SPA locally, all tests passed. I deployed the app to Fargate last month without any issues. As recently as last week, everything was normal.
Then yesterday, the URL started showing a white screen. The health checks report the task instances as healthy, and three of them are running. So I am at a loss as to how all three of these can be true at once:
- Task instances (containers) are running
- Task instances are healthy
- The React app is unresponsive
How can the app fail without that failure showing up in the health checks?
Answer 1
Score: 0
The issue you described (the site can't be reached) suggests that routing is a potential cause.
Trace the traffic to the service
If you are using a load balancer to control traffic to your ECS tasks in Fargate, has the route, the load balancer, or the target group changed at all? Check:
- The target of the record in Route53
- The listener on the load balancer (has the port, routing rules, or target group changed?)
- The target group that the listener points to: does it show healthy? Is it cycling containers?
- The containers themselves: do they still listen on the same ports, and is that consistent with the target group configuration?
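To make the target group check concrete, here is a minimal Python sketch that groups targets by health state. It assumes boto3 is installed and AWS credentials are configured. The function names `summarize_target_health` and `fetch_target_health` are made up for this illustration, and the target group ARN is something you would supply yourself. Treat it as a starting point rather than a complete diagnostic.

```python
from __future__ import annotations

from typing import Any


def summarize_target_health(response: dict[str, Any]) -> dict[str, list[str]]:
    """Group targets by health state from a describe_target_health response."""
    by_state: dict[str, list[str]] = {}
    for desc in response.get("TargetHealthDescriptions", []):
        target = desc["Target"]
        state = desc["TargetHealth"]["State"]  # e.g. "healthy", "unhealthy", "draining"
        by_state.setdefault(state, []).append(f'{target["Id"]}:{target.get("Port")}')
    return by_state


def fetch_target_health(target_group_arn: str) -> dict[str, Any]:
    """Call the ELBv2 API (assumes boto3 and AWS credentials are available)."""
    import boto3  # third-party; imported lazily so the helper above has no dependencies

    client = boto3.client("elbv2")
    return client.describe_target_health(TargetGroupArn=target_group_arn)
```

If targets keep moving between states such as "draining" and "initial" across repeated calls, that would be consistent with containers being cycled.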
If you are routing to the public IP of the task, don't do that. Each time the task restarts, it gets a new public IP. Use a load balancer instead.
Security
Double check:
- NACLs: have they been updated? Do they still allow traffic on the subnets/ports/etc. that you need?
- Security Groups: have they been updated? Do they still allow incoming traffic on the ports you require (e.g. 80/443), and are they still associated with your ECS Service?
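As a rough illustration of the security group check, the Python sketch below tests whether a security group has an inbound rule covering a given TCP port. It assumes boto3 and configured AWS credentials. The function names are hypothetical, and the group ID is whatever is attached to your ECS service. It deliberately ignores the source CIDR ranges and referenced groups on each rule, so a `True` result only means the port is open to some source.

```python
from __future__ import annotations

from typing import Any


def allows_inbound_tcp(security_group: dict[str, Any], port: int) -> bool:
    """Return True if any inbound rule covers the given TCP port.

    Note: this does not inspect IpRanges or UserIdGroupPairs, so it does not
    tell you which sources are allowed, only that the port is open to something.
    """
    for perm in security_group.get("IpPermissions", []):
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            return True
        from_port = perm.get("FromPort")
        to_port = perm.get("ToPort")
        if (
            protocol == "tcp"
            and from_port is not None
            and to_port is not None
            and from_port <= port <= to_port
        ):
            return True
    return False


def fetch_security_group(group_id: str) -> dict[str, Any]:
    """Load one security group via the EC2 API (assumes boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the check above has no dependencies

    client = boto3.client("ec2")
    return client.describe_security_groups(GroupIds=[group_id])["SecurityGroups"][0]
```

Running `allows_inbound_tcp(fetch_security_group(group_id), 443)` for each group on the service and on the load balancer gives a quick yes/no before digging into the console.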
That's typically the checklist I go through when encountering such an issue.