JAVA GAE: Process terminated because the machine was forced to shutdown
Question
My Java application has been running on GAE for a long time. Today it returned a 500 error, and the App Engine log shows only this message:
"Process terminated because the machine was forced to shutdown"
I can't find any documentation that explains what causes this message.
The only documentation on the shutdown lifecycle is https://cloud.google.com/appengine/docs/standard/java/how-instances-are-managed#shutdown
Shutdown
The shutdown process might be triggered by a variety of planned and unplanned events, such as:
You manually stop an instance.
You deploy an updated version to the service.
The instance exceeds the maximum memory for its configured instance_class.
Your application runs out of Instance Hours quota.
Your instance is moved to a different machine, either because the current machine that is running the instance is restarted, or App Engine moved your instance to improve load distribution.
The only plausible reason is the last one, but I'm still not convinced that the message refers to "App Engine moved your instance..."
My application config:
<instance-class>B4_HIGHMEM</instance-class>
<runtime>java8</runtime>
<basic-scaling>
  <max-instances>1</max-instances>
  <idle-timeout>1m</idle-timeout>
</basic-scaling>
Any ideas?
Regards
Answer 1
Score: 1
I'm not sure exactly why this is happening in your GAE app. However, according to the App Engine Service Level Agreement, the promised Monthly Uptime Percentage for GAE is "at least 99.95%".
> "Monthly Uptime Percentage" means total number of minutes in a month,
> minus the number of minutes of Downtime suffered from all Downtime
> Periods in a month, divided by the total number of minutes in a month.
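As a rough worked example of the definition above, assuming a 30-day month (the month length is an assumption for illustration, not part of the quoted SLA text), a 99.95% target leaves about 21.6 minutes of Downtime per month:

$$
\text{total minutes} = 30 \times 24 \times 60 = 43{,}200
$$

$$
\text{allowed Downtime} = 43{,}200 \times (1 - 0.9995) = 21.6 \text{ minutes}
$$

Whether a single forced instance shutdown counts as Downtime depends on the SLA's own definitions, so this only gives a sense of scale.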
Therefore, this could be expected behaviour. However, if you believe it is happening more often than that, I'd recommend contacting GCP technical support for a more detailed inspection.
Also, you might want to increase the max-instances parameter in your app's configuration: with more instances the load is spread out, so if one instance fails the others keep working.
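A minimal sketch of that change, assuming the same appengine-web.xml fragment shown in the question. The value 2 is only an illustrative choice, not a figure taken from the documentation:

```xml
<!-- Hypothetical edit to the question's config: allow a second instance -->
<basic-scaling>
  <max-instances>2</max-instances>
  <idle-timeout>1m</idle-timeout>
</basic-scaling>
```

With basic scaling, instances are created when requests arrive, so a higher limit allows additional instances but does not by itself keep a spare one running.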
Answer 2
Score: 0
When an App Engine instance is unable to process a request and therefore crashes or dies, 500 errors appear. Chances are that the instance was handling long-running requests that take a while to complete.
For context: App Engine creates sandboxed instances to serve requests. These instances can crash, be changed, or be shut down, and other errors may then appear.
Although this issue is not expected to happen frequently, it can have several causes. I'd recommend going through the documentation you shared and double-checking each item on the list, as well as looking for long-running requests.