Why do my Grafana Tempo ingester pods go into a Back-off restarting state after max_block_duration?


Question

I am using the grafana-tempo distributed Helm chart. It is successfully deployed, its backend is configured to use Azure Storage (blob containers), and it is working fine.

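For context, my backend configuration looks roughly like the sketch below (the container and account names here are placeholders, not my real values):

```yaml
# Tempo storage backend (sketch; names are placeholders)
storage:
  trace:
    backend: azure
    azure:
      container_name: tempo-traces            # placeholder container name
      storage_account_name: mystorageaccount  # placeholder account name
      storage_account_key: ${STORAGE_ACCOUNT_KEY}  # injected from a secret
```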

I have a demo application which is sending traces to grafana-tempo. I can confirm I'm receiving traces.


The issue I have observed is that exactly 30 minutes after startup, my ingester pods go into a Back-off restarting state, and I have to manually restart their StatefulSet.

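For reference, this is roughly how I inspect and restart the pods (the `tempo` namespace and resource names are assumptions based on the chart's defaults; yours may differ):

```bash
# Check why the ingester container is crash-looping
kubectl -n tempo describe pod tempo-ingester-0
kubectl -n tempo logs tempo-ingester-0 --previous

# Manual workaround: restart the ingester StatefulSet
kubectl -n tempo rollout restart statefulset tempo-ingester
```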

While searching for the root cause, I found a parameter, max_block_duration, which has a default value of 30m: "max_block_duration: maximum length of time before cutting a block."


So I tried increasing it to 60m. Now the ingester pods go into the Back-off restarting state after 60 minutes instead.

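The override I applied looks roughly like this, expressed as the raw Tempo ingester config (how this is wired through the Helm chart's values may vary between chart versions):

```yaml
# Tempo ingester config (sketch): cut the head block after 60 minutes
ingester:
  max_block_duration: 60m   # default is 30m
```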

I have also enabled autoscaling, but no new pods come up when all ingester pods are in the same error state.

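To check the autoscaler, I look at the HPA status (the HPA name tempo-ingester is an assumption from my install):

```bash
# Inspect the ingester HPA; crash-looping pods report no usable metrics,
# which can prevent the HPA from scaling up
kubectl -n tempo get hpa tempo-ingester
kubectl -n tempo describe hpa tempo-ingester
```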

Can someone help me understand why this is happening and suggest a possible solution to eliminate the issue?


What value should be passed to max_block_duration so that these pods do not go into the Back-off restarting state?


I expect my ingester pods to keep running without restarts.



Answer 1

Score: 0


I also opened a GitHub issue on Tempo, and the problem no longer occurs on my end. If someone is facing the same issue, you can have a look at my GitHub issue for more insight: https://github.com/grafana/tempo/issues/2488
