OpenTelemetry: ActivitySource.StartActivity returns null activity when there are listeners hooked
Question
I use:
OpenTelemetry 1.4.0
OpenTelemetry.Extensions.Hosting 1.4.0
OpenTelemetry.Instrumentation.AspNetCore 1.0.0-rc9.14
Runtime: .NET 6.0 ASP.NET Web API, using the Docker base image 6.0-alpine3.17
I have two services, A and B. Each of them exposes a REST API endpoint.
OpenTelemetry in service A is registered like this:
    using System.Diagnostics;
    using System.Reflection;
    using Microsoft.Extensions.DependencyInjection;
    using OpenTelemetry.Resources;
    using OpenTelemetry.Trace;

    public static class OTRegistration
    {
        // Activity source named after the executing assembly.
        private static readonly ActivitySource _activitySource = new ActivitySource(Assembly.GetExecutingAssembly().GetName().Name!, "1.0.0");

        public static void AddOT(this IServiceCollection services)
        {
            services.AddOpenTelemetry()
                .WithTracing(tracerProviderBuilder =>
                    tracerProviderBuilder
                        .AddSource(_activitySource.Name)
                        .ConfigureResource(resource => resource.AddService(_activitySource.Name))
                        .AddAspNetCoreInstrumentation()
                );
        }
    }
A client calls an endpoint on service A, and that endpoint then calls service B. In the call to service B, service A sends a traceparent header that ends, seemingly at random, sometimes with 00 and sometimes with 01, for example: 00-000000000000000056473954588e71ac-36ae11cc57b1e9c1-00.
When it ends with 00, service B creates null activities:
Activity? activity = _activitySource.StartActivity("TestActivity");
I double-checked that a listener is hooked in service B when the activity is created. I added some logs to prove it:
Activity was created as null. | OperationName=file.uploaded, traceparent=00-000000000000000056473954588e71ac-36ae11cc57b1e9c1-00, HasListeners=True.
Activity was created as not null. | OperationName=file.uploaded, traceparent=00-0000000000000000517d22c0e0b634be-5b066682501cc9b8-01, HasListeners=True, RootId=0000000000000000517d22c0e0b634be, ParentId=00-0000000000000000517d22c0e0b634be-a258d1b5dddfc7de-01, Id=00-0000000000000000517d22c0e0b634be-58691a9e53e3ae37-01.
I see that ActivityTraceFlags (https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.activitytraceflags?view=net-6.0) has two values, None (0) and Recorded (1), which is why I suspect that the 00 flag means the activity will not be created.
Any ideas why service A sometimes ends the traceparent with 00 and sometimes with 01?
How can I control this behavior so that there is no randomness?
Answer 1
Score: 0
The last part of the traceparent header (the trace flags) determines whether the activity should be sampled or not: https://www.w3.org/TR/trace-context/#trace-flags
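To see which flag a given header carries, the value can be parsed with the ActivityContext type from System.Diagnostics. The following is only a minimal sketch: the class and method names (TraceParentInspector, IsSampled) are made up for illustration, and it uses the two header values from the question's logs.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of OpenTelemetry or the question's code.
public static class TraceParentInspector
{
    // Returns true when the traceparent value carries the Recorded (01) flag.
    public static bool IsSampled(string traceParent)
    {
        if (!ActivityContext.TryParse(traceParent, null, out ActivityContext context))
        {
            throw new FormatException($"Not a valid traceparent value: {traceParent}");
        }

        return (context.TraceFlags & ActivityTraceFlags.Recorded) != 0;
    }
}
```

Called with 00-000000000000000056473954588e71ac-36ae11cc57b1e9c1-00 this should return false, and with 00-0000000000000000517d22c0e0b634be-5b066682501cc9b8-01 it should return true.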
The sampler is responsible for deciding whether an activity should be recorded/created.
By default, the sampler is set to parentbased_always_on. This means that if the parent was not recorded, there is no reason to create a new activity.
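The effect can be reproduced without the OpenTelemetry SDK by using a plain ActivityListener whose Sample callback follows the parent's flag. This is a simplified imitation of a parent-based decision, not the SDK's actual sampler; the source name "Demo.ParentBased" and the ParentBasedDemo class are assumptions made for this sketch.

```csharp
using System.Diagnostics;

// Hypothetical demo class: imitates a parent-based sampling decision.
public static class ParentBasedDemo
{
    private static readonly ActivitySource Source = new ActivitySource("Demo.ParentBased");

    // Returns true if StartActivity produced a non-null activity for the given parent.
    public static bool StartsActivity(string traceParent)
    {
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            // Follow the parent: record only when the parent carries the Recorded flag.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                (options.Parent.TraceFlags & ActivityTraceFlags.Recorded) != 0
                    ? ActivitySamplingResult.AllDataAndRecorded
                    : ActivitySamplingResult.None,
        };
        ActivitySource.AddActivityListener(listener);

        ActivityContext parent = ActivityContext.Parse(traceParent, null);
        using Activity? activity = Source.StartActivity("TestActivity", ActivityKind.Internal, parent);
        return activity is not null;
    }
}
```

With a parent ending in 00 the listener is attached (so HasListeners would be true), yet StartActivity still returns null, which matches the log lines in the question.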
You can override this behavior by calling .SetSampler(new AlwaysOnSampler()) in service B, or check why service A decided not to record the activity (maybe the decision is propagated from the client of service A?).
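As a rough sketch, the registration in service B could look like the one from the question with the sampler call added. The name "ServiceB" and the ServiceBTracing class are placeholders, since the question does not show service B's setup.

```csharp
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Hypothetical registration for service B; names are placeholders.
public static class ServiceBTracing
{
    public static void AddServiceBTracing(this IServiceCollection services)
    {
        services.AddOpenTelemetry()
            .WithTracing(builder => builder
                .AddSource("ServiceB")
                .ConfigureResource(resource => resource.AddService("ServiceB"))
                // Record every activity, regardless of the incoming trace flags.
                .SetSampler(new AlwaysOnSampler())
                .AddAspNetCoreInstrumentation());
    }
}
```

Keep in mind that with AlwaysOnSampler service B records spans even for traces that the upstream service chose not to record, so those traces will be incomplete on the backend.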