Unit Tests using MassTransit Test Harness randomly fail on build server
Question
I see you're experiencing issues with flaky test results in your C# code. It appears that you've tried adjusting timeouts, but the problem persists. One potential approach to make the test less flaky is to explicitly wait for the message consumption to complete before making assertions. Here's the modified code:
await Harness.Bus.Publish(new SomeMessage("something"));
await Harness.Consumed.SelectAsync<SomeMessage>().Any();
(await ConsumerHarness.Consumed.SelectAsync<SomeMessage>().Any()).Should().BeTrue();
By awaiting the message consumption before checking, you can ensure that the test doesn't proceed until the message has been fully processed. This may help in making your test less flaky.
However, if the issue persists, further investigation into the MassTransit library or potential concurrency issues in your pipeline may be necessary.
I wrote a test with the following structure using the MassTransit test harness:
using FluentAssertions;
using MassTransit;
using MassTransit.Testing;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

namespace g2fp.PosApi.UnitTests;

public class SampleTest : IAsyncLifetime
{
    public async Task InitializeAsync()
    {
        var serviceCollection = new ServiceCollection();
        serviceCollection.AddTransient<ISomeService, SomeService>();
        serviceCollection.AddMassTransitTestHarness(cfg =>
        {
            cfg.AddConsumer<SomeConsumer>();
        });
        Services = serviceCollection.BuildServiceProvider();
        Harness = Services.GetRequiredService<ITestHarness>();
        ConsumerHarness = Services.GetRequiredService<IConsumerTestHarness<SomeConsumer>>();
        await Harness.Start();
    }

    [Fact]
    public async Task Consume()
    {
        await Harness.Bus.Publish(new SomeMessage("something"));
        // await Harness.InactivityTask;
        (await Harness.Consumed.SelectAsync<SomeMessage>().Any()).Should().BeTrue();
        (await ConsumerHarness.Consumed.SelectAsync<SomeMessage>().Any()).Should().BeTrue();
    }

    public IConsumerTestHarness<SomeConsumer> ConsumerHarness { get; set; }
    public ITestHarness Harness { get; set; }
    public ServiceProvider Services { get; set; }

    public async Task DisposeAsync()
    {
        await Harness.Stop();
    }
}

public record SomeMessage(string Message);

public class SomeConsumer : IConsumer<SomeMessage>
{
    private readonly ISomeService _someService;

    public SomeConsumer(ISomeService someService)
    {
        _someService = someService;
    }

    public async Task Consume(ConsumeContext<SomeMessage> context)
    {
        await _someService.DoSomething();
    }
}

public interface ISomeService
{
    Task DoSomething();
}

public class SomeService : ISomeService
{
    public async Task DoSomething()
    {
        await Task.Delay(500);
    }
}
On my PC I can run this test 1000 times and it's fine, but on our pipeline (GitLab) it occasionally fails on the following line:
(await Harness.Consumed.SelectAsync<SomeMessage>().Any()).Should().BeTrue();
It doesn't really matter how much delay I put in SomeService; the test waits nicely for that to finish, so that we can do asserts on our mocks.
Am I missing something obvious or is it just flaky? When the test fails, execution time is around 2s, so I guess that the wait logic in Harness.Consumed actually waits, but nothing happens.
Should I change the order of waiting? Should I do something like:
var waitTask = Harness.Consumed.SelectAsync<SomeMessage>().Any();
await Harness.Bus.Publish(new SomeMessage("something"));
(await waitTask).Should().BeTrue();
It still works, but I have no idea if that is less flaky or not. Maybe the classic ConfigureAwait in some place would unlock writing to the consumed message collection?
EDIT: Following the advice, I set the inactivity timeout to 10s:
serviceCollection.AddMassTransitTestHarness(cfg =>
{
cfg.SetTestTimeouts(testInactivityTimeout: TimeSpan.FromSeconds(10));
cfg.SetKebabCaseEndpointNameFormatter();
ConfigureTestHarness(cfg);
});
However, the test still fails randomly, in the same way as before, without waiting for the full timeout span.
Here are the logs with timestamps, including the logs from the actual consumer (I changed the names, so if something is inconsistent, it's due to that):
2023-06-06T09:30:35.5251966+00:00 - Information - 0 - MassTransit - Configured endpoint my-message-endpoint, Consumer: MyConsumer
2023-06-06T09:30:35.5270254+00:00 - Debug - 0 - MassTransit.Transports.BusDepot - Starting bus instances: IBus
2023-06-06T09:30:35.5270426+00:00 - Debug - 0 - MassTransit - Starting bus: loopback://localhost/
2023-06-06T09:30:39.5021627+00:00 - Debug - 0 - MassTransit - Endpoint Ready: loopback://localhost/my-message-endpoint
2023-06-06T09:30:39.5049111+00:00 - Debug - 0 - MassTransit - Endpoint Ready: loopback://localhost/runnerwspzdzwlproject21928941concurrent0_testhost_bus_9hboyyfcnrbrfprdbdpschfqfb
2023-06-06T09:30:39.5050096+00:00 - Information - 0 - MassTransit - Bus started: loopback://localhost/
2023-06-06T09:30:39.5363258+00:00 - Debug - 0 - MassTransit - Create send transport: loopback://localhost/urn:message:MyMessage
2023-06-06T09:30:39.5365853+00:00 - Debug - 0 - MassTransit.Messages - SEND loopback://localhost/urn:message:Events:MyMessage ff030000-ac11-0242-a321-08db6670b105 Events.MyMessage
2023-06-06T09:30:41.6678433+00:00 - Information - 0 - MyConsumer - Updating for Product 42 <-- logs from the consumer code
2023-06-06T09:30:41.7407242+00:00 - Information - 0 - MyConsumer - 0 modified for Product 42 <-- logs from the consumer code
2023-06-06T09:30:41.7719252+00:00 - Debug - 0 - MassTransit.Messages - RECEIVE loopback://localhost/my-message-endpoint ff030000-ac11-0242-a321-08db6670b105 MyMessage MyConsumer(00:00:00.0734534)
2023-06-06T09:30:42.0018748+00:00 - Debug - 0 - MassTransit.Transports.BusDepot - Stopping bus instances: IBus
2023-06-06T09:30:42.0021543+00:00 - Debug - 0 - MassTransit - Stopping bus: loopback://localhost/
2023-06-06T09:30:42.0023811+00:00 - Debug - 0 - MassTransit - Endpoint Stopping: loopback://localhost/my-message-endpoint
2023-06-06T09:30:42.0026290+00:00 - Debug - 0 - MassTransit - Endpoint Completed: loopback://localhost/my-message-endpoint
2023-06-06T09:30:42.0026475+00:00 - Debug - 0 - MassTransit.Messages - Consumer Completed: loopback://localhost/my-message-endpoint: 1 received, 1 concurrent
2023-06-06T09:30:42.0026759+00:00 - Debug - 0 - MassTransit - Endpoint Stopping: loopback://localhost/runnerwspzdzwlproject21928941concurrent0_testhost_bus_9hboyyfcnrbrfprdbdpschfqfb
2023-06-06T09:30:42.0027023+00:00 - Debug - 0 - MassTransit - Endpoint Completed: loopback://localhost/runnerwspzdzwlproject21928941concurrent0_testhost_bus_9hboyyfcnrbrfprdbdpschfqfb
2023-06-06T09:30:42.0027102+00:00 - Debug - 0 - MassTransit.Messages - Consumer Completed: loopback://localhost/runnerwspzdzwlproject21928941concurrent0_testhost_bus_9hboyyfcnrbrfprdbdpschfqfb: 0 received, 0 concurrent
2023-06-06T09:30:42.0027993+00:00 - Information - 0 - MassTransit - Bus stopped: loopback://localhost/
The error is still on the same line:
await Harness.Bus.Publish(message);
(await Harness.Consumed.SelectAsync<TMessage>().Any()).Should().BeTrue(); // <- here
(await ConsumerHarness.Consumed.SelectAsync<TMessage>().Any()).Should().BeTrue();
Expected (Harness.Consumed.SelectAsync<TMessage>().Any()) to be true, but found False.
Any other ideas?
Answer 1
Score: 1
You can use:
cfg.SetTestTimeouts(testInactivityTimeout: TimeSpan.FromSeconds(3));
Or some larger value to account for the fact that your CI/CD box likely has significantly fewer CPU cores and less memory than your local box, and it just might be "busy."
For instance, if your consumer is slow to create the first time for some reason, that initial delay might be the cause.
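As a rough sketch of that suggestion, the timeouts could be chosen per environment. The `CI` environment variable check (GitLab sets `CI` to `true` in pipeline jobs) and the specific durations below are assumptions for illustration, not values recommended by MassTransit:

```csharp
using System;
using MassTransit;
using MassTransit.Testing;
using Microsoft.Extensions.DependencyInjection;

// Assumption: treat the run as a CI run when the CI variable is "true".
var isCi = Environment.GetEnvironmentVariable("CI") == "true";

var serviceCollection = new ServiceCollection();
serviceCollection.AddMassTransitTestHarness(cfg =>
{
    cfg.SetTestTimeouts(
        // Upper bound on how long the harness waits overall (illustrative value).
        testTimeout: TimeSpan.FromSeconds(isCi ? 60 : 30),
        // How long the bus may sit idle before the harness stops waiting (illustrative value).
        testInactivityTimeout: TimeSpan.FromSeconds(isCi ? 10 : 3));
});
```

Keeping the local values short keeps failing tests fast on a developer machine, while the CI values leave headroom for a slow first consumer creation.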
### UPDATED
The only other thing I see is that you aren't disposing of the service provider after the test. When you run multiple tests, something left over from an earlier test might be getting in the way and causing a failure. I don't know what exactly, but not disposing of the service provider is a bad idea, since it leaves all sorts of things in memory.
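A minimal sketch of what disposing the provider could look like, assuming the same xUnit `IAsyncLifetime` setup as in the question. The class name `SampleTestWithDisposal` is hypothetical, and consumer registration is left as a comment:

```csharp
using System.Threading.Tasks;
using MassTransit;
using MassTransit.Testing;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

// Hypothetical class name; mirrors the structure of SampleTest in the question.
public class SampleTestWithDisposal : IAsyncLifetime
{
    public ITestHarness Harness { get; private set; }
    public ServiceProvider Services { get; private set; }

    public async Task InitializeAsync()
    {
        var serviceCollection = new ServiceCollection();
        serviceCollection.AddMassTransitTestHarness(cfg =>
        {
            // Register consumers here, as in the question.
        });
        Services = serviceCollection.BuildServiceProvider();
        Harness = Services.GetRequiredService<ITestHarness>();
        await Harness.Start();
    }

    public async Task DisposeAsync()
    {
        // Stop the bus first, then release everything the container created.
        await Harness.Stop();
        await Services.DisposeAsync();
    }
}
```

`ServiceProvider` implements `IAsyncDisposable`, so awaiting `DisposeAsync()` lets services with asynchronous cleanup finish before the next test class is constructed. Whether this removes the flakiness is not established here; it only closes the gap pointed out above.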