Deployed .NET 6 Blazor Server app stops processing HTTP requests for all clients after one user clears the cache in his browser

Question

I have a very bizarre problem.

When one user, located in Austria, clears his browser cache and tries to reconnect to the website, the application stops processing HTTP requests. The app still opens, but the logos that should be fetched by URL do not come through, and neither do any other HTTP requests. On my end, in another EU country, everything works well until that user clears his cache. I have confirmed the reproduction by coordinating with that user; he is using an iPhone (iOS).

I have deployed the Blazor Server app to Azure App Service in the West Europe region. I have tried both Linux and Windows plans, and the issue persists on both.

The logs do not show any WebSocket disconnection failure (Blazor Server uses SignalR). I cannot find a single error in the application logs, even with the DEBUG level enabled.

I have tried many configurations of the hub connection. Here is the current server-side configuration:

services.AddServerSideBlazor()
    .AddHubOptions(options =>
    {
        options.HandshakeTimeout = TimeSpan.FromSeconds(15);
        options.KeepAliveInterval = TimeSpan.FromSeconds(15);
        options.ClientTimeoutInterval = TimeSpan.FromSeconds(40);
        options.EnableDetailedErrors = true;
    })
    .AddCircuitOptions(options =>
    {
        options.DetailedErrors = true;
    });

On the client side, I also have this configured:

public static HubConnection TryInitialize(this HubConnection hubConnection, NavigationManager navigationManager)
{
    if (hubConnection == null)
    {
        hubConnection = new HubConnectionBuilder()
            .WithUrl(navigationManager.ToAbsoluteUri(ApplicationConstants.SignalR.HubUrl))
            .Build();

        hubConnection.HandshakeTimeout = TimeSpan.FromSeconds(15);
        hubConnection.KeepAliveInterval = TimeSpan.FromSeconds(15);
        hubConnection.ServerTimeout = TimeSpan.FromSeconds(40);
    }

    return hubConnection;
}

The problem is that once this happens for one user, the app more or less stops processing HTTP requests for all other users. The page itself still loads, so I do not think the app is crashing, and no error is logged under any of the following categories:

"Logging": {
    "LogLevel": {
        "Default": "Debug",
        "Microsoft": "Debug",
        "Hangfire": "Debug",
        "Microsoft.Hosting.Lifetime": "Debug",
        "Microsoft.AspNetCore.HttpLogging.HttpLoggingMiddleware": "Debug",
        "Microsoft.AspNetCore.SignalR": "Debug"
    }
}
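
One thing worth checking with this configuration: the HttpLoggingMiddleware category only produces output if the HTTP logging middleware is actually registered and placed in the request pipeline. The question does not show that part, so the following is only a sketch under that assumption. The class and method names (HttpLoggingSetup, AddRequestLogging, UseRequestLogging) are hypothetical; AddHttpLogging, UseHttpLogging and HttpLoggingFields are the ASP.NET Core 6 APIs.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.HttpLogging;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper class, shown only to illustrate where the calls go.
public static class HttpLoggingSetup
{
    // Call from ConfigureServices: registers the HTTP logging options.
    public static IServiceCollection AddRequestLogging(this IServiceCollection services) =>
        services.AddHttpLogging(options =>
        {
            // Log request and response properties and headers, but not bodies.
            options.LoggingFields =
                HttpLoggingFields.RequestPropertiesAndHeaders |
                HttpLoggingFields.ResponsePropertiesAndHeaders;
        });

    // Call early in Configure so that requests are logged before
    // later middleware handles them.
    public static IApplicationBuilder UseRequestLogging(this IApplicationBuilder app) =>
        app.UseHttpLogging();
}
```

With this in place, requests that reach the app but never complete should at least leave a request entry in the log, which helps separate "the request never arrived" from "the request arrived and stalled".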

I cannot reproduce this on my local machine in any way. Just in case, here is the _Layout page, which has no special configuration:

<body>
    <div id="app">
        @RenderBody()
    </div>
    <div id="blazor-error-ui">
        <environment include="Staging,Production">
            An error has occurred. This application may no longer respond until reloaded.
        </environment>
        <environment include="Development">
            An unhandled exception has occurred. See browser dev tools for details.
        </environment>
        <a href="" class="reload">Reload</a>
        <a class="dismiss">&#128473;</a>
    </div>
    <script src="_framework/blazor.server.js" autostart="false"></script>
    <script>
        Blazor.start();
    </script>
    <script src="_content/MudBlazor/MudBlazor.min.js?v=5.0.5"></script>
    <script src="js/scroll.js"></script>
    <script src="js/sounds.js"></script>
    <script src="js/file.js"></script>
    <script src="js/script.js"></script>
</body>

My main question: has anyone with experience deploying Blazor Server apps run into this issue, and where should I be looking to resolve it?

Answer 1

Score: 1

I was finally able to isolate the issue by adding some extra logging to the dispose method. It occurs in a specific scenario:

  1. When users on a desktop browser closed the tab, cleared the cache, or closed the browser, there was no issue: the dispose method was called and the connection was closed.
  2. When mobile users closed the tab or cleared the cache, the dispose method was not called and the connection stayed open for 3 minutes. The biggest problem was that no HTTP request was processed during those 3 minutes, until the connection closed. This affected all clients, regardless of country, so the web app became unusable for 3 minutes.
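
The extra logging itself is not shown in this answer. As a rough sketch of one way to observe the same behaviour, a CircuitHandler can log when a circuit opens, loses its connection, and is finally discarded; on the affected mobile clients, the gap between the last two log lines would be expected to match the retention period. The class name LoggingCircuitHandler is hypothetical, and this is an assumption about how such logging could be done, not the author's code.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components.Server.Circuits;
using Microsoft.Extensions.Logging;

// Hypothetical handler name. CircuitHandler and the overridden methods
// are part of ASP.NET Core Blazor Server.
public sealed class LoggingCircuitHandler : CircuitHandler
{
    private readonly ILogger<LoggingCircuitHandler> _logger;

    public LoggingCircuitHandler(ILogger<LoggingCircuitHandler> logger)
    {
        _logger = logger;
    }

    // Called when a new circuit has been created for a client.
    public override Task OnCircuitOpenedAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Circuit {CircuitId} opened", circuit.Id);
        return Task.CompletedTask;
    }

    // Called when the underlying SignalR connection is lost.
    public override Task OnConnectionDownAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Circuit {CircuitId} lost its connection", circuit.Id);
        return Task.CompletedTask;
    }

    // Called when the circuit is being discarded by the server.
    public override Task OnCircuitClosedAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Circuit {CircuitId} is being discarded", circuit.Id);
        return Task.CompletedTask;
    }
}

// Registration, for example in ConfigureServices:
// services.AddScoped<CircuitHandler, LoggingCircuitHandler>();
```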

So I solved this by adding the following configuration after services.AddServerSideBlazor().AddHubOptions:

.AddCircuitOptions(options =>
{
    options.DetailedErrors = true;
    options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromSeconds(0);
    options.DisconnectedCircuitMaxRetained = 0;
});

Since I no longer retain any disconnected circuits, the problem is solved and the web app is working great.
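
For context, the framework defaults for these two options are 3 minutes and 100 circuits, which matches the 3-minute window described above. Setting both to zero means a client whose connection drops briefly cannot resume its circuit and has to reload the page. If that trade-off matters, a shorter non-zero window is one alternative to try. The values and the names below (CircuitRetentionSetup, AddBlazorWithShortRetention) are assumptions for illustration only, and whether a non-zero window avoids the stall described in this answer has not been verified.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper; only the option names come from the framework.
public static class CircuitRetentionSetup
{
    public static IServiceCollection AddBlazorWithShortRetention(this IServiceCollection services)
    {
        services.AddServerSideBlazor()
            .AddCircuitOptions(options =>
            {
                // Assumed value: keep a disconnected circuit for 30 seconds
                // instead of the default 3 minutes.
                options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromSeconds(30);

                // Assumed value: retain at most 20 disconnected circuits
                // instead of the default 100.
                options.DisconnectedCircuitMaxRetained = 20;
            });

        return services;
    }
}
```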

I arrived at this solution after learning that dispose is not always called, which I understood from this issue: https://github.com/dotnet/aspnetcore/issues/39370
where it was explained that:

> Our notification mechanism is a best effort not a guaranteed outcome, since the browser doesn't guarantee that the event is fired. Even in that case, this is a "performance" optimization to eagerly terminate the circuit when we know it won't be used any longer.
> In this scenario dispose gets called after around 3 minutes (by default) when we detect the connection has been lost and the client doesn't re-connect as part of disposing the entire circuit.

huangapple
  • Published on June 26, 2023, 04:35:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76552302.html