Azure deployment slot swap - both slots use production settings

Question

I am using deployment slot settings to distinguish between my main (production) slot and the staging slot.

When I perform a swap, the new production app (that was in staging before the swap) properly reads app settings.

[Struck out by the later edit:] However, the new staging app (which was the production app before the swap) does not re-read app settings and continues to use production settings.

(EDIT): It turns out the real problem was that the staging slot ran with production settings for a while, in parallel with the production slot, and this persisted until the end of the swap process, when the old production site was restarted with the staging settings. In addition, WEBSITE_HOSTNAME remained at the staging slot's value even after the swap, and stayed that way until a restart (which incurs downtime).

I am using IOptionsMonitor<MyOptions> and have registered a change handler with myOptionsMonitor.OnChange(...), but neither helps. The new staging app (old production app) always starts and reads production settings.

Only if I later manually restart the staging app does it get the correct settings again.

This causes race conditions when two apps run as production and compete for the same resources.

What am I doing wrong? How can I make this work properly?

Clarification based on comment: The new staging (old production) app is restarted as a result of the swap, but it reads the production settings. I verify this by logging the settings in the constructor of my singleton service.

Answer 1

Score: 0

After a day's worth of investigation, I finally figured it out.

Before starting, I should say that the documentation for "What happens during a swap" is correct and should be read a few times over to be properly understood.

But apart from that, here is what I discovered and how I worked around the problem.

The documentation says that the first step of a swap is to apply the production settings to the staging slot and warm it up. This means that, for a period of time, both the production slot and the staging slot run the app with production settings. That is the problem I am trying to fix: I want one and only one instance of a "production" app running at any one time.

In order to detect whether my app is running in the staging slot with production settings I check the WEBSITE_HOSTNAME environment variable. Before the swap this has the value of xxxx.azurewebsites.net on the production slot and xxxx-staging1.azurewebsites.net on the staging slot. If I discover that I am running on the staging slot with production settings, I hold back on accessing shared resources until the swap is done.
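The check described above can be isolated into a small pure function, which makes it easy to unit test. This is only a sketch: the names SlotDetection and IsWarmingUpForSwap are hypothetical, and the hostnames are the placeholders used in this answer. It compares hostnames case-insensitively, which is an assumption on my part and differs slightly from a plain != comparison.

```csharp
#nullable enable
using System;

public static class SlotDetection
{
    // Returns true when production settings are applied but the hostname is
    // not the production one, i.e. the app is in the warm-up phase of a swap.
    // A missing hostname (null) is treated as "not on the production host".
    public static bool IsWarmingUpForSwap(
        string? hostname, string deploymentEnvironment, string productionHostname)
    {
        bool hasProductionSettings = string.Equals(
            deploymentEnvironment, "Production", StringComparison.Ordinal);

        // Assumption: hostnames are compared case-insensitively.
        bool onProductionHost = string.Equals(
            hostname, productionHostname, StringComparison.OrdinalIgnoreCase);

        return hasProductionSettings && !onProductionHost;
    }
}
```

In the app, the hostname argument would come from Environment.GetEnvironmentVariable("WEBSITE_HOSTNAME") and the environment from the slot-specific app setting.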

After the swap is complete, WEBSITE_HOSTNAME has the value xxxx-staging1.azurewebsites.net in both slots, so this variable cannot be used to detect when the swap is done.

In fact, I have not found any other way to detect automatically when the swap is complete, so I created an endpoint that has to be triggered manually. This endpoint sets WEBSITE_HOSTNAME to the correct value and releases the production functionality that is waiting for the swap to complete.

NOTE: Setting WEBSITE_HOSTNAME is important because Application Insights telemetry uses it to derive the cloud_RoleName property.

To achieve that, I created a singleton service called AzureSwap that handles all of this:

using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Options;

public class AzureSwap {
    private readonly IOptions<AppOptions> _appOptions;
    private readonly TaskCompletionSource _swapDoneTcs = new();

    public AzureSwap(IOptions<AppOptions> appOptions) {
        _appOptions = appOptions;
        var azureWebsiteHostname =
            Environment.GetEnvironmentVariable("WEBSITE_HOSTNAME");
        var myEnv = appOptions.Value.DeploymentEnvironment;
        if (azureWebsiteHostname != "xxxx.azurewebsites.net" &&
            myEnv == "Production") {
            // Production settings on a non-production hostname:
            // swap is in progress...
            IsSwapping = true;
        }
        else {
            // Swap has completed
            SetSwapDone();
        }
    }

    public bool IsSwapping { get; private set; }

    // This is called by the HTTP endpoint
    public void SetSwapDone() {
        var myEnv = _appOptions.Value.DeploymentEnvironment;
        var hostname = myEnv switch {
            "Production" => "xxxx.azurewebsites.net",
            "Staging1" => "xxxx-staging1.azurewebsites.net",
            _ => throw new Exception($"Unknown environment {myEnv}"),
        };

        // Only affects the current process; the slot's real setting is unchanged
        Environment.SetEnvironmentVariable("WEBSITE_HOSTNAME", hostname);

        IsSwapping = false;
        _swapDoneTcs.TrySetResult();
    }

    // This is waited upon by production services
    public Task WaitSwapDone() {
        return _swapDoneTcs.Task;
    }
}
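The HTTP endpoint that calls SetSwapDone is not shown in this answer. A minimal sketch of what it might look like with ASP.NET Core minimal APIs (.NET 6 or later) follows. The route /internal/swap-done is a made-up example, and SwapSignal is a hypothetical stand-in used only to keep the snippet self-contained; in the real app the AzureSwap singleton would be injected instead.

```csharp
// Program.cs of an ASP.NET Core web project (assumption: .NET 6 or later)
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Registered as a singleton so the endpoint and the waiting services
// share the same instance.
builder.Services.AddSingleton<SwapSignal>();

var app = builder.Build();

// Hypothetical route, called manually (or by a deployment script)
// once the swap has finished. It should be protected in practice,
// for example by authentication or a shared secret.
app.MapPost("/internal/swap-done", (SwapSignal signal) =>
{
    signal.SetSwapDone();
    return Results.NoContent();
});

app.Run();

// Stand-in for the AzureSwap service; only the method the endpoint needs.
public sealed class SwapSignal
{
    public bool Done { get; private set; }

    public void SetSwapDone() => Done = true;
}
```

Because the call has to reach the new production instance, it would be sent to the production hostname after the swap completes.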

To use it, for example in an IHostedService, I do this:

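using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

public class MyHostedService : IHostedService {
    private readonly AzureSwap _azureSwap;

    public MyHostedService(AzureSwap azureSwap) {
        _azureSwap = azureSwap;
    }

    public Task StartAsync(CancellationToken cancellationToken) {
        // Fire and forget, so host startup is not blocked while waiting for the swap
        _ = Task.Run(async () => {
            await _azureSwap.WaitSwapDone();

            // Start long running service operation

        }, cancellationToken);
        return Task.CompletedTask;
    }

    // Required by IHostedService; nothing to clean up in this example
    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}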
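One limitation of awaiting a bare TaskCompletionSource is that the wait cannot be cancelled, for example when the host shuts down before the swap finishes. Below is a sketch of a cancellable gate, assuming .NET 6 or later (for Task.WaitAsync). SwapGate, Release and IsOpen are hypothetical names, not part of the answer's code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SwapGate
{
    // RunContinuationsAsynchronously keeps waiters from running inline
    // on the thread that calls Release().
    private readonly TaskCompletionSource _done =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    public bool IsOpen => _done.Task.IsCompleted;

    // Idempotent: calling it more than once has no further effect.
    public void Release() => _done.TrySetResult();

    // Completes when Release() is called, or is cancelled with the token.
    public Task WaitAsync(CancellationToken cancellationToken = default) =>
        _done.Task.WaitAsync(cancellationToken);
}
```

A hosted service could then await the gate with its own stopping token, so that shutdown does not leave a task waiting forever.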

huangapple
  • Posted on March 9, 2023, 20:40:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75684758.html