CDK Pipelines - Migrating a database in the pipeline

Question

I have a CDK Pipelines pipeline that handles self-mutation and the deployment of my application on ECS, and I am having a tough time figuring out how to implement database migrations.

My migration files, as well as the migration command, reside inside the Docker container that is built and deployed in the pipeline. Below are two things I've tried so far:


My first thought was to create a pre step on the stage, but I believe there is a chicken-and-egg situation. The migration command requires the database to exist (and needs its endpoint and credentials), but because the migration step is a pre step, the stack doesn't exist yet when the command would run...

    const pipeline = new CodePipeline(this, "CdkCodePipeline", {
      // ...
      // ...
    });

    pipeline.addStage(applicationStage).addPre(new CodeBuildStep("MigrateDatabase", {
      input: pipeline.cloudAssemblyFileSet,
      buildEnvironment: {
        environmentVariables: {
          DB_HOST: { value: databaseProxyEndpoint },
          // ...
          // ...
        },
        privileged: true,
        buildImage: LinuxBuildImage.fromAsset(this, 'Image', {
          directory: path.join(__dirname, '../../docker/php'),
        }),
      },
      commands: [
        'cd /var/www/html',
        'php artisan migrate --force',
      ],
    }))

In the above code, databaseProxyEndpoint has been everything from a CfnOutput or an SSM Parameter to a plain old TypeScript reference, but every attempt failed because the value was empty, missing, or not generated yet.

I felt this was close, since it works perfectly fine until I try to reference databaseProxyEndpoint.


My second attempt was to create an init container in ECS.

    const migrationContainer = webApplicationLoadBalancer.taskDefinition.addContainer('init', {
      image: ecs.ContainerImage.fromDockerImageAsset(webPhpDockerImageAsset),
      essential: false,
      logging: logger,
      environment: {
        DB_HOST: databaseProxy.endpoint,
        // ...
        // ...
      },
      secrets: {
        DB_PASSWORD: ecs.Secret.fromSecretsManager(databaseSecret, 'password')
      },
      command: [
        "sh",
        "-c",
        [
          "php artisan migrate --force",
        ].join(" && "),
      ]
    });

    // Make sure the migrations run and our init container returns success
    serviceContainer.addContainerDependencies({
      container: migrationContainer,
      condition: ecs.ContainerDependencyCondition.SUCCESS,
    });

This worked, but I am not a fan at all. The migration command should run once per deploy in the CI/CD pipeline, not whenever the ECS service starts, restarts, or scales... My migrations failed once and locked up CloudFormation: the health check failed on the deploy and then, naturally, on the rollback as well, causing a completely broken loop of pain.

Any ideas or suggestions on how to pull this off would save me from losing the remaining hair I have left!

Answer 1

Score: 2

I wouldn't solve this in a build step of a CDK Pipeline.

Instead, I'd go for the CustomResource approach.
With Custom Resources, especially in CDK, the dependencies are explicit, so you always know when the resource runs relative to the rest of the stack.
That ordering is lost in a CDK Pipeline step, where you have to work it out and implement it yourself.

So, what does a Custom Resource look like?


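// Stack code: the imports go at the top of the file, the rest inside the Stack's constructor.
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Provider } from 'aws-cdk-lib/custom-resources';

// This Lambda function is an example definition; it is where you would run your actual migration commands.
const migrationFunction = new lambda.Function(this, 'MigrationFunction', {
  // The runtime must be able to execute the handler. migration.ts below is Node.js code,
  // so either compile it and use a Node.js runtime, or write the handler in PHP for Bref.
  runtime: lambda.Runtime.PROVIDED_AL2,
  // Required by lambda.Function; 'migration.handler' assumes the file and export names used below.
  handler: 'migration.handler',
  // fromAsset expects a directory or .zip file that contains the handler.
  code: lambda.Code.fromAsset('path/to/migration'),
  layers: [
    // find the layers here:
    // https://bref.sh/docs/runtimes/#lambda-layers-in-details
    // https://bref.sh/docs/runtimes/#layer-version-
    lambda.LayerVersion.fromLayerVersionArn(this, 'BrefPHPLayer', 'arn:aws:lambda:us-east-1:209497400698:layer:php-80:21'),
  ],
  timeout: cdk.Duration.seconds(30),
  memorySize: 256,
});

const migrationFunctionProvider = new Provider(this, 'MigrationProvider', {
  onEventHandler: migrationFunction,
});

new cdk.CustomResource(this, 'MigrationCustomResource', {
  serviceToken: migrationFunctionProvider.serviceToken,
  properties: {
    // A new value on every synth forces CloudFormation to update the resource, which re-runs the handler.
    date: new Date().toUTCString(),
    // Also pass whatever the handler reads from event.ResourceProperties (for example a secret name).
  },
});

// Grant your migration Lambda the policies it needs to read the secrets for your DB connection, etc.
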
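// migration.ts
import { execFile } from 'child_process';
import { promisify } from 'util';
import AWS from 'aws-sdk';

const execFileAsync = promisify(execFile);
const sm = new AWS.SecretsManager();

export const handler = async (event: any) => {
  // Nothing to migrate when the custom resource is deleted.
  if (event.RequestType === 'Delete') return;

  // Custom resource properties arrive under ResourceProperties. dbHost, dbName and secretName
  // are illustrative names; they must be passed in the CustomResource's `properties`.
  const { dbHost, dbName, secretName } = event.ResourceProperties;

  // Retrieve the database credentials from AWS Secrets Manager
  const secret = await sm.getSecretValue({ SecretId: secretName }).promise();
  const { username, password } = JSON.parse(secret.SecretString);

  // `artisan migrate` has no host/username/password flags; Laravel's default
  // config/database.php reads the connection settings from environment variables.
  // Awaiting the command and letting errors propagate makes a failed migration fail the deployment.
  const { stdout, stderr } = await execFileAsync('php', ['artisan', 'migrate', '--force'], {
    env: { ...process.env, DB_HOST: dbHost, DB_DATABASE: dbName, DB_USERNAME: username, DB_PASSWORD: password },
  });
  console.log(`stdout: ${stdout}`);
  console.error(`stderr: ${stderr}`);
};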

The Custom Resource takes your migration Lambda function.
The Lambda runs the actual command to do your database migration.
The Custom Resource is applied on every deployment, because the date property gets a new value each time the app is synthesized and CloudFormation treats any changed property as an update.
You can control when it executes by changing any property of the CustomResource.
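
As a sketch of that last point, not part of the original answer: instead of a timestamp, the property could be a digest of the migration files, so the Custom Resource is only re-run when a migration is added or changed. The helper name hashMigrations, the property it would feed, and the demo file names are all hypothetical.

```typescript
import { createHash } from 'crypto';
import { mkdtempSync, readdirSync, readFileSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

// Hypothetical helper: SHA-256 digest over the names and contents of the
// files directly inside `dir` (subdirectories are ignored).
function hashMigrations(dir: string): string {
  const hash = createHash('sha256');
  const files = readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => entry.name)
    .sort(); // stable order so the digest is reproducible
  for (const name of files) {
    hash.update(name);
    hash.update(readFileSync(join(dir, name)));
  }
  return hash.digest('hex');
}

// Demo with a throwaway directory: the digest changes when a migration is added.
const demoDir = mkdtempSync(join(tmpdir(), 'migrations-'));
writeFileSync(join(demoDir, '001_create_users.php'), '<?php // ...');
const before = hashMigrations(demoDir);
writeFileSync(join(demoDir, '002_add_index.php'), '<?php // ...');
const after = hashMigrations(demoDir);
console.log(before !== after);
```

In the stack, the result would replace the date value, for example properties: { migrationsHash: hashMigrations(yourMigrationsDir) }. The trade-off is that a deployment with unchanged migrations no longer invokes the Lambda at all.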

Answer 2

Score: 1

You can run your migrations (1) within a stack's deployment with a Custom Resource construct, (2) after a stack's or stage's deployment with a post Step, or (3) after the pipeline has run with an EventBridge rule.

1. Within a stack: Migrations as a Custom Resource

One option is to define your migrations as a CustomResource. It's a CloudFormation feature for executing user-defined code (typically in a Lambda) during the stack deployment lifecycle. See @mchlfchr's answer for an example. Also consider the CDK Trigger construct, a higher-level Custom Resource implementation.

2. After a stack or stage: "post" Step

If you split your application into, say, a StatefulStack (database) and a StatelessStack (application containers), you can run your migration code as a post Step between the two. This is close to the approach attempted in the OP, which used a pre step instead.

In your StatefulStack, which produces the variable, expose a CfnOutput instance variable for each environment variable value: readonly databaseProxyEndpoint: CfnOutput. Then consume the variables in a pipeline migration action by passing them to a post step as envFromCfnOutputs. The CDK will synth them into CodePipeline Variables:

pipeline.addStage(myStage, { // myStage includes the StatefulStack and StatelessStack instances
    stackSteps: [
        {
            stack: statefulStack,
            post: [
                new pipelines.CodeBuildStep("Migrate", {
                    commands: ['cd /var/www/html', 'php artisan migrate --force'],
                    // DB_HOST is set from the stack's CfnOutput at deploy time
                    envFromCfnOutputs: { DB_HOST: statefulStack.databaseProxyEndpoint },
                    // ... other step config
                }),
            ],
        },
    ],
    post: [
        // steps to run after the stage
    ],
});

The addStage method's stackSteps option runs post steps after a specific stack in a stage. The post option works similarly, but runs after the whole stage.

3. After the Pipeline execution: EventBridge rule

Although it's likely not the best option, you could run migrations after the pipeline executes. CodePipeline emits events during pipeline execution. With an EventBridge rule, listen for CodePipeline Pipeline Execution State Change events where "state": "SUCCEEDED".
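
A minimal sketch of the event pattern such a rule would use, assuming a pipeline named my-pipeline (a hypothetical name). The source, detail-type and detail fields follow the documented shape of CodePipeline events; the local matchesPattern function is only a simplified illustration of what EventBridge does on its side.

```typescript
// Event pattern for successful executions of a single pipeline.
interface PipelineSucceededPattern {
  source: string[];
  'detail-type': string[];
  detail: { pipeline: string[]; state: string[] };
}

// The subset of a CodePipeline event that the pattern looks at.
interface PipelineEvent {
  source: string;
  'detail-type': string;
  detail: { pipeline: string; state: string };
}

function pipelineSucceededPattern(pipelineName: string): PipelineSucceededPattern {
  return {
    source: ['aws.codepipeline'],
    'detail-type': ['CodePipeline Pipeline Execution State Change'],
    detail: { pipeline: [pipelineName], state: ['SUCCEEDED'] },
  };
}

// Simplified local matcher for illustration: every listed field must contain the event's value.
function matchesPattern(pattern: PipelineSucceededPattern, event: PipelineEvent): boolean {
  return (
    pattern.source.indexOf(event.source) !== -1 &&
    pattern['detail-type'].indexOf(event['detail-type']) !== -1 &&
    pattern.detail.pipeline.indexOf(event.detail.pipeline) !== -1 &&
    pattern.detail.state.indexOf(event.detail.state) !== -1
  );
}

const pattern = pipelineSucceededPattern('my-pipeline'); // hypothetical pipeline name
const succeeded: PipelineEvent = {
  source: 'aws.codepipeline',
  'detail-type': 'CodePipeline Pipeline Execution State Change',
  detail: { pipeline: 'my-pipeline', state: 'SUCCEEDED' },
};
console.log(JSON.stringify(pattern, null, 2));
console.log(matchesPattern(pattern, succeeded));
```

The rule's target would then start the migration, for example a Lambda function or an ECS task; which target fits is a design choice the answer leaves open.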


Note on failure modes: The three options have different failure modes. If the migrations fail as a Custom Resource, the StatefulStack deployment will fail (with changes rolled back) and the pipeline execution will fail. If the migrations are implemented as a step, the pipeline execution will fail but the StatefulStack won't roll back. Finally, if migrations are event-triggered, a failed migration will affect neither the stack nor execution, as they will already be finished when the migrations run.

huangapple
  • Published on 2023-02-14 05:25:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75441330.html