Postponed migrations

Postponed migrations allow:

  • Postponing the launch of a migration, and/or
  • Postponing the final cut-over/completion of a migration.

In both cases, it takes an explicit user interaction to launch or to complete the migration.

Normally, migrations are executed by Vitess and are launched and completed automatically. For example, an ALTER on a large table can take hours or more to complete. Vitess automatically instates the new schema in place whenever it is satisfied that the ALTER is complete. Or, a DROP statement could wait in queue while other statements are running, only to actually execute hours later.

Postpone launch

A postponed-launch migration will remain in queued state and will not start executing, until instructed to launch.

Add --postpone-launch to any of the online DDL strategies. Example:

  1. mysql> set @@ddl_strategy='vitess --postpone-launch';
  2. -- The migration is tracked, but the ALTER process won't start running.
  3. mysql> alter table mytable add column i int not null;
  4. +--------------------------------------+
  5. | uuid |
  6. +--------------------------------------+
  7. | 9e8a9249_3976_11ed_9442_0a43f95f28a3 |
  8. +--------------------------------------+

Migrations executed with --postpone-launch are visible on show vitess_migrations just as normal. They will present as queued. The column postpone_launch indicates that the migration will not auto-start:

  1. mysql> show vitess_migrations like '9e8a9249_3976_11ed_9442_0a43f95f28a3' \G
  2. *************************** 1. row ***************************
  3. id: 3
  4. migration_uuid: 9e8a9249_3976_11ed_9442_0a43f95f28a3
  5. keyspace: commerce
  6. shard: 0
  7. mysql_schema: vt_commerce
  8. mysql_table: mytable
  9. migration_statement: alter table mytable add column i int not null
  10. strategy: vitess
  11. options: --postpone-launch
  12. added_timestamp: 2022-09-21 06:28:34
  13. requested_timestamp: 2022-09-21 06:28:34
  14. ready_timestamp: NULL
  15. started_timestamp: NULL
  16. liveness_timestamp: NULL
  17. completed_timestamp: NULL
  18. cleanup_timestamp: NULL
  19. migration_status: queued
  20. ...
  21. postpone_launch: 1

Use case

The use case is specific to multi sharded environments. In some cases, a user may wish to experiment a migration on a single shard:

  • To see how long it takes to run.
  • To see what impact it has on production traffic.
  • To see what impact the schema change has.

Postponed launch migrations make it possible to launch a migration on specific shards, while the migration remains queued on the rest of the shards.

Completing a migration on a subset of the shards means different shards will have different schemas. Do not do this when adding/removing columns, because the table on affected shards will be inconsistent with that of unaffected shards. This feature can be useful to experiment with adding/removing (ideally non-unique) indexes.

Launching a postponed migration

Launching a postponed-launch migration is achieved by the following commands:

  1. mysql> alter vitess_migration '9e8a9249_3976_11ed_9442_0a43f95f28a3' launch;

The above unblocks the specific migration on all shards. The migration will execute at the discretion of the Online DDL executor.

  1. mysql> alter vitess_migration '9e8a9249_3976_11ed_9442_0a43f95f28a3' launch vitess_shards '-40,40-80';

This variation accepts a comma delimited list of shard names. The migration will only launch on the specified shards. the rest of the shards ignore the command. An empty list of shards lets the command run on all shards.

  1. mysql> alter vitess_migration launch all;

Launches all currently postponed migrations on all shards.

Postponed launch is supported for all migrations.

Postpone completion

A common requirement by engineers is to have more control over the cut-over time. With postponed completion migrations, it is possible to:

  • Invoke a migration that postpones completion
  • Manually COMPLETE a migration

This lets an engineer observe the change of schema at a point when they’re comfortably at their console and prepared to take action should any issue occur.

Add --postpone-completion to any* (see supported migrations) of the online DDL strategies. Example:

  1. mysql> set @@ddl_strategy='vitess --postpone-completion';
  2. -- The migration is tracked, but the table won't get created
  3. mysql> create table mytable(id int primary key);
  4. +--------------------------------------+
  5. | uuid |
  6. +--------------------------------------+
  7. | a1dac193_4b86_11ec_a827_0a43f95f28a3 |
  8. +--------------------------------------+

Migrations executed with --postpone-completion are visible on show vitess_migrations just as normal. They will present either as queued, ready or running, at the scheduler’s discretion. But they will not actually make changes to affected tables. The column postpone_completion indicates that the migration will not auto-complete:

  1. mysql> show vitess_migrations like 'a1dac193_4b86_11ec_a827_0a43f95f28a3' \G
  2. id: 1
  3. migration_uuid: a1dac193_4b86_11ec_a827_0a43f95f28a3
  4. keyspace: commerce
  5. shard: 0
  6. mysql_schema: vt_commerce
  7. mysql_table: my_table
  8. migration_statement: create table my_table (
  9. id int primary key
  10. )
  11. strategy: vitess
  12. options: --postpone-completion --allow-zero-in-date
  13. added_timestamp: 2021-11-22 11:23:35
  14. requested_timestamp: 0000-00-00 00:00:00
  15. ready_timestamp: NULL
  16. started_timestamp: NULL
  17. liveness_timestamp: NULL
  18. completed_timestamp: NULL
  19. cleanup_timestamp: NULL
  20. migration_status: queued
  21. log_path:
  22. artifacts:
  23. retries: 0
  24. tablet: zone1-0000000100
  25. tablet_failure: 0
  26. progress: 0
  27. migration_context: vtgate:a1d8c5e0-4b86-11ec-a827-0a43f95f28a3
  28. ddl_action: create
  29. message:
  30. eta_seconds: -1
  31. rows_copied: 0
  32. table_rows: 0
  33. added_unique_keys: 0
  34. removed_unique_keys: 0
  35. log_file:
  36. retain_artifacts_seconds: 86400
  37. postpone_completion: 1
  1. -- The migration is tracked, will start running when scheduler chooses, but will not cut-over
  2. -- to replace the table with the new schema
  3. mysql> alter table another_table add column ts timestamp not null;
  4. +--------------------------------------+
  5. | uuid |
  6. +--------------------------------------+
  7. | b7d6e6fb_8a74_11eb_badd_f875a4d24e90 |
  8. +--------------------------------------+
  9. *************************** 1. row ***************************
  10. id: 3
  11. migration_uuid: 3091ef2a_4b87_11ec_a827_0a43f95f28a3
  12. ...
  13. strategy: vitess
  14. options: --postpone-completion
  15. added_timestamp: 2021-11-22 11:27:34
  16. requested_timestamp: 0000-00-00 00:00:00
  17. ready_timestamp: 2021-11-22 11:27:35
  18. started_timestamp: 2021-11-22 11:27:35
  19. liveness_timestamp: 2021-11-22 11:27:39
  20. completed_timestamp: NULL
  21. cleanup_timestamp: NULL
  22. migration_status: running
  23. ...
  24. postpone_completion: 1

Completing a postponed migration

Completing a postponed-completion migration is achieved by:

  1. mysql> alter vitess_migration 'b7d6e6fb_8a74_11eb_badd_f875a4d24e90' complete;

This command instructs Vitess that the migration should not kept waiting any further.

The command serves as a hint. It does not synchronously cut-over the migration. It is possible that the migration is not yet ready to cut-over (e.g. a long running ALTER may not be done copying all necessary data)

After issuing the command, value of postpone_completion turns to 0:

  1. mysql> show vitess_migrations like '3091ef2a_4b87_11ec_a827_0a43f95f28a3' \G
  2. *************************** 1. row ***************************
  3. id: 3
  4. migration_uuid: 3091ef2a_4b87_11ec_a827_0a43f95f28a3
  5. ...
  6. strategy: vitess
  7. options: --postpone-completion
  8. added_timestamp: 2021-11-22 11:27:34
  9. requested_timestamp: 0000-00-00 00:00:00
  10. ready_timestamp: 2021-11-22 11:27:35
  11. started_timestamp: 2021-11-22 11:27:35
  12. liveness_timestamp: 2021-11-22 11:29:32
  13. completed_timestamp: NULL
  14. cleanup_timestamp: NULL
  15. migration_status: running
  16. ...
  17. postpone_completion: 0

In the above the migration is still running. The scheduler has not determined yet that it is ready to cut-over. Continuing the example, two seconds later:

  1. mysql> show vitess_migrations like '3091ef2a_4b87_11ec_a827_0a43f95f28a3' \G
  2. *************************** 1. row ***************************
  3. id: 3
  4. migration_uuid: 3091ef2a_4b87_11ec_a827_0a43f95f28a3
  5. ...
  6. strategy: vitess
  7. options: --postpone-completion
  8. added_timestamp: 2021-11-22 11:27:34
  9. requested_timestamp: 0000-00-00 00:00:00
  10. ready_timestamp: 2021-11-22 11:27:35
  11. started_timestamp: 2021-11-22 11:27:35
  12. liveness_timestamp: 2021-11-22 11:29:32
  13. completed_timestamp: 2021-11-22 11:29:33
  14. cleanup_timestamp: NULL
  15. migration_status: complete
  16. ...
  17. postpone_completion: 0

Supported migrations

Postponed completion is supported for:

  • CREATE and DROP for all online strategies
  • ALTER migrations in vitess (formerly known as online) strategy
  • ALTER migrations in gh-ost strategy
  • REVERT migrations, including cascading REVERT operations

Postponed completion is not supported in:

  • direct strategy
  • pt-osc for ALTER migrations

declarative migrations will remain queued when --postpone-migration is specified, until alter vitess_migration ... complete is issued. This is true whether the declarative migration implies an eventual CREATE, DROP or ALTER.

Implementation details

The two strong cases for postponed migrations are DROP and long running ALTERs. Both carry an amount of risk to production above other migrations.

Postponed ALTER migrations (in vitess and gh-ost strategies) are actually executed, and begin copying table data as well as track ongoing changes. But as they reach the point where cut-over is agreeable, they stall, and keep waiting until the user issues the alter vitess_migration ... complete statement. Assuming the user runs the statement when all data has already been copied, it is typically a matter of seconds until the migration completes and the new schema is instated.

For CREATE and DROP statements, there’s no such backfill process as with ALTER, and the migrations are simply not scheduled, until the user issues the complete statement. Once the statement is issued, the migrations still need to be scheduled, and may be possibly delayed by an existing queue of migrations.

Mixing postponed launch and completion

You may use both --postpone-launch --postpone-completion for a migration. For example, if you wanted to just see how long it takes to run a migration on a single shard and what impact it makes, but then only cut-over with all other shards, you would:

  1. mysql> set @@ddl_strategy='vitess --postpone-launch --postpone-completion';
  2. -- The migration is tracked, but the ALTER process won't start running.
  3. mysql> alter table mytable add column i int not null;
  4. +--------------------------------------+
  5. | uuid |
  6. +--------------------------------------+
  7. | 9e8a9249_3976_11ed_9442_0a43f95f28a3 |
  8. +--------------------------------------+
  9. mysql> alter vitess_migration '9e8a9249_3976_11ed_9442_0a43f95f28a3' launch vitess_shards '-40';

You’d then wait until ready_to_complete to be 1 for that specific shard, at which time you’ll have gathered all your data. You’d then:

  1. mysql> alter vitess_migration launch all;

Next, you’d wait for all shard to reach ready_to_complete, at which time you’d cut-over all of them:

  1. mysql> alter vitess_migration complete all;