Analyzing a SQL statement using VTEXPLAIN

Introduction

This document covers the way Vitess executes a particular SQL statement using the VTExplain tool. This tool works similarly to the MySQL EXPLAIN statement. You can run vtexplain before you have a running Vitess cluster, which lets you quickly try different schema/vschema. If you’re already running a cluster, you can also use the VEXPLAIN QUERIES|ALL|PLAN command from a SQL console.

Prerequisites

You can find a prebuilt binary version of the VTExplain tool in the most recent release of Vitess.

You can also build the vtexplain binary in your environment. To build this binary, refer to the build guide for your OS.

Overview

To successfully analyze your SQL queries and determine how Vitess executes each statement, follow these steps:

  1. Identify a SQL schema for the statement’s source tables
  2. Identify a VSchema for the statement’s source tables
  3. Run the VTExplain tool

If you have a large number of queries you want to analyze for issues, based on a Vschema you’ve created for your database, you can read through a detailed scripted example here.

1. Identify a SQL schema for tables in the statement

In order to explain a statement, first identify the SQL schema for the tables that the statement uses. This includes tables that a query targets and tables that a DML statement modifies.

Example SQL Schema

The following example SQL schema creates two tables, users and users_name_idx, each of which contain the columns user_id and name, and define both columns as a composite primary key. The example statements in step 3 include these tables.

  1. CREATE TABLE users(
  2. user_id bigint,
  3. name varchar(128),
  4. primary key(user_id)
  5. );
  6. CREATE TABLE users_name_idx(
  7. user_id bigint,
  8. name varchar(128),
  9. primary key(name, user_id)
  10. );

2. Identify a VSchema for the statement’s source tables

Next, identify a VSchema that contains the Vindexes for the tables in the statement.

The VSchema must use a keyspace name.

VTExplain requires a keyspace name for each keyspace in an input VSchema:

  1. "keyspace_name": {
  2. "_comment": "Keyspace definition goes here."
  3. }

If no keyspace name is present, VTExplain will return the following error:

  1. ERROR: initVtgateExecutor: json: cannot unmarshal bool into Go value of type map[string]json.RawMessage

Example VSchema

The following example VSchema defines a single keyspace mainkeyspace and three Vindexes, and specifies vindexes for each column in the two tables users and users_name_idx. The keyspace name "mainkeyspace" precedes the keyspace definition object.

  1. {
  2. "mainkeyspace": {
  3. "sharded": true,
  4. "vindexes": {
  5. "hash": {
  6. "type": "hash"
  7. },
  8. "md5": {
  9. "type": "unicode_loose_md5",
  10. "params": {},
  11. "owner": ""
  12. },
  13. "users_name_idx": {
  14. "type": "lookup_hash",
  15. "params": {
  16. "from": "name",
  17. "table": "users_name_idx",
  18. "to": "user_id"
  19. },
  20. "owner": "users"
  21. }
  22. },
  23. "tables": {
  24. "users": {
  25. "column_vindexes": [
  26. {
  27. "column": "user_id",
  28. "name": "hash"
  29. },
  30. {
  31. "column": "name",
  32. "name": "users_name_idx"
  33. }
  34. ],
  35. "auto_increment": null
  36. },
  37. "users_name_idx": {
  38. "type": "",
  39. "column_vindexes": [
  40. {
  41. "column": "name",
  42. "name": "md5"
  43. }
  44. ],
  45. "auto_increment": null
  46. }
  47. }
  48. }
  49. }

3. Run the VTExplain tool

To explain a query, pass the SQL schema and VSchema files as arguments to the VTExplain command.

Example: Explaining a SELECT query

In the following example, the VTExplain command takes a SELECT query and returns the sequence of queries that Vitess runs in order to execute the query:

  1. vtexplain --shards 8 --vschema-file vschema.json --schema-file schema.sql --replication-mode "ROW" --output-mode text --sql "SELECT * from users"
  2. ----------------------------------------------------------------------
  3. SELECT * from users
  4. 1 mainkeyspace/-20: select * from users limit 10001
  5. 1 mainkeyspace/20-40: select * from users limit 10001
  6. 1 mainkeyspace/40-60: select * from users limit 10001
  7. 1 mainkeyspace/60-80: select * from users limit 10001
  8. 1 mainkeyspace/80-a0: select * from users limit 10001
  9. 1 mainkeyspace/a0-c0: select * from users limit 10001
  10. 1 mainkeyspace/c0-e0: select * from users limit 10001
  11. 1 mainkeyspace/e0-: select * from users limit 10001
  12. ----------------------------------------------------------------------
  13. <nil>

In the example above, the output of VTExplain shows the sequence of queries that Vitess runs in order to execute the query. Each line shows the logical sequence of the query, the keyspace where the query executes, the shard where the query executes, and the query that executes, in the following format:

  1. [Sequence number] [keyspace]/[shard]: [query]

In this example, each query has sequence number 1, which shows that Vitess executes these in parallel. Vitess automatically adds the LIMIT 10001 clause to protect against large results.

Example: Explaining an INSERT query

In the following example, the VTExplain command takes an INSERT query and returns the sequence of queries that Vitess runs in order to execute the query:

  1. vtexplain --shards 128 --vschema-file vschema.json --schema-file schema.sql --replication-mode "ROW" --output-mode text --sql "INSERT INTO users (user_id, name) VALUES(1, 'john')"
  2. ----------------------------------------------------------------------
  3. INSERT INTO users (user_id, name) VALUES(1, 'john')
  4. 1 mainkeyspace/22-24: begin
  5. 1 mainkeyspace/22-24: insert into users_name_idx(`name`, user_id) values ('john', 1)
  6. 2 mainkeyspace/16-18: begin
  7. 2 mainkeyspace/16-18: insert into users(user_id, `name`) values (1, 'john')
  8. 3 mainkeyspace/22-24: commit
  9. 4 mainkeyspace/16-18: commit
  10. ----------------------------------------------------------------------
  11. <nil>

The example above shows how Vitess handles an insert into a table with a secondary lookup Vindex:

  • At sequence number 1, Vitess opens a transaction on shard 22-24 to insert the row into the users_name_idx table.
  • At sequence number 2, Vitess opens a second transaction on shard 16-18 to perform the actual insert into the users table.
  • At sequence number 3, the first transaction commits.
  • At sequence number 4, the second transaction commits.

Example: Explaining an uneven keyspace

In previous examples, we used the --shards flag to set up an evenly-sharded keyspace, where each shard covers the same fraction of the keyrange. VTExplain also supports receiving a JSON mapping of shard ranges to see how Vitess would handle a query against an arbitrarly-sharded keyspace.

To do this, we first create a JSON file containing a mapping of keyspace names to shardrange maps. The shardrange map has the same structure as the output of running vtctl FindAllShardsInKeyspace <keyspace>.

  1. {
  2. "mainkeyspace": {
  3. "-80": {
  4. "primary_alias": {
  5. "cell": "test",
  6. "uid": 100
  7. },
  8. "primary_term_start_time": {
  9. "seconds": 1599828375,
  10. "nanoseconds": 664404881
  11. },
  12. "key_range": {
  13. "end": "gA=="
  14. },
  15. "is_primary_serving": true
  16. },
  17. "80-90": {
  18. "primary_alias": {
  19. "cell": "test",
  20. "uid": 200
  21. },
  22. "primary_term_start_time": {
  23. "seconds": 1599828344,
  24. "nanoseconds": 868327074
  25. },
  26. "key_range": {
  27. "start": "gA==",
  28. "end": "kA=="
  29. },
  30. "is_primary_serving": true
  31. },
  32. "90-a0": {
  33. "primary_alias": {
  34. "cell": "test",
  35. "uid": 300
  36. },
  37. "primary_term_start_time": {
  38. "seconds": 1599828405,
  39. "nanoseconds": 152120945
  40. },
  41. "key_range": {
  42. "start": "kA==",
  43. "end": "oA=="
  44. },
  45. "is_primary_serving": true
  46. },
  47. "a0-e8": {
  48. "primary_alias": {
  49. "cell": "test",
  50. "uid": 400
  51. },
  52. "primary_term_start_time": {
  53. "seconds": 1599828183,
  54. "nanoseconds": 911677983
  55. },
  56. "key_range": {
  57. "start": "oA==",
  58. "end": "6A=="
  59. },
  60. "is_primary_serving": true
  61. },
  62. "e8-": {
  63. "primary_alias": {
  64. "cell": "test",
  65. "uid": 500
  66. },
  67. "primary_term_start_time": {
  68. "seconds": 1599827865,
  69. "nanoseconds": 770606551
  70. },
  71. "key_range": {
  72. "start": "6A=="
  73. },
  74. "is_primary_serving": true
  75. }
  76. }
  77. }

After having saved that to a file called shardmaps.json:

  1. vtexplain --vschema-file vschema.json --schema-file schema.sql --ks-shard-map "$(cat shardmaps.json)" --replication-mode "ROW" --output-mode text --sql "SELECT * FROM users; SELECT * FROM users WHERE id IN (10, 17, 42, 1000);"
  2. ----------------------------------------------------------------------
  3. SELECT * FROM users
  4. 1 mainkeyspace/-80: select * from users limit 10001
  5. 1 mainkeyspace/80-90: select * from users limit 10001
  6. 1 mainkeyspace/90-a0: select * from users limit 10001
  7. 1 mainkeyspace/a0-e8: select * from users limit 10001
  8. 1 mainkeyspace/e8-: select * from users limit 10001
  9. ----------------------------------------------------------------------
  10. SELECT * FROM users WHERE id IN (10, 17, 42, 1000)
  11. 1 mainkeyspace/-80: select * from users where id in (10, 17, 42, 1000) limit 10001
  12. 1 mainkeyspace/80-90: select * from users where id in (10, 17, 42, 1000) limit 10001
  13. 1 mainkeyspace/90-a0: select * from users where id in (10, 17, 42, 1000) limit 10001
  14. 1 mainkeyspace/a0-e8: select * from users where id in (10, 17, 42, 1000) limit 10001
  15. 1 mainkeyspace/e8-: select * from users where id in (10, 17, 42, 1000) limit 10001
  16. ----------------------------------------------------------------------
  17. <nil>

See also