MaterializeMySQL

Creates ClickHouse database with all the tables existing in MySQL, and all the data in those tables.

ClickHouse server works as MySQL replica. It reads binlog and performs DDL and DML queries.

This feature is experimental.

Creating a Database

  1. CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
  2. ENGINE = MaterializeMySQL('host:port', ['database' | database], 'user', 'password') [SETTINGS ...]

Engine Parameters

  • host:port — MySQL server endpoint.
  • database — MySQL database name.
  • user — MySQL user.
  • password — User password.

Virtual columns

When working with the MaterializeMySQL database engine, ReplacingMergeTree tables are used with virtual _sign and _version columns.

  • _version — Transaction counter. Type UInt64.
  • _sign — Deletion mark. Type Int8. Possible values:
    • 1 — Row is not deleted,
    • -1 — Row is deleted.

Data Types Support

MySQLClickHouse
TINYInt8
SHORTInt16
INT24Int32
LONGUInt32
LONGLONGUInt64
FLOATFloat32
DOUBLEFloat64
DECIMAL, NEWDECIMALDecimal
DATE, NEWDATEDate
DATETIME, TIMESTAMPDateTime
DATETIME2, TIMESTAMP2DateTime64
STRINGString
VARCHAR, VAR_STRINGString
BLOBString

Other types are not supported. If MySQL table contains a column of such type, ClickHouse throws exception “Unhandled data type” and stops replication.

Nullable is supported.

Specifics and Recommendations

DDL Queries

MySQL DDL queries are converted into the corresponding ClickHouse DDL queries (ALTER, CREATE, DROP, RENAME). If ClickHouse cannot parse some DDL query, the query is ignored.

Data Replication

MaterializeMySQL does not support direct INSERT, DELETE and UPDATE queries. However, they are supported in terms of data replication:

  • MySQL INSERT query is converted into INSERT with _sign=1.

  • MySQl DELETE query is converted into INSERT with _sign=-1.

  • MySQL UPDATE query is converted into INSERT with _sign=-1 and INSERT with _sign=1.

Selecting from MaterializeMySQL Tables

SELECT query from MaterializeMySQL tables has some specifics:

  • If _version is not specified in the SELECT query, FINAL modifier is used. So only rows with MAX(_version) are selected.

  • If _sign is not specified in the SELECT query, WHERE _sign=1 is used by default. So the deleted rows are not included into the result set.

Index Conversion

MySQL PRIMARY KEY and INDEX clauses are converted into ORDER BY tuples in ClickHouse tables.

ClickHouse has only one physical order, which is determined by ORDER BY clause. To create a new physical order, use materialized views.

Notes

  • Rows with _sign=-1 are not deleted physically from the tables.
  • Cascade UPDATE/DELETE queries are not supported by the MaterializeMySQL engine.
  • Replication can be easily broken.
  • Manual operations on database and tables are forbidden.
  • MaterializeMySQL is influenced by optimize_on_insert setting. The data is merged in the corresponding table in the MaterializeMySQL database when a table in the MySQL server changes.

Examples of Use

Queries in MySQL:

  1. mysql> CREATE DATABASE db;
  2. mysql> CREATE TABLE db.test (a INT PRIMARY KEY, b INT);
  3. mysql> INSERT INTO db.test VALUES (1, 11), (2, 22);
  4. mysql> DELETE FROM db.test WHERE a=1;
  5. mysql> ALTER TABLE db.test ADD COLUMN c VARCHAR(16);
  6. mysql> UPDATE db.test SET c='Wow!', b=222;
  7. mysql> SELECT * FROM test;
  1. +---+------+------+
  2. | a | b | c |
  3. +---+------+------+
  4. | 2 | 222 | Wow! |
  5. +---+------+------+

Database in ClickHouse, exchanging data with the MySQL server:

The database and the table created:

  1. CREATE DATABASE mysql ENGINE = MaterializeMySQL('localhost:3306', 'db', 'user', '***');
  2. SHOW TABLES FROM mysql;
  1. ┌─name─┐
  2. test
  3. └──────┘

After inserting data:

  1. SELECT * FROM mysql.test;
  1. ┌─a─┬──b─┐
  2. 1 11
  3. 2 22
  4. └───┴────┘

After deleting data, adding the column and updating:

  1. SELECT * FROM mysql.test;
  1. ┌─a─┬───b─┬─c────┐
  2. 2 222 Wow!
  3. └───┴─────┴──────┘

Original article