概述

MySQL从5.7.8起开始支持JSON字段,这极大的丰富了MySQL的数据类型。也方便了广大开发人员。但MySQL并没有提供对JSON对象中的字段进行索引的功能,至少没有直接对其字段进行索引的方法。本文将介绍利用MySQL 5.7中的虚拟字段的功能来对JSON对象中的字段进行索引。

示例数据

我们将基于下面的JSON对象进行演示

  1. {
  2. "id": 1,
  3. "name": "Sally",
  4. "games_played":{
  5. "Battlefield": {
  6. "weapon": "sniper rifle",
  7. "rank": "Sergeant V",
  8. "level": 20
  9. },
  10. "Crazy Tennis": {
  11. "won": 4,
  12. "lost": 1
  13. },
  14. "Puzzler": {
  15. "time": 7
  16. }
  17. }
  18. }

表的基本结构

  1. CREATE TABLE `players` (
  2. `id` INT UNSIGNED NOT NULL,
  3. `player_and_games` JSON NOT NULL,
  4. PRIMARY KEY (`id`)
  5. );

如果只是基于上面的表的结构我们是无法对JSON字段中的Key进行索引的。接下来我们演示如何借助虚拟字段对其进行索引

增加虚拟字段

虚拟列语法如下

  1. <type> [ GENERATED ALWAYS ] AS ( <expression> ) [ VIRTUAL|STORED ]
  2. [ UNIQUE [KEY] ] [ [PRIMARY] KEY ] [ NOT NULL ] [ COMMENT <text> ]

在MySQL 5.7中,支持两种Generated Column,即Virtual Generated Column和Stored Generated Column,前者只将Generated Column保存在数据字典中(表的元数据),并不会将这一列数据持久化到磁盘上;后者会将Generated Column持久化到磁盘上,而不是每次读取的时候计算所得。很明显,后者存放了可以通过已有数据计算而得的数据,需要更多的磁盘空间,与Virtual Column相比并没有优势,因此,MySQL 5.7中,不指定Generated Column的类型,默认是Virtual Column。

如果需要Stored Generated Golumn的话,可能在Virtual Generated Column上建立索引更加合适,一般情况下,都使用Virtual Generated Column,这也是MySQL默认的方式

加完虚拟列的建表语句如下:

  1. CREATE TABLE `players` (
  2. `id` INT UNSIGNED NOT NULL,
  3. `player_and_games` JSON NOT NULL,
  4. `names_virtual` VARCHAR(20) GENERATED ALWAYS AS (`player_and_games` ->> '$.name') NOT NULL,
  5. PRIMARY KEY (`id`)
  6. );

Note: 利用操作符-» 来引用JSON字段中的KEY。在本例中字段names_virtual为虚拟字段,我把它定义成不可以为空。在实际的工作中,一定要集合具体的情况来定。因为JSON本身是一种弱结构的数据对象。也就是说的它的结构不是固定不变的。

我们插入数据

  1. INSERT INTO `players` (`id`, `player_and_games`) VALUES (1, '{
  2. "id": 1,
  3. "name": "Sally",
  4. "games_played":{
  5. "Battlefield": {
  6. "weapon": "sniper rifle",
  7. "rank": "Sergeant V",
  8. "level": 20
  9. },
  10. "Crazy Tennis": {
  11. "won": 4,
  12. "lost": 1
  13. },
  14. "Puzzler": {
  15. "time": 7
  16. }
  17. }
  18. }'
  19. );
  20. ...

查看表里的数据

  1. SELECT * FROM `players`;
  2. +----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
  3. | id | player_and_games | names_virtual |
  4. +----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
  5. | 1 | {"id": 1, "name": "Sally", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Sergeant V", "level": 20, "weapon": "sniper rifle"}, "Crazy Tennis": {"won": 4, "lost": 1}}} | Sally |
  6. | 2 | {"id": 2, "name": "Thom", "games_played": {"Puzzler": {"time": 25}, "Battlefield": {"rank": "Major General VIII", "level": 127, "weapon": "carbine"}, "Crazy Tennis": {"won": 10, "lost": 30}}} | Thom |
  7. | 3 | {"id": 3, "name": "Ali", "games_played": {"Puzzler": {"time": 12}, "Battlefield": {"rank": "First Sergeant II", "level": 37, "weapon": "machine gun"}, "Crazy Tennis": {"won": 30, "lost": 21}}} | Ali |
  8. | 4 | {"id": 4, "name": "Alfred", "games_played": {"Puzzler": {"time": 10}, "Battlefield": {"rank": "Chief Warrant Officer Five III", "level": 73, "weapon": "pistol"}, "Crazy Tennis": {"won": 47, "lost": 2}}} | Alfred |
  9. | 5 | {"id": 5, "name": "Phil", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Lt. Colonel III", "level": 98, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 130, "lost": 75}}} | Phil |
  10. | 6 | {"id": 6, "name": "Henry", "games_played": {"Puzzler": {"time": 17}, "Battlefield": {"rank": "Captain II", "level": 87, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 68, "lost": 149}}} | Henry |
  11. +----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+

查看表Players的字段

  1. SHOW COLUMNS FROM `players`;
  2. +------------------+------------------+------+-----+---------+-------------------+
  3. | Field | Type | Null | Key | Default | Extra |
  4. +------------------+------------------+------+-----+---------+-------------------+
  5. | id | int(10) unsigned | NO | PRI | NULL | |
  6. | player_and_games | json | NO | | NULL | |
  7. | names_virtual | varchar(20) | NO | | NULL | VIRTUAL GENERATED |
  8. +------------------+------------------+------+-----+---------+-------------------+

我们看到虚拟字段names_virtual的类型是VIRTUAL GENERATED。MySQL只是在数据字典里保存该字段元数据,并没有真正的存储该字段的值。这样表的大小并没有增加。我们可以利用索引把这个字段上的值进行物理存储。

在虚拟字段上加索引

再添加索引之前,让我们先看下面查询的执行计划

  1. EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G
  2. *************************** 1. row ***************************
  3. id: 1
  4. select_type: SIMPLE
  5. table: players
  6. partitions: NULL
  7. type: ALL
  8. possible_keys: NULL
  9. key: NULL
  10. key_len: NULL
  11. ref: NULL
  12. rows: 6
  13. filtered: 16.67
  14. Extra: Using where

添加索引

  1. CREATE INDEX `names_idx` ON `players`(`names_virtual`);

再执行上面的查询语句,我们将得到不一样的执行计划

  1. EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G
  2. *************************** 1. row ***************************
  3. id: 1
  4. select_type: SIMPLE
  5. table: players
  6. partitions: NULL
  7. type: ref
  8. possible_keys: names_idx
  9. key: names_idx
  10. key_len: 22
  11. ref: const
  12. rows: 1
  13. filtered: 100.00
  14. Extra: NULL

如我们所见,最新的执行计划走了新建的索引。

小结

本文介绍了如何在MySQL 5.7中保存JSON文档。为了高效的检索JSON中内容,我们可以利用5.7的虚拟字段来对JSON的不同的KEY来建索引。极大的提高检索的速度。