在多表连接的场景中,优化器的一个很重要的任务是决定各个表之间的连接顺序,因为不同的连接顺序会影响中间结果集的大小,进而影响到计划整体的执行代价。为了减少执行计划的搜索空间和计划执行态的内存占用,OceanBase 数据库的优化器在生成连接顺序时主要考虑左深树的连接形式。

    下图展示了左深树, 右深树和多支树的计划形状。

    树.jpg

    OceanBase 数据库连接顺序的生成采用了 System-R 的动态规划算法,考虑到的因素包括每一个表可能的访问路径、interesting order、可能的连接算法(nested-loop,block-based nested-loop, sort-merge 等)以及不同表之间的连接选择率等等。

    给定 N 个表的连接,OceanBase 数据库生成连接顺序的方法如下:

    1. 为每一个基表生成访问路径,保留代价最小的访问路径以及有所有有 interesting order 的路径。一个路径 如果具有interesting order,它的序能够被后续的算子使用。
    2. 生成所有表集合的大小为 i (1 < i <= N) 的计划。 OceanBase 数据库一般只考虑左深树,表集合大小为 i 的计划可以由一个表集合大小为 i 的计划和一个基表的计划组成。OceanBase 数据库按照这种策略,考虑了所有的连接算法,interesting order 的继承等因素把所有表集合大小为 i 的计划生成。这里也只是保留代价最小的计划以及所有具有interesting order的计划。

    同时,OceanBase 数据库提供了hint 机制 /*+ leading(table_name_list)*/ 去显示控制多表连接的顺序,比如下面开始选择的连接顺序是先做 t1、t2 的 join 连接,然后再和 t3 做 join 连接;如果用户希望先做 t2、t3 的 join 连接,然后再和 t1做 join 连接,则可以使用 /*+leading(t2,t3,t1)*/hint 去控制,而用户希望先做 t1、t3 的 join 连接,然后再和 t2做 join 连接,则可以使用 /*+leading(t1,t3,t2)*/hint 去控制。

    1. OceanBase(TEST@TEST)>create table t1(c1 int, c2 int, primary key(c1));
    2. Query OK, 0 rows affected (0.31 sec)
    3. OceanBase(TEST@TEST)>create table t2(c1 int, c2 int, primary key(c1));
    4. Query OK, 0 rows affected (0.33 sec)
    5. OceanBase(TEST@TEST)>create table t3(c1 int, c2 int, primary key(c1));
    6. Query OK, 0 rows affected (0.44 sec)
    7. OceanBase(TEST@TEST)>explain select * from t1,t2,t3 where t1.c1 = t2.c2 and t2.c1 = t3.c2;
    8. | =======================================
    9. |ID|OPERATOR |NAME|EST. ROWS|COST |
    10. ---------------------------------------
    11. |0 |HASH JOIN | |98010 |926122|
    12. |1 | TABLE SCAN |T3 |100000 |61860 |
    13. |2 | HASH JOIN | |99000 |494503|
    14. |3 | TABLE SCAN|T1 |100000 |61860 |
    15. |4 | TABLE SCAN|T2 |100000 |61860 |
    16. =======================================
    17. Outputs & filters:
    18. -------------------------------------
    19. 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil),
    20. equal_conds([T2.C1 = T3.C2]), other_conds(nil)
    21. 1 - output([T3.C2], [T3.C1]), filter(nil),
    22. access([T3.C2], [T3.C1]), partitions(p0)
    23. 2 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil),
    24. equal_conds([T1.C1 = T2.C2]), other_conds(nil)
    25. 3 - output([T1.C1], [T1.C2]), filter(nil),
    26. access([T1.C1], [T1.C2]), partitions(p0)
    27. 4 - output([T2.C2], [T2.C1]), filter(nil),
    28. access([T2.C2], [T2.C1]), partitions(p0)
    29. OceanBase(TEST@TEST)>explain select /*+leading(t2,t3,t1)*/* from t1,t2,t3 where t1.c1 = t2.c2 and t2.c1 = t3.c2;
    30. | ========================================
    31. |ID|OPERATOR |NAME|EST. ROWS|COST |
    32. ----------------------------------------
    33. |0 |HASH JOIN | |98010 |1096613|
    34. |1 | HASH JOIN | |99000 |494503 |
    35. |2 | TABLE SCAN|T2 |100000 |61860 |
    36. |3 | TABLE SCAN|T3 |100000 |61860 |
    37. |4 | TABLE SCAN |T1 |100000 |61860 |
    38. ========================================
    39. Outputs & filters:
    40. -------------------------------------
    41. 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil),
    42. equal_conds([T1.C1 = T2.C2]), other_conds(nil)
    43. 1 - output([T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil),
    44. equal_conds([T2.C1 = T3.C2]), other_conds(nil)
    45. 2 - output([T2.C2], [T2.C1]), filter(nil),
    46. access([T2.C2], [T2.C1]), partitions(p0)
    47. 3 - output([T3.C2], [T3.C1]), filter(nil),
    48. access([T3.C2], [T3.C1]), partitions(p0)
    49. 4 - output([T1.C1], [T1.C2]), filter(nil),
    50. access([T1.C1], [T1.C2]), partitions(p0)
    51. explain select /*+leading(t1,t3,t2)*/* from t1,t2,t3 where t1.c1 = t2.c2 and t2.c1 = t3.c2;
    52. | =============================================================
    53. |ID|OPERATOR |NAME|EST. ROWS |COST |
    54. -------------------------------------------------------------
    55. |0 |HASH JOIN | |98010 |53098071243|
    56. |1 | NESTED-LOOP JOIN CARTESIAN| |10000000000|7964490204 |
    57. |2 | TABLE SCAN |T1 |100000 |61860 |
    58. |3 | MATERIAL | |100000 |236426 |
    59. |4 | TABLE SCAN |T3 |100000 |61860 |
    60. |5 | TABLE SCAN |T2 |100000 |61860 |
    61. =============================================================
    62. Outputs & filters:
    63. -------------------------------------
    64. 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil),
    65. equal_conds([T1.C1 = T2.C2], [T2.C1 = T3.C2]), other_conds(nil)
    66. 1 - output([T1.C1], [T1.C2], [T3.C1], [T3.C2]), filter(nil),
    67. conds(nil), nl_params_(nil)
    68. 2 - output([T1.C1], [T1.C2]), filter(nil),
    69. access([T1.C1], [T1.C2]), partitions(p0)
    70. 3 - output([T3.C1], [T3.C2]), filter(nil)
    71. 4 - output([T3.C2], [T3.C1]), filter(nil),
    72. access([T3.C2], [T3.C1]), partitions(p0)
    73. 5 - output([T2.C2], [T2.C1]), filter(nil),
    74. access([T2.C2], [T2.C1]), partitions(p0)