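Full-text search

Full-text search uses full-text indexes to run prefix, wildcard, regular expression, and fuzzy searches on properties whose values are strings.

In a LOOKUP statement, use the WHERE clause to specify the search condition for the string.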

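Prerequisites

Make sure that full-text indexing has been deployed. For details, see Deploy full-text index and Deploy listener.

Precautions

Before using full-text indexes, make sure that you understand the restrictions on full-text indexes.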

Natural language full-text search

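A natural language search interprets the search string as a phrase in natural human language. The search is case-insensitive and, by default, evaluates each space-separated substring of the string on its own. For example, suppose three vertices have the tag player, the tag player has the property name, and the names of the three vertices are Kevin Durant, Tim Duncan, and David Beckham. Once a full-text index on player.name has been created, the prefix search statement LOOKUP ON player WHERE PREFIX(player.name,"d"); returns all three vertices, because each name contains a word that starts with "d" or "D".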
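The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the described behavior (case-insensitive, one check per space-separated token), not the code that the search engine actually runs. The function name prefix_match is hypothetical.

```python
def prefix_match(value: str, prefix: str) -> bool:
    """Return True if any space-separated token of value starts with prefix, ignoring case."""
    needle = prefix.lower()
    return any(token.lower().startswith(needle) for token in value.split())


# The three names from the example in the text.
names = ["Kevin Durant", "Tim Duncan", "David Beckham"]

# "d" matches Durant, Duncan, and David, so every name is kept.
matches = [name for name in names if prefix_match(name, "d")]
print(matches)
```

A prefix such as "k" would keep only Kevin Durant, because no token in the other two names starts with that letter.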

Syntax

Create full-text indexes

  CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} ([<prop_name_list>]);

Show full-text indexes

  SHOW FULLTEXT INDEXES;

Rebuild full-text indexes

  REBUILD FULLTEXT INDEX;

Drop full-text indexes

  DROP FULLTEXT INDEX <index_name>;

Use query options

  LOOKUP ON {<tag> | <edge_type>} WHERE <expression> [YIELD <return_list>];

  <expression> ::=
      PREFIX | WILDCARD | REGEXP | FUZZY

  <return_list>
      <prop_name> [AS <prop_alias>] [, <prop_name> [AS <prop_alias>] ...]

  • PREFIX(schema_name.prop_name, prefix_string, row_limit, timeout)

  • WILDCARD(schema_name.prop_name, wildcard_string, row_limit, timeout)

  • REGEXP(schema_name.prop_name, regexp_string, row_limit, timeout)

  • FUZZY(schema_name.prop_name, fuzzy_string, fuzziness, operator, row_limit, timeout)

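    • fuzziness: Optional. The maximum edit distance allowed for a match. The default value is AUTO. For other valid values and more information, see the official Elasticsearch documentation.

    • operator: Optional. The Boolean logic used to interpret the text. Valid values are OR (default) and AND.

  • row_limit: Optional. The number of rows to return. The default value is 100.

  • timeout: Optional. The timeout period, in milliseconds (ms). The default value is 200.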
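To show how the optional arguments line up, here is a minimal Python sketch that assembles a PREFIX query string from its parts. The helper build_prefix_lookup is hypothetical and not part of any Nebula Graph client. The sketch assumes that the arguments are positional in the order shown in the signature, so that a timeout can only be given after a row_limit.

```python
from typing import Optional

DEFAULT_ROW_LIMIT = 100  # default stated in the parameter list above


def build_prefix_lookup(schema: str, prop: str, prefix: str,
                        row_limit: Optional[int] = None,
                        timeout_ms: Optional[int] = None) -> str:
    """Assemble a LOOKUP ... PREFIX(...) statement as text (hypothetical helper)."""
    if '"' in prefix or "\\" in prefix:
        raise ValueError("this sketch does not escape quotes or backslashes")
    # Assumption: arguments are positional, so row_limit must be present
    # whenever timeout is; fall back to the documented default.
    if timeout_ms is not None and row_limit is None:
        row_limit = DEFAULT_ROW_LIMIT
    args = [f"{schema}.{prop}", f'"{prefix}"']
    if row_limit is not None:
        args.append(str(row_limit))
    if timeout_ms is not None:
        args.append(str(timeout_ms))
    return f"LOOKUP ON {schema} WHERE PREFIX({', '.join(args)});"


print(build_prefix_lookup("player", "name", "B"))
print(build_prefix_lookup("player", "name", "B", row_limit=10, timeout_ms=500))
```

The first call leaves both optional arguments out and relies on the server-side defaults. The second call asks for at most 10 rows with a 500 ms timeout.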

Examples

  // Create a graph space.
  nebula> CREATE SPACE basketballplayer (partition_num=3,replica_factor=1, vid_type=fixed_string(30));
  // Sign in to the text search client.
  nebula> SIGN IN TEXT SERVICE (127.0.0.1:9200);
  // Switch to the graph space.
  nebula> USE basketballplayer;
  // Add the listener to the Nebula Graph cluster.
  nebula> ADD LISTENER ELASTICSEARCH 192.168.8.5:9789;
  // Create a tag.
  nebula> CREATE TAG player(name string, age int);
  // Create a native index.
  nebula> CREATE TAG INDEX name ON player(name(20));
  // Rebuild the native index.
  nebula> REBUILD TAG INDEX;
  // Create a full-text index. The index name must start with nebula.
  nebula> CREATE FULLTEXT TAG INDEX nebula_index_1 ON player(name);
  // Rebuild the full-text index.
  nebula> REBUILD FULLTEXT INDEX;
  // Show the full-text indexes.
  nebula> SHOW FULLTEXT INDEXES;
  +------------------+-------------+-------------+--------+
  | Name             | Schema Type | Schema Name | Fields |
  +------------------+-------------+-------------+--------+
  | "nebula_index_1" | "Tag"       | "player"    | "name" |
  +------------------+-------------+-------------+--------+
  // Insert test data.
  nebula> INSERT VERTEX player(name, age) VALUES \
    "Russell Westbrook": ("Russell Westbrook", 30), \
    "Chris Paul": ("Chris Paul", 33), \
    "Boris Diaw": ("Boris Diaw", 36), \
    "David West": ("David West", 38), \
    "Danny Green": ("Danny Green", 31), \
    "Tim Duncan": ("Tim Duncan", 42), \
    "James Harden": ("James Harden", 29), \
    "Tony Parker": ("Tony Parker", 36), \
    "Aron Baynes": ("Aron Baynes", 32), \
    "Ben Simmons": ("Ben Simmons", 22), \
    "Blake Griffin": ("Blake Griffin", 30);
  // Run test queries.
  nebula> LOOKUP ON player WHERE PREFIX(player.name, "B");
  +-----------------+
  | _vid            |
  +-----------------+
  | "Boris Diaw"    |
  | "Ben Simmons"   |
  | "Blake Griffin" |
  +-----------------+
  nebula> LOOKUP ON player WHERE WILDCARD(player.name, "*ri*") YIELD player.name, player.age;
  +-----------------+-----------------+-----+
  | _vid            | name            | age |
  +-----------------+-----------------+-----+
  | "Chris Paul"    | "Chris Paul"    | 33  |
  | "Boris Diaw"    | "Boris Diaw"    | 36  |
  | "Blake Griffin" | "Blake Griffin" | 30  |
  +-----------------+-----------------+-----+
  nebula> LOOKUP ON player WHERE WILDCARD(player.name, "*ri*") | YIELD count(*);
  +----------+
  | count(*) |
  +----------+
  | 3        |
  +----------+
  nebula> LOOKUP ON player WHERE REGEXP(player.name, "R.*") YIELD player.name, player.age;
  +---------------------+---------------------+-----+
  | _vid                | name                | age |
  +---------------------+---------------------+-----+
  | "Russell Westbrook" | "Russell Westbrook" | 30  |
  +---------------------+---------------------+-----+
  nebula> LOOKUP ON player WHERE REGEXP(player.name, ".*");
  +---------------------+
  | _vid                |
  +---------------------+
  | "Danny Green"       |
  | "David West"        |
  | "Russell Westbrook" |
  +---------------------+
  ...
  nebula> LOOKUP ON player WHERE FUZZY(player.name, "Tim Dunncan", AUTO, OR) YIELD player.name;
  +--------------+--------------+
  | _vid         | name         |
  +--------------+--------------+
  | "Tim Duncan" | "Tim Duncan" |
  +--------------+--------------+
  // Drop the full-text index.
  nebula> DROP FULLTEXT INDEX nebula_index_1;
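The FUZZY query in the example finds "Tim Duncan" even though the search string is misspelled as "Tim Dunncan". The Python sketch below illustrates why, under two assumptions: that FUZZY behaves like an Elasticsearch fuzzy match on each space-separated token, and that AUTO allows 0 edits for terms of 1 to 2 characters, 1 edit for 3 to 5 characters, and 2 edits for longer terms (the Elasticsearch default). The sketch uses plain Levenshtein distance, whereas Elasticsearch also counts a swap of two adjacent characters as a single edit. All function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Edits allowed by AUTO, assuming the Elasticsearch default thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(value: str, query: str, operator: str = "OR") -> bool:
    """Check each query token against the tokens of value, ignoring case."""
    value_tokens = [t.lower() for t in value.split()]
    hits = [
        any(levenshtein(q.lower(), t) <= auto_fuzziness(q) for t in value_tokens)
        for q in query.split()
    ]
    # OR needs one matching query token; AND needs all of them.
    return any(hits) if operator.upper() == "OR" else all(hits)


print(levenshtein("dunncan", "duncan"))
print(fuzzy_match("Tim Duncan", "Tim Dunncan", "OR"))
```

Here "Dunncan" differs from "Duncan" by a single deleted letter, which is within the two edits that AUTO allows for a seven-character term.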

Last updated: October 27, 2021