Fulltext indexes

This is an introduction to ArangoDB’s fulltext indexes.

Introduction to Fulltext Indexes

A fulltext index can be used to find words, or prefixes of words, inside documents.

A fulltext index can be defined on one attribute only, and will include all words contained in documents that have a textual value in the index attribute. Since ArangoDB 2.6, the index will also include words from the index attribute if the index attribute is an array of strings, or an object with string value members.

For example, given a fulltext index on the translations attribute and the following documents, searching for лиса using the fulltext index would return only the first document. Searching the index for the exact string Fox would return the first two documents, and searching for prefix:Fox would return all three documents:

  1. { translations: { en: "fox", de: "Fuchs", fr: "renard", ru: "лиса" } }
  2. { translations: "Fox is the English translation of the German word Fuchs" }
  3. { translations: [ "ArangoDB", "document", "database", "Foxx" ] }
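The matching behavior described above can be modeled in a few lines of plain JavaScript. This is only a rough sketch and not ArangoDB's index code: the helpers wordsOf and search are hypothetical names, the word splitting (on anything that is not a letter or digit) is an assumption, and the case-insensitive comparison is inferred from the example, where Fox matches fox.

```javascript
// Illustrative model only: this is not ArangoDB's tokenizer or index code.
const docs = [
  { translations: { en: "fox", de: "Fuchs", fr: "renard", ru: "лиса" } },
  { translations: "Fox is the English translation of the German word Fuchs" },
  { translations: ["ArangoDB", "document", "database", "Foxx"] },
];

// Split an attribute value into lowercase words. Every member in this
// example is a string, so Object.values() is enough for arrays and objects.
function wordsOf(value) {
  const texts = typeof value === "string" ? [value] : Object.values(value);
  return texts
    .flatMap((text) => text.toLowerCase().split(/[^\p{L}\p{N}]+/u))
    .filter((word) => word.length > 0);
}

// Return the 1-based positions of documents matching "word" or "prefix:word".
function search(query) {
  const isPrefix = query.startsWith("prefix:");
  const term = (isPrefix ? query.slice("prefix:".length) : query).toLowerCase();
  const hits = [];
  docs.forEach((doc, i) => {
    const words = wordsOf(doc.translations);
    const matched = isPrefix
      ? words.some((word) => word.startsWith(term))
      : words.includes(term);
    if (matched) hits.push(i + 1);
  });
  return hits;
}

console.log(search("лиса"));       // [ 1 ]
console.log(search("Fox"));        // [ 1, 2 ]
console.log(search("prefix:Fox")); // [ 1, 2, 3 ]
```

The third document matches only the prefix query because Foxx starts with fox but is not the same word.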

Note that deeper nested objects are ignored. For example, a fulltext index on translations would index Fuchs, but not fox, given the following document structure:

  { translations: { en: { US: "fox" }, de: "Fuchs" } }

If you need to search across multiple fields and/or nested objects, you may write all the strings into a special attribute, which you then create the index on (it might be necessary to clean the strings first, e.g. remove line breaks and strip certain words).
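One possible way to build such a combined attribute is sketched below in plain JavaScript. The helper names collectStrings and buildSearchText and the attribute name searchText are illustrative assumptions, not part of ArangoDB; the cleaning step only collapses line breaks and repeated whitespace, and stripping certain words is left out.

```javascript
// Hypothetical helper: gather every string found anywhere inside a value,
// including strings in nested objects and arrays.
function collectStrings(value, out = []) {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((item) => collectStrings(item, out));
  } else if (value !== null && typeof value === "object") {
    Object.values(value).forEach((item) => collectStrings(item, out));
  }
  return out; // non-string leaves (numbers, booleans, null) are skipped
}

// Hypothetical helper: join the strings of the chosen fields into one text.
function buildSearchText(doc, fields) {
  return fields
    .flatMap((field) => collectStrings(doc[field]))
    .map((text) => text.replace(/\s+/g, " ").trim()) // collapse line breaks
    .filter((text) => text.length > 0)
    .join(" ");
}

const doc = {
  title: "Fox\nfacts",
  translations: { en: { US: "fox" }, de: "Fuchs" },
  pages: 12,
};

// "searchText" is an illustrative attribute name, not one ArangoDB defines.
doc.searchText = buildSearchText(doc, ["title", "translations"]);
console.log(doc.searchText); // Fox facts fox Fuchs
```

The document would then be saved with the extra attribute, and the fulltext index created on searchText instead of on the individual fields.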

If the index attribute is neither a string, an object, nor an array, its contents will not be indexed. When indexing the contents of an array attribute, an array member will only be included in the index if it is a string. When indexing the contents of an object attribute, an object member value will only be included in the index if it is a string. Other data types are ignored and not indexed.
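These rules can be restated as a small JavaScript function. It is a sketch for illustration, not ArangoDB's implementation, and indexableStrings is a hypothetical name.

```javascript
// Illustrative restatement of the rules above; not ArangoDB's implementation.
function indexableStrings(value) {
  if (typeof value === "string") return [value];
  if (Array.isArray(value)) {
    // only string members of an array qualify
    return value.filter((member) => typeof member === "string");
  }
  if (value !== null && typeof value === "object") {
    // only direct string member values of an object qualify
    return Object.values(value).filter((member) => typeof member === "string");
  }
  return []; // numbers, booleans, null, and missing attributes are not indexed
}

console.log(indexableStrings("quick brown fox"));                  // [ 'quick brown fox' ]
console.log(indexableStrings(["fox", 42, null, "dog"]));           // [ 'fox', 'dog' ]
console.log(indexableStrings({ en: { US: "fox" }, de: "Fuchs" })); // [ 'Fuchs' ]
console.log(indexableStrings(42));                                 // []
```

The third call mirrors the nested example: the inner object under en is not a string, so only Fuchs qualifies.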

Accessing Fulltext Indexes from the Shell

Ensures that a fulltext index exists:

collection.ensureIndex({ type: "fulltext", fields: [ "field" ], minLength: minLength })

Creates a fulltext index on the attribute field, covering all documents in the collection.

Fulltext indexes are implicitly sparse: all documents which do not have the specified field attribute or that have a non-qualifying value in their field attribute will be ignored for indexing.

Only a single attribute can be indexed. Specifying multiple attributes is unsupported.

The minimum length of words that are indexed can be specified via the minLength parameter. Words shorter than minLength characters will not be indexed. minLength has a default value of 2, but this value might be changed in future versions of ArangoDB. It is thus recommended to explicitly specify this value.
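The effect of minLength can be pictured with the following plain JavaScript sketch. The function indexedWords is a hypothetical name and the word splitting is an assumption, so this only models the length filter and is not how ArangoDB tokenizes text internally.

```javascript
// Illustrative only: keep the words that are at least minLength characters long.
// The default of 2 mirrors the documented default, which may change.
function indexedWords(text, minLength = 2) {
  return text
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u)
    .filter((word) => Array.from(word).length >= minLength);
}

console.log(indexedWords("a fox is on the run", 3)); // [ 'fox', 'the', 'run' ]
console.log(indexedWords("a fox is on the run"));    // [ 'fox', 'is', 'on', 'the', 'run' ]
```

With minLength set to 3, a search for is or on could not match this text, because those words would never have been indexed.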

If the index was successfully created, an object with the index details is returned.

arangosh> db.example.ensureIndex({ type: "fulltext", fields: [ "text" ], minLength: 3 });
arangosh> db.example.save({ text : "the quick brown", b : { c : 1 } });
arangosh> db.example.save({ text : "quick brown fox", b : { c : 2 } });
arangosh> db.example.save({ text : "brown fox jumps", b : { c : 3 } });
arangosh> db.example.save({ text : "fox jumps over", b : { c : 4 } });
arangosh> db.example.save({ text : "jumps over the", b : { c : 5 } });
arangosh> db.example.save({ text : "over the lazy", b : { c : 6 } });
arangosh> db.example.save({ text : "the lazy dog", b : { c : 7 } });
arangosh> db._query("FOR document IN FULLTEXT(example, 'text', 'the') RETURN document");

Execution results:

{
  "fields" : [
    "text"
  ],
  "id" : "example/74363",
  "isNewlyCreated" : true,
  "minLength" : 3,
  "name" : "idx_1642473900921061378",
  "sparse" : true,
  "type" : "fulltext",
  "unique" : false,
  "code" : 201
}
{
  "_id" : "example/74367",
  "_key" : "74367",
  "_rev" : "_ZJNS8HS---"
}
{
  "_id" : "example/74369",
  "_key" : "74369",
  "_rev" : "_ZJNS8HS--A"
}
{
  "_id" : "example/74371",
  "_key" : "74371",
  "_rev" : "_ZJNS8HW---"
}
{
  "_id" : "example/74373",
  "_key" : "74373",
  "_rev" : "_ZJNS8HW--A"
}
{
  "_id" : "example/74375",
  "_key" : "74375",
  "_rev" : "_ZJNS8HW--C"
}
{
  "_id" : "example/74377",
  "_key" : "74377",
  "_rev" : "_ZJNS8Ha---"
}
{
  "_id" : "example/74379",
  "_key" : "74379",
  "_rev" : "_ZJNS8Ha--A"
}
[
  {
    "_key" : "74367",
    "_id" : "example/74367",
    "_rev" : "_ZJNS8HS---",
    "text" : "the quick brown",
    "b" : {
      "c" : 1
    }
  },
  {
    "_key" : "74375",
    "_id" : "example/74375",
    "_rev" : "_ZJNS8HW--C",
    "text" : "jumps over the",
    "b" : {
      "c" : 5
    }
  },
  {
    "_key" : "74377",
    "_id" : "example/74377",
    "_rev" : "_ZJNS8Ha---",
    "text" : "over the lazy",
    "b" : {
      "c" : 6
    }
  },
  {
    "_key" : "74379",
    "_id" : "example/74379",
    "_rev" : "_ZJNS8Ha--A",
    "text" : "the lazy dog",
    "b" : {
      "c" : 7
    }
  }
]
[object ArangoQueryCursor, count: 4, cached: false, hasMore: false]

Looks up a fulltext index:

collection.lookupFulltextIndex(attribute, minLength)

Checks whether a fulltext index exists on the given attribute.

Fulltext AQL Functions

Fulltext AQL functions are detailed in Fulltext functions.