Hash Indexes

Introduction to Hash Indexes

It is possible to define a hash index on one or more attributes (or paths) of a document. This hash index is then used in queries to locate documents in O(1) operations. If the hash index is unique, then no two documents are allowed to have the same set of attribute values.

Creating a new document or updating a document will fail if the uniqueness is violated. If the index is declared sparse, a document will be excluded from the index and no uniqueness checks will be performed if any index attribute value is not set or has a value of null.
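The rules above can be illustrated with a small in-memory model in plain JavaScript. This is only a sketch of the semantics described in this section, not how ArangoDB implements hash indexes. The names ToyHashIndex and getPath and the document keys are hypothetical and exist only for illustration.

```javascript
// Toy model of a hash index; illustrative only, not ArangoDB's implementation.

// Resolve an attribute path such as "b.c"; missing attributes count as null.
function getPath(doc, path) {
  let value = doc;
  for (const part of path.split(".")) {
    if (value === null || typeof value !== "object" || !(part in value)) {
      return null;
    }
    value = value[part];
  }
  return value === undefined ? null : value;
}

class ToyHashIndex {
  constructor(fields, { unique = false, sparse = false } = {}) {
    this.fields = fields;
    this.unique = unique;
    this.sparse = sparse;
    this.buckets = new Map(); // serialized attribute values -> document keys
  }

  // Insert a document; returns false if a sparse index skipped it.
  insert(doc) {
    const values = this.fields.map((field) => getPath(doc, field));
    if (this.sparse && values.some((v) => v === null)) {
      return false; // excluded: not indexed, no uniqueness check
    }
    const hashKey = JSON.stringify(values);
    const bucket = this.buckets.get(hashKey) || [];
    if (this.unique && bucket.length > 0) {
      throw new Error("unique constraint violated; conflicting key: " + bucket[0]);
    }
    bucket.push(doc._key);
    this.buckets.set(hashKey, bucket);
    return true;
  }

  // Equality lookup on all indexed attributes: a single Map access.
  lookup(values) {
    return this.buckets.get(JSON.stringify(values)) || [];
  }
}

const idx = new ToyHashIndex(["a", "b.c"], { unique: true, sparse: true });
idx.insert({ _key: "k1", a: 1, b: { c: 1 } }); // indexed
idx.insert({ _key: "k2", a: 1 });              // skipped: b.c is not set
console.log(idx.lookup([1, 1]));               // [ 'k1' ]
```

The model keys a Map on the serialized tuple of attribute values, which is what makes an equality lookup on all indexed attributes a constant-time operation on average.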

Accessing Hash Indexes from the Shell

Unique Hash Indexes

Ensures that a unique constraint exists:

collection.ensureIndex({ type: "hash", fields: [ "field1", …, "fieldn" ], unique: true })

Creates a unique hash index on all documents using field1, … fieldn as attribute paths. At least one attribute path has to be given. The index will be non-sparse by default.

All documents in the collection must differ in terms of the indexed attributes. Creating a new document or updating an existing document will fail if the attribute uniqueness is violated.

To create a sparse unique index, set the sparse attribute to true:

collection.ensureIndex({ type: "hash", fields: [ "field1", …, "fieldn" ], unique: true, sparse: true })

If the index was created successfully, the index identifier is returned.

Non-existing attributes default to null. In a sparse index, a document is excluded from the index if any of the specified index attributes is null. Such documents are not taken into account for uniqueness checks.

In a non-sparse index, all documents are indexed regardless of null attributes and are taken into account for uniqueness checks.

If the index was created successfully, an object with the index details, including the index identifier, is returned.

arangosh> db.test.ensureIndex({ type: "hash", fields: [ "a", "b.c" ], unique: true });
arangosh> db.test.save({ a : 1, b : { c : 1 } });
arangosh> db.test.save({ a : 1, b : { c : 1 } });
arangosh> db.test.save({ a : 1, b : { c : null } });
arangosh> db.test.save({ a : 1 });

{ 
  "deduplicate" : true, 
  "fields" : [ 
    "a", 
    "b.c" 
  ], 
  "id" : "test/74693", 
  "isNewlyCreated" : true, 
  "name" : "idx_1642473900986073090", 
  "selectivityEstimate" : 1, 
  "sparse" : false, 
  "type" : "hash", 
  "unique" : true, 
  "code" : 201 
}
{ 
  "_id" : "test/74697", 
  "_key" : "74697", 
  "_rev" : "_ZJNS8LK---" 
}
[ArangoError 1210: unique constraint violated - in index idx_1642473900986073090 of type hash over 'a, b.c'; conflicting key: 74697]
{ 
  "_id" : "test/74701", 
  "_key" : "74701", 
  "_rev" : "_ZJNS8LO---" 
}
[ArangoError 1210: unique constraint violated - in index idx_1642473900986073090 of type hash over 'a, b.c'; conflicting key: 74701]
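The last error in the output follows from the rule that non-existing attributes default to null: the document { a : 1 } yields the same indexed values as { a : 1, b : { c : null } }. The following plain JavaScript sketch makes that comparison explicit. It is only a model of the described behavior, and indexedValues is a hypothetical helper, not an ArangoDB function.

```javascript
// Illustrative only: compute the tuple of values a document contributes
// to an index over the given attribute paths.
function indexedValues(doc, fields) {
  return fields.map((path) => {
    let value = doc;
    for (const part of path.split(".")) {
      // A missing attribute anywhere along the path is treated as null.
      if (value === null || typeof value !== "object" || !(part in value)) {
        return null;
      }
      value = value[part];
    }
    return value === undefined ? null : value;
  });
}

const fields = ["a", "b.c"];
const withNull = indexedValues({ a: 1, b: { c: null } }, fields);
const withoutB = indexedValues({ a: 1 }, fields);

// Both documents map to [ 1, null ], so a non-sparse unique index
// treats the second one as a duplicate of the first.
console.log(JSON.stringify(withNull) === JSON.stringify(withoutB)); // true
```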

Non-unique Hash Indexes

Ensures that a non-unique hash index exists:

collection.ensureIndex({ type: "hash", fields: [ "field1", …, "fieldn" ] })

Creates a non-unique hash index on all documents using field1, … fieldn as attribute paths. At least one attribute path has to be given. The index will be non-sparse by default.

To create a sparse non-unique index, set the sparse attribute to true:

collection.ensureIndex({ type: "hash", fields: [ "field1", …, "fieldn" ], sparse: true })

If the index was created successfully, an object with the index details, including the index identifier, is returned.

arangosh> db.test.ensureIndex({ type: "hash", fields: [ "a" ] });
arangosh> db.test.save({ a : 1 });
arangosh> db.test.save({ a : 1 });
arangosh> db.test.save({ a : null });

{ 
  "deduplicate" : true, 
  "fields" : [ 
    "a" 
  ], 
  "id" : "test/74389", 
  "isNewlyCreated" : true, 
  "name" : "idx_1642473900930498562", 
  "selectivityEstimate" : 1, 
  "sparse" : false, 
  "type" : "hash", 
  "unique" : false, 
  "code" : 201 
}
{ 
  "_id" : "test/74393", 
  "_key" : "74393", 
  "_rev" : "_ZJNS8Hy--B" 
}
{ 
  "_id" : "test/74395", 
  "_key" : "74395", 
  "_rev" : "_ZJNS8H2---" 
}
{ 
  "_id" : "test/74397", 
  "_key" : "74397", 
  "_rev" : "_ZJNS8H2--A" 
}

Hash Array Indexes

Ensures that a hash array index exists (non-unique):

collection.ensureIndex({ type: "hash", fields: [ "field1[*]", …, "fieldn[*]" ] })

Creates a non-unique hash array index for the individual elements of the array attributes field1[*], … fieldn[*] found in the documents. At least one attribute path has to be given. The index always treats the indexed arrays as sparse.

It is possible to combine array indexing with standard indexing:

collection.ensureIndex({ type: "hash", fields: [ "field1[*]", "field2" ] })

If the index was created successfully, an object with the index details, including the index identifier, is returned.

arangosh> db.test.ensureIndex({ type: "hash", fields: [ "a[*]" ] });
arangosh> db.test.save({ a : [ 1, 2 ] });
arangosh> db.test.save({ a : [ 1, 3 ] });
arangosh> db.test.save({ a : null });

{ 
  "deduplicate" : true, 
  "fields" : [ 
    "a[*]" 
  ], 
  "id" : "test/74406", 
  "isNewlyCreated" : true, 
  "name" : "idx_1642473900935741440", 
  "selectivityEstimate" : 1, 
  "sparse" : false, 
  "type" : "hash", 
  "unique" : false, 
  "code" : 201 
}
{ 
  "_id" : "test/74410", 
  "_key" : "74410", 
  "_rev" : "_ZJNS8IG--_" 
}
{ 
  "_id" : "test/74412", 
  "_key" : "74412", 
  "_rev" : "_ZJNS8IG--B" 
}
{ 
  "_id" : "test/74414", 
  "_key" : "74414", 
  "_rev" : "_ZJNS8IK---" 
}
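To make "individual elements" concrete, the following plain JavaScript sketch models which index entries the three documents above would produce. It is a simplified illustration, not ArangoDB's implementation. buildArrayIndex and the keys d1 to d3 are hypothetical, and the sketch assumes that each distinct element is indexed once per document and that non-array values are skipped.

```javascript
// Toy model of a non-unique array index on "a[*]"; illustrative only.
function buildArrayIndex(docs, attribute) {
  const index = new Map(); // serialized element value -> document keys
  for (const doc of docs) {
    const arr = doc[attribute];
    if (!Array.isArray(arr)) {
      continue; // assumption: values that are not arrays are not indexed
    }
    // Assumption: each distinct element is indexed once per document.
    const distinct = new Set(arr.map((element) => JSON.stringify(element)));
    for (const key of distinct) {
      if (!index.has(key)) {
        index.set(key, []);
      }
      index.get(key).push(doc._key);
    }
  }
  return index;
}

const arrayIndex = buildArrayIndex(
  [
    { _key: "d1", a: [1, 2] },
    { _key: "d2", a: [1, 3] },
    { _key: "d3", a: null },
  ],
  "a"
);

console.log(arrayIndex.get(JSON.stringify(1)));    // [ 'd1', 'd2' ]
console.log(arrayIndex.has(JSON.stringify(null))); // false
```

In this model the element 1 points at both array documents, while the document with a : null contributes no entry at all.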

Creating Hash Index in Background

This section only applies to the RocksDB storage engine.

By default, new indexes are created under an exclusive collection lock. This means that the collection (or the respective shards) is not available while the index is being created. This “foreground” index creation can be undesirable if you have to perform it on a live system without a dedicated maintenance window.

Indexes can also be created in the “background”, without holding an exclusive lock during the creation. The collection remains available, and other CRUD operations can run on the collection while the index is created. This can be achieved by using the inBackground option.

To create a hash index in the background in arangosh, specify inBackground: true:

db.collection.ensureIndex({ type: "hash", fields: [ "value" ], inBackground: true });

For more information see “Creating Indexes in Background” in the Index basics page.

Ensure uniqueness of relations in edge collections

It is possible to create secondary indexes using the edge attributes _from and _to, starting with ArangoDB 3.0. A combined index over both fields together with the unique option enabled can be used to prevent duplicate relations from being created.

For example, a document collection verts might contain vertices with the document handles verts/A, verts/B and verts/C. Relations between these documents can be stored in an edge collection edges, for instance. Now, you may want to make sure that the vertex verts/A is never linked to verts/B by an edge more than once. This can be achieved by adding a unique, non-sparse hash index for the fields _from and _to:

db.edges.ensureIndex({ type: "hash", fields: [ "_from", "_to" ], unique: true });

Creating an edge { _from: "verts/A", _to: "verts/B" } in edges will be accepted, but only once. Another attempt to store an edge with the relation A → B will be rejected by the server with a unique constraint violated error. This includes updates to the _from and _to fields.

Note that adding a relation B → A is still possible, as are A → A and B → B, because they are all different relations in a directed graph. Each one can only occur once, however.
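The direction-sensitivity of this constraint can be sketched in plain JavaScript. The example is an in-memory model of a unique index over _from and _to, not a call into ArangoDB. ToyEdgeStore is a hypothetical name used only for illustration.

```javascript
// Toy model of a unique index on [ "_from", "_to" ]; illustrative only.
class ToyEdgeStore {
  constructor() {
    this.seen = new Set(); // serialized (_from, _to) pairs
  }

  insert(edge) {
    // The pair is ordered, so A -> B and B -> A produce different keys.
    const key = JSON.stringify([edge._from, edge._to]);
    if (this.seen.has(key)) {
      throw new Error("unique constraint violated");
    }
    this.seen.add(key);
  }
}

const store = new ToyEdgeStore();
store.insert({ _from: "verts/A", _to: "verts/B" }); // accepted
store.insert({ _from: "verts/B", _to: "verts/A" }); // accepted: opposite direction
store.insert({ _from: "verts/A", _to: "verts/A" }); // accepted: self-loop

try {
  store.insert({ _from: "verts/A", _to: "verts/B" }); // second A -> B
} catch (err) {
  console.log(err.message); // unique constraint violated
}
```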