JavaScript Interface to Collections

This is an introduction to ArangoDB’s interface for collections and how to handle collections from the JavaScript shell arangosh. For other languages see the corresponding language API.

The most important call is the call to create a new collection.

Address of a Collection

All collections in ArangoDB have a unique identifier and a unique name. The namespace for collections is shared with views, so there cannot exist a collection and a view with the same name in the same database. ArangoDB internally uses the collection’s unique identifier to look up collections. This identifier, however, is managed by ArangoDB and the user has no control over it. In order to allow users to use their own names, each collection also has a unique name which is specified by the user. To access a collection from the user perspective, the collection name should be used, i.e.:

Collection

db._collection(collection-name)

A collection is created by a “db._create” call.

For example: Assume that the collection identifier is 7254820 and the name is demo. The collection can then be accessed as:

db._collection("demo")

If no collection with such a name exists, then null is returned.
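The lookup and its null result can be sketched as a small helper. This is an illustration only: the function name findCollection and the shape of its return value are hypothetical, and the db argument is assumed to be the database object that arangosh exposes as the global db. Only db._collection() comes from the text above.

```javascript
// Hypothetical helper: look up a collection by its user-defined name.
// `db` is assumed to be the database object provided by arangosh.
function findCollection(db, name) {
  const collection = db._collection(name);
  if (collection === null) {
    // No collection with this name exists in the current database.
    return { found: false, collection: null };
  }
  return { found: true, collection: collection };
}
```

In arangosh, findCollection(db, "demo").found would then tell you whether a collection named demo exists, without having to handle null at every call site.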

There is a short-cut that can be used for non-system collections:

Collection name

db.collection-name

This call will either return the collection named collection-name or create a new one with that name and a set of default properties.

Note: Creating a collection on the fly using db.collection-name is not recommended and does not work in arangosh. To create a new collection, please use

Create

db._create(collection-name)

This call will create a new collection called collection-name. This method is a database method and is documented in detail at Database Methods.
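Since on-the-fly creation is discouraged, an explicit "look up, then create" step is one possible pattern. The sketch below is an assumption about how you might combine the two calls described above. The helper name ensureCollection is hypothetical, and db is assumed to be the arangosh database object.

```javascript
// Hypothetical helper: return the named collection, creating it if absent.
// Uses only db._collection() and db._create() as described above.
function ensureCollection(db, name) {
  const existing = db._collection(name);
  if (existing !== null) {
    // The collection already exists; do not try to create it again.
    return existing;
  }
  // Create the collection with default properties.
  return db._create(name);
}
```

In arangosh this would be called as ensureCollection(db, "demo"). Note that the check and the creation are two separate calls, so this sketch does not guard against another client creating the same collection in between.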

Synchronous replication

Starting in ArangoDB 3.0, the distributed version offers synchronous replication, which means that there is the option to replicate all data automatically within the ArangoDB cluster. This is configured for sharded collections on a per-collection basis by specifying a “replication factor” when the collection is created. A replication factor of k means that altogether k copies of each shard are kept in the cluster on k different servers, and are kept in sync. That is, every write operation is automatically replicated on all copies.

This is organized using a leader/follower model. At all times, one of the servers holding replicas for a shard is “the leader” and all others are “followers”. This configuration is held in the Agency (see Cluster for details of the ArangoDB cluster architecture). Every write operation is sent to the leader by one of the coordinators, and then replicated to all followers before the operation is reported to have succeeded. The leader keeps a record of which followers are currently in sync. In case of network problems or a failure of a follower, the leader can and will drop that follower temporarily after 3 seconds, so that service can resume. In due course, the follower will automatically resynchronize with the leader to restore resilience.

If a leader fails, the cluster Agency automatically initiates a failover routine after around 15 seconds, promoting one of the followers to leader. The other followers (and the former leader, when it comes back) automatically resynchronize with the new leader to restore resilience. Usually, this whole failover procedure can be handled transparently for the coordinator, such that the user code does not even see an error message.
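The leader/follower behaviour described above can be illustrated with a toy in-memory model of a single shard. This is not ArangoDB code and does not reflect its internals: the class ShardModel and all of its methods are hypothetical, the 3-second and 15-second timeouts are left out, and resynchronization is reduced to copying the leader's data.

```javascript
// Toy model of one shard with a leader and followers (hypothetical names).
class ShardModel {
  constructor(serverNames) {
    // The first server starts as leader; the rest are in-sync followers.
    this.leader = serverNames[0];
    this.data = new Map(serverNames.map((s) => [s, []]));
    this.reachable = new Set(serverNames);
    this.inSync = new Set(serverNames.slice(1));
  }

  // A write goes to the leader, then to every in-sync follower,
  // and only then is it reported as successful.
  write(doc) {
    if (!this.reachable.has(this.leader)) {
      throw new Error("leader unavailable; run failover first");
    }
    this.data.get(this.leader).push(doc);
    for (const follower of [...this.inSync]) {
      if (this.reachable.has(follower)) {
        this.data.get(follower).push(doc);
      } else {
        // The leader drops an unreachable follower so service can resume.
        this.inSync.delete(follower);
      }
    }
    return true;
  }

  // Mark a server as failed or cut off from the network.
  fail(server) {
    this.reachable.delete(server);
  }

  // A returning server copies the current leader's data and is in sync again.
  recover(server) {
    this.reachable.add(server);
    if (server !== this.leader) {
      this.data.set(server, [...this.data.get(this.leader)]);
      this.inSync.add(server);
    }
  }

  // If the leader is down, promote one reachable in-sync follower.
  failover() {
    if (this.reachable.has(this.leader)) return this.leader;
    const candidate = [...this.inSync].find((s) => this.reachable.has(s));
    if (candidate === undefined) {
      throw new Error("no in-sync follower available");
    }
    this.inSync.delete(candidate);
    this.leader = candidate;
    return candidate;
  }
}
```

For example, with new ShardModel(["A", "B", "C"]), calling fail("C") followed by a write drops C from the in-sync set, and recover("C") brings it back with the leader's data. Calling fail("A") and then failover() promotes one of the remaining in-sync followers.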

Obviously, this fault tolerance comes at a cost of increased latency. Each write operation needs an additional network roundtrip for the synchronous replication to the followers, but all replication operations to all followers happen concurrently. This is why the default replication factor is 1, which means no replication.
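Because the followers are contacted concurrently, the extra cost per write is roughly that of the slowest follower rather than the sum over all followers. The following back-of-the-envelope estimate illustrates this. It is a simplified assumption, not a measurement or an ArangoDB formula, and the function name and the millisecond values in the usage note are made up for illustration.

```javascript
// Hypothetical estimate of write latency; all times are in milliseconds.
function estimateWriteLatency(leaderWriteMs, followerRoundTripsMs) {
  // With replication factor 1 there are no followers and no extra roundtrip.
  if (followerRoundTripsMs.length === 0) {
    return leaderWriteMs;
  }
  // Followers are contacted concurrently, so the slowest one dominates.
  return leaderWriteMs + Math.max(...followerRoundTripsMs);
}
```

With illustrative values, estimateWriteLatency(2, []) stays at the leader's own cost, while estimateWriteLatency(2, [1, 3]) adds only the 3 ms of the slower follower, not 4 ms for both.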

For details on how to switch on synchronous replication for a collection, see the database method db._create(collection-name) in the section about Database Methods.
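As a rough sketch of what this looks like in practice, the helper below passes a replication factor as a collection property when creating a collection. It assumes a cluster deployment and assumes that db._create() accepts a properties object with the attributes numberOfShards and replicationFactor. Please check the Database Methods section for the authoritative list of properties. The helper name createReplicatedCollection is hypothetical.

```javascript
// Hypothetical helper: create a sharded collection with k copies per shard.
// Assumes db._create(name, properties) with the property names used below.
function createReplicatedCollection(db, name, numberOfShards, replicationFactor) {
  if (!Number.isInteger(replicationFactor) || replicationFactor < 1) {
    throw new RangeError("replicationFactor must be an integer >= 1");
  }
  return db._create(name, {
    numberOfShards: numberOfShards,
    // k copies of each shard, kept in sync on k different servers
    replicationFactor: replicationFactor
  });
}
```

In arangosh connected to a coordinator, createReplicatedCollection(db, "demo", 3, 2) would request three shards with two copies each. A replication factor of 1 corresponds to the default of no replication.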