Concepts

Database Interaction

ArangoDB is a database that serves documents to clients. These documents are transported as JSON over a TCP connection, using the HTTP protocol. A REST API is provided to interact with the database system.
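To make the REST interaction a little more concrete, the following sketch builds (but does not send) an HTTP request that would store one JSON document. The host and port `localhost:8529`, the `_system` database, and the collection name `users` are assumptions chosen for illustration, not values taken from this page.

```python
import json
import urllib.request

# Assumed values for illustration only: a local server, the _system
# database, and a hypothetical collection called "users".
BASE_URL = "http://localhost:8529"
DATABASE = "_system"
COLLECTION = "users"


def build_create_document_request(document):
    """Build an HTTP POST request that carries one document as JSON."""
    url = f"{BASE_URL}/_db/{DATABASE}/_api/document/{COLLECTION}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_create_document_request({"name": "Ada", "age": 36})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Actually sending the request needs a running server and credentials:
# with urllib.request.urlopen(request) as response: ...
```

In practice a driver would normally hide these details; the sketch only shows that the exchange is plain HTTP with a JSON body.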

The web interface that comes with ArangoDB, called Aardvark, provides a graphical user interface that is easy to use. An interactive shell, called Arangosh, is also shipped. In addition, there are so-called drivers that make it easy to use the database system in various environments and programming languages. All these tools use the HTTP interface of the server and, in most cases, remove the need to write your own low-level code for basic communication.

Data Model

The documents you can store in ArangoDB closely follow the JSON format, although they are stored in a binary format called VelocyPack. A document contains zero or more attributes, each of which has a value. A value can either be an atomic type, i.e. number, string, boolean or null, or a compound type, i.e. an array or embedded document / object. Arrays and sub-objects can contain all of these types, which means that arbitrarily nested data structures can be represented in a single document.
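As a small illustration of these value types, the sketch below writes one document as a Python dictionary and round-trips it through JSON. All attribute names and values are made up for the example.

```python
import json

# A hypothetical document mixing atomic and compound values.
document = {
    "name": "Ada",                # string
    "age": 36,                    # number
    "active": True,               # boolean
    "nickname": None,             # null
    "tags": ["math", "engines"],  # array
    "address": {                  # embedded document / object
        "city": "London",
        "geo": {"lat": 51.5, "lon": -0.12},  # nested one level deeper
    },
}

# Serialising to JSON and parsing it back preserves the nested structure.
encoded = json.dumps(document)
decoded = json.loads(encoded)
print(encoded)
print(decoded == document)
```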

Documents are grouped into collections. A collection contains zero or more documents. If you are familiar with relational database management systems (RDBMS), then it is safe to compare collections to tables and documents to rows. The difference is that in a traditional RDBMS, you have to define columns before you can store records in a table. Such definitions are also known as schemas. ArangoDB is schema-less, which means that there is no need to define what attributes a document can have. Every single document can have a completely different structure and still be stored together with other documents in a single collection. In practice, there will be common denominators among the documents in a collection, but the database system itself doesn’t force you to limit yourself to a certain data structure.

There are two types of collections: document collections (also referred to as vertex collections in the context of graphs) and edge collections. Edge collections store documents as well, but they include two special attributes, _from and _to, which are used to create relations between documents. Usually, two documents (vertices) stored in document collections are linked by a document (edge) stored in an edge collection. This is ArangoDB’s graph data model. It follows the mathematical concept of a directed, labeled graph, except that edges don’t just have labels, but are full-blown documents.
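The graph model can be sketched in memory as follows. This is only an illustration of how edge documents link vertices through the two special attributes (spelled `_from` and `_to` in ArangoDB); the collection name `persons`, the keys, and the `since` attribute are hypothetical.

```python
# Hypothetical vertex documents, keyed by document ID ("collection/key").
vertices = {
    "persons/alice": {"name": "Alice"},
    "persons/bob": {"name": "Bob"},
    "persons/carol": {"name": "Carol"},
}

# Hypothetical edge documents: _from and _to reference vertex IDs, and
# further attributes (here "since") are stored on the edge itself.
edges = [
    {"_from": "persons/alice", "_to": "persons/bob", "since": 2019},
    {"_from": "persons/alice", "_to": "persons/carol", "since": 2021},
    {"_from": "persons/bob", "_to": "persons/carol", "since": 2020},
]


def outbound_neighbors(vertex_id):
    """Follow edges in their direction, from _from to _to."""
    return [edge["_to"] for edge in edges if edge["_from"] == vertex_id]


def inbound_neighbors(vertex_id):
    """Follow edges against their direction, from _to to _from."""
    return [edge["_from"] for edge in edges if edge["_to"] == vertex_id]


print(outbound_neighbors("persons/alice"))
print(inbound_neighbors("persons/carol"))
```

Because the edges are directed, the outbound and inbound neighbours of a vertex generally differ.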

Collections exist inside of databases. There can be one or many databases. Different databases are usually used for multi-tenant setups, as the data inside them (collections, documents etc.) is isolated from one another. The default database _system is special, because it cannot be removed. Database users are managed in this database, and their credentials are valid for all databases of a server instance.

Similarly, databases may also contain view entities. A View in its simplest form can be seen as a read-only array or collection of documents. The view concept quite closely matches a similarly named concept available in most relational database management systems (RDBMS). Each view entity usually maps some implementation-specific document transformation (possibly the identity) onto documents from zero or more collections.

Data Retrieval

Queries are used to filter documents based on certain criteria, to compute new data, as well as to manipulate or delete existing documents. Queries can be as simple as a “query by example” or as complex as “joins” using many collections or traversing graph structures. They are written in the ArangoDB Query Language (AQL).
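A simple filtering query might look like the AQL text in the sketch below, wrapped in the JSON body that would be posted to the server's cursor endpoint (`/_api/cursor`). The collection `users`, its attributes, and the helper `build_cursor_payload` are hypothetical and exist only for this example.

```python
import json

# AQL text with a bind parameter (@minAge); "users" is a hypothetical
# collection with "name" and "age" attributes.
AQL_QUERY = """
FOR u IN users
  FILTER u.age >= @minAge
  SORT u.name
  RETURN { name: u.name, age: u.age }
"""


def build_cursor_payload(query, bind_vars, batch_size=100):
    """Assemble the JSON body for a query request (hypothetical helper)."""
    return {
        "query": query.strip(),
        "bindVars": bind_vars,
        "batchSize": batch_size,
    }


payload = build_cursor_payload(AQL_QUERY, {"minAge": 30})
print(json.dumps(payload, indent=2))
```

Passing values through bind parameters rather than string concatenation keeps the query text fixed and avoids injection problems.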

Cursors are used to iterate over the result of queries, so that you get easily processable batches instead of one big hunk.
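The idea of batch-wise consumption can be sketched with a plain generator. This is an in-memory stand-in for illustration, not the server's cursor implementation; the function name and the sample data are assumptions.

```python
def iterate_in_batches(results, batch_size):
    """Yield successive slices of a result list, one batch at a time."""
    for start in range(0, len(results), batch_size):
        yield results[start:start + batch_size]


# A made-up query result of seven documents, consumed three at a time.
full_result = [{"n": i} for i in range(7)]
batches = list(iterate_in_batches(full_result, batch_size=3))

for batch in batches:
    print(len(batch), batch)
```

With seven results and a batch size of three, the consumer sees two full batches and a final batch holding the one remaining document.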

Indexes are used to speed up searches. There are various types of indexes, such as hash indexes and geo-spatial indexes.
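Why an index helps can be shown with a toy hash index built from a Python dictionary. This is a conceptual sketch only and says nothing about how ArangoDB implements its indexes internally; the documents and function names are invented for the example.

```python
from collections import defaultdict

# Made-up documents for the example.
documents = [
    {"name": "Ada", "city": "Berlin"},
    {"name": "Bob", "city": "Cologne"},
    {"name": "Cleo", "city": "Berlin"},
]


def scan(docs, attribute, value):
    """Without an index: inspect every document."""
    return [doc for doc in docs if doc.get(attribute) == value]


def build_hash_index(docs, attribute):
    """Map each attribute value to the documents that carry it."""
    index = defaultdict(list)
    for doc in docs:
        index[doc.get(attribute)].append(doc)
    return index


city_index = build_hash_index(documents, "city")

# With the index: one dictionary lookup instead of a full scan.
berlin_docs = city_index.get("Berlin", [])
print([doc["name"] for doc in berlin_docs])
```

Both approaches return the same documents; the index trades extra memory and maintenance on writes for faster equality lookups.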