Documents

A document is an object composed of one or more fields. Each field consists of an **attribute and its associated value**.

Documents function as containers for organizing data, and are the basic building blocks of a MeiliSearch database. To search for a document, it must first be added to an index.

Structure

[Figure: document structure]

Important terms

  • Document: an object which contains data in the form of one or more fields.
  • Field: a set of two data items that are linked together: an attribute and a value.
  • Attribute: the first part of a field. Acts as a name or description for its associated value.
  • Value: the second part of a field, consisting of data of any valid JSON type.
  • Primary Field: A special field that is mandatory in all documents. It contains the primary key and document identifier.
  • Primary Key: the attribute of the primary field. All documents in the same index must possess the same primary key. Its associated value is the document identifier.
  • Document Identifier: the value of the primary field. Every document in a given index must have a unique identifier.

Formatting

Documents are represented as JSON objects: key-value pairs enclosed by curly brackets. As such, any rule that applies to formatting JSON objects also applies to formatting MeiliSearch documents. For example, an attribute must be a string, while a value must be a valid JSON data type.

As an example, let’s say you are making an index that contains information about movies. A sample document might look like this:

    {
      "id": "1564saqw12ss",
      "title": "Kung Fu Panda",
      "genre": "Children's Animation",
      "release-year": 2008,
      "cast": [ {"Jack Black": "Po"}, {"Jackie Chan": "Monkey"} ]
    }

In the above example, "id", "title", "genre", "release-year", and "cast" are attributes.
Each attribute must be associated with a value, e.g. "Kung Fu Panda" is the value of "title".
At minimum, the document must contain one field with the primary key attribute and a unique document id as its value. Above, that’s: "id": "1564saqw12ss".
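
As a rough illustration of the formatting rules, the sample document can be parsed with Python's standard `json` module to confirm that it is a JSON object whose attributes are strings and whose values are JSON data types. This is only a sketch of the idea and does not involve MeiliSearch itself; the variable names are chosen for this example.

```python
import json

# The sample movie document from above, as raw JSON text.
raw = """
{
  "id": "1564saqw12ss",
  "title": "Kung Fu Panda",
  "genre": "Children's Animation",
  "release-year": 2008,
  "cast": [ {"Jack Black": "Po"}, {"Jackie Chan": "Monkey"} ]
}
"""

# json.loads raises json.JSONDecodeError if the text is not valid JSON.
document = json.loads(raw)

# A document is a JSON object, which Python represents as a dict.
assert isinstance(document, dict)

# List each attribute with the Python type its JSON value was parsed into.
for attribute, value in document.items():
    print(f"{attribute!r}: {type(value).__name__}")
```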

Limitations and requirements

Documents have a soft maximum of 1000 fields; beyond that the [ranking rules]($c31ce7ad63fb9615.md#ranking-rules) (a set of consecutive rules applied to ensure relevancy in search results) may no longer be effective, leading to undefined behavior.

Additionally, every document must have at minimum one field containing the **[primary key]($646dd5f874873bbe.md#primary-key)** and a unique id.

If you try to index a document that’s incorrectly formatted, missing a primary key, or possessing the wrong primary key for a given index, it will cause an error and no documents will be added.
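
Before sending documents, these requirements could be checked on the client side. The sketch below is one possible pre-check; the function name `document_problems` and the wording of its messages are hypothetical, and MeiliSearch performs its own validation regardless.

```python
MAX_FIELDS = 1000  # soft maximum quoted above

def document_problems(doc, primary_key):
    """Hypothetical helper: list the reasons a document may be rejected
    or behave unexpectedly. An empty list means no problem was found."""
    if not isinstance(doc, dict):
        return ["not a JSON object"]
    problems = []
    if primary_key not in doc:
        problems.append(f"missing primary key {primary_key!r}")
    if len(doc) > MAX_FIELDS:
        problems.append(f"{len(doc)} fields exceeds the soft maximum of {MAX_FIELDS}")
    return problems

print(document_problems({"id": "1564saqw12ss", "title": "Kung Fu Panda"}, "id"))
print(document_problems({"title": "Kung Fu Panda"}, "id"))
```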

Fields

A field is a set of two data items linked together: an attribute and a value. Documents are made up of fields.

An attribute functions a bit like a variable in most programming languages, i.e. it is a name that allows you to store, access, and describe some data. That data is the attribute’s value.

Every field has a data type dictated by its value. Every value must be a valid JSON data type.

Take note that in the case of strings, the value can contain at most 1000 words. If it contains more than 1000 words, only the first 1000 will be indexed.
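
To get a feel for the 1000-word limit, the following sketch keeps only the first 1000 words of a string. It assumes that words are separated by whitespace, which is a simplification: how MeiliSearch actually splits text into words is not described here. The function name `indexed_portion` is made up for this example.

```python
def indexed_portion(value, max_words=1000):
    """Return the first max_words whitespace-separated words of value.

    Assumption: a "word" is any run of non-whitespace characters.
    """
    words = value.split()
    return " ".join(words[:max_words])

# A 1500-word string: only the first 1000 words are kept.
long_text = " ".join(f"word{i}" for i in range(1500))
kept = indexed_portion(long_text)
print(len(long_text.split()), "->", len(kept.split()))
```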

You can also apply [ranking rules]($c31ce7ad63fb9615.md#ranking-rules) to some fields. For example, you may decide recent movies should be more relevant than older ones.

If you would like to adjust how a field gets handled by MeiliSearch, you can do so in the settings.

Field properties

A field may also possess field properties. Field properties determine the characteristics and behavior of the data added to that field.

At this time, there are two field properties: [searchable]($5ae407d090348a99.md#searchable-fields) and [displayed]($5ae407d090348a99.md#displayed-fields). The data in a searchable field is used to determine the relevancy of a document when doing a search query. A displayed field is present in the documents returned upon search. A field can have one, both, or neither of these properties. By default, all fields in a document are both displayed and searchable.

To clarify, a field may be:

  • searchable but not displayed
  • displayed but not searchable
  • both displayed and searchable (default)
  • neither displayed nor searchable

In the last case, the field will be completely ignored when a search is performed. However, it will still be stored in the document.
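
The four combinations can be modelled with two sets of attribute names. This is a conceptual sketch only, not how MeiliSearch implements these properties; the helper names and the `internal_notes` attribute are invented for the example.

```python
def searchable_view(document, searchable):
    """Fields whose data would be considered when matching a query."""
    return {k: v for k, v in document.items() if k in searchable}

def displayed_view(document, displayed):
    """Fields that would appear in a returned search result."""
    return {k: v for k, v in document.items() if k in displayed}

movie = {
    "id": "1564saqw12ss",
    "title": "Kung Fu Panda",
    "genre": "Children's Animation",
    "internal_notes": "stored, but neither searchable nor displayed",
}
searchable = {"title", "genre"}  # genre: searchable but not displayed
displayed = {"id", "title"}      # id: displayed but not searchable

print(searchable_view(movie, searchable))
print(displayed_view(movie, displayed))
# internal_notes appears in neither view, yet it is still part of the document.
print("internal_notes" in movie)
```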

Primary field

The primary field is a special field that must be present in all documents. Its attribute is the [primary key]($646dd5f874873bbe.md#primary-key) and its value is the [document id]($646dd5f874873bbe.md#document-id).

The primary field serves the important role of uniquely identifying each document stored in an index, ensuring that it is impossible to have two exactly identical documents present in the same index.

Therefore, every document in the same index must possess the exact same primary key associated with a unique document id as value.

Example:

Suppose we have an index called movie that contains 200,000 documents. As shown below, each document is identified by a primary field containing the primary key movie_id and a unique value (the document id).

Aside from the primary key, documents in the same index are not required to share attributes, e.g. you could have a document in this index without the “title” attribute.

    [
      {
        "movie_id": "1564saqw12ss",
        "title": "Kung fu Panda",
        "runtime": 95
      },
      {
        "movie_id": "15k1j2kkw223s",
        "title": "Batman Begins",
        "gritty reboot": true
      }
    ]
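
The two properties described here, a shared primary key and unique document ids, can be checked over the example documents with a few lines of Python. The helper names below are hypothetical and the check runs entirely on the client side.

```python
movies = [
    {"movie_id": "1564saqw12ss", "title": "Kung fu Panda", "runtime": 95},
    {"movie_id": "15k1j2kkw223s", "title": "Batman Begins", "gritty reboot": True},
]

def shares_primary_key(documents, primary_key):
    """True if every document contains the primary key attribute."""
    return all(primary_key in doc for doc in documents)

def ids_are_unique(documents, primary_key):
    """True if no two documents carry the same document id."""
    ids = [doc[primary_key] for doc in documents]
    return len(ids) == len(set(ids))

print(shares_primary_key(movies, "movie_id"))
print(ids_are_unique(movies, "movie_id"))

# Attributes other than the primary key do not have to be shared.
print(shares_primary_key(movies, "runtime"))
```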

Primary key

The primary key is a **mandatory attribute linked to a unique value:** the [document id]($646dd5f874873bbe.md#document-id). It is part of the [primary field]($646dd5f874873bbe.md#primary-field).

Each index recognizes only one primary key attribute. Once a primary key has been set for an index, it cannot be changed anymore. If no primary key is found in a document, the document will not be stored.

Setting the primary key

There are several ways to set the primary key for an index:

MeiliSearch guesses your primary key

If the primary key has neither been set at index creation nor as a parameter of the add documents route, MeiliSearch will search your first document for an attribute that contains the string id in a case-insensitive manner (e.g., uid, MovieId, ID, 123id123) and set it as that index’s primary key.

If no corresponding attribute is found, the index will have no known primary key, and therefore, no documents will be added.
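
The guessing rule described above can be approximated as follows. This is a sketch under assumptions, not MeiliSearch's actual code: in particular, the text does not say which attribute wins when several contain "id", so this version simply returns the first match in document order. The function name `guess_primary_key` is hypothetical.

```python
def guess_primary_key(first_document):
    """Return the first attribute whose name contains "id",
    ignoring case, or None if there is no such attribute."""
    for attribute in first_document:
        if "id" in attribute.lower():
            return attribute
    return None

print(guess_primary_key({"title": "Batman", "MovieId": "Abc_012"}))
print(guess_primary_key({"title": "Batman", "runtime": 140}))
```

When the function returns `None`, the situation corresponds to the case where no primary key can be inferred and no documents are added.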

Missing primary key error

❗️ If you get the Could not infer a primary key error, the primary key was not recognized. This means your primary key is wrongly formatted or absent.

Manually adding the primary key can be accomplished by using its name as a parameter for the add document route or the update index route.

Document Id

The document id is the value associated to the primary key. It is part of the primary field, and acts as a unique identifier for each of the documents of a given index.

This unique value ensures that two documents in the same index cannot be exactly alike. If two documents in the same index have the same id, then they are treated as the same document and the more recent one will replace the older.
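
The replacement behavior resembles storing documents in a dictionary keyed by document id. The sketch below models an index as a plain Python dict, which is an assumption made for illustration and says nothing about MeiliSearch's internal storage.

```python
def add_or_replace(index, documents, primary_key):
    """Store each document under its id; a later document
    with the same id replaces the earlier one."""
    for doc in documents:
        index[doc[primary_key]] = doc
    return index

index = {}
add_or_replace(index, [{"movie_id": "a1", "title": "Old title"}], "movie_id")
add_or_replace(index, [{"movie_id": "a1", "title": "New title"}], "movie_id")

print(len(index))
print(index["a1"]["title"])
```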

The document id must contain only A-Z a-z 0-9 and -_ characters.

Example:

Good:

    "id": "_Aabc012_"

Bad:

    "id": "@BI+* ^5h2%"
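
The character rule maps directly onto a regular expression. The sketch below checks string ids only and treats an empty string as invalid; both choices are assumptions of this example, and the function name `is_valid_document_id` is hypothetical.

```python
import re

# Only A-Z, a-z, 0-9, hyphen and underscore are allowed.
DOCUMENT_ID = re.compile(r"[A-Za-z0-9_-]+")

def is_valid_document_id(value):
    """True if value is a non-empty string made only of allowed characters."""
    return isinstance(value, str) and DOCUMENT_ID.fullmatch(value) is not None

print(is_valid_document_id("_Aabc012_"))
print(is_valid_document_id("@BI+* ^5h2%"))
```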

Take note that the document addition request in MeiliSearch is atomic: it is an indivisible series of operations such that either all occur, or nothing occurs. This means that if even a single document id is incorrectly formatted, an error will occur and none of your documents will be added.
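
The all-or-nothing behavior can be sketched by validating every document in a batch before storing any of them. The index is again modelled as a plain dict, and `add_documents_atomically` is a hypothetical name; this illustrates the idea of atomicity rather than MeiliSearch's implementation.

```python
import re

def add_documents_atomically(index, documents, primary_key):
    """Add the whole batch to index, or raise ValueError and add nothing."""
    staged = {}
    for doc in documents:
        doc_id = doc.get(primary_key)
        if not isinstance(doc_id, str) or not re.fullmatch(r"[A-Za-z0-9_-]+", doc_id):
            # Nothing has touched index yet, so the batch is rejected as a whole.
            raise ValueError(f"invalid document id: {doc_id!r}")
        staged[doc_id] = doc
    # Every document passed validation: commit them together.
    index.update(staged)
    return len(staged)

index = {}
try:
    add_documents_atomically(index, [{"id": "ok_1"}, {"id": "@bad id"}], "id")
except ValueError as error:
    print("rejected:", error)
print(len(index))  # the valid first document was not added either
```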

Upload

By default, MeiliSearch limits the size of JSON payloads—and therefore document uploads—to 100MB.

To upload more documents in one go, it is possible to change the payload size limit at runtime using the http-payload-size-limit option.

    ./meilisearch --http-payload-size-limit=1048576000

The above code sets the payload limit to roughly 1GB (1,048,576,000 bytes), instead of the 100MB default.
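
An alternative to raising the limit is to split a large upload into several smaller payloads. The sketch below measures the JSON size of a list of documents and groups them into batches that stay under a byte limit. It assumes that "100MB" means 104,857,600 bytes (one tenth of the value in the command above) and that the payload is UTF-8 encoded JSON; the function names are hypothetical.

```python
import json

DEFAULT_LIMIT = 100 * 1024 * 1024  # assumption: "100MB" read as 104,857,600 bytes

def payload_size(documents):
    """Size in bytes of the documents serialized as a JSON array."""
    return len(json.dumps(documents).encode("utf-8"))

def split_into_batches(documents, limit=DEFAULT_LIMIT):
    """Group documents, in order, into batches whose JSON size stays
    within limit. A single document larger than limit still gets a
    batch of its own. Re-serializing on every step keeps the sketch
    short but is slow for very large inputs."""
    batches, current = [], []
    for doc in documents:
        candidate = current + [doc]
        if current and payload_size(candidate) > limit:
            batches.append(current)
            current = [doc]
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches

docs = [{"id": str(i), "title": "x" * 50} for i in range(10)]
for batch in split_into_batches(docs, limit=200):
    print(len(batch), "documents,", payload_size(batch), "bytes")
```

Each batch is itself a list, so it can be sent as one request body.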

MeiliSearch uses a lot of RAM when indexing documents. Be aware of your RAM availability as you increase the size of your batch as this could cause MeiliSearch to crash.

When using the route to add new documents, all documents must be sent in an array even if there is only one document.

cURL

    curl -X POST 'http://localhost:7700/indexes/movies/documents' \
      --data '[
        {
          "movie_id": "123sq178",
          "title": "Amelie Poulain"
        }
      ]'

JavaScript

    client.index('movies').addDocuments([{
      'movie_id': '123sq178',
      'title': 'Amelie Poulain'
    }])

Python

    client.index('movies').add_documents([{
      'movie_id': '123sq178',
      'title': 'Amélie Poulain'
    }])

PHP

    $client->index('movies')->addDocuments([['movie_id' => '123sq178', 'title' => 'Amelie Poulain']]);

Ruby

    client.index('movies').add_documents([{
      "movie_id": "123sq178",
      "title": "Amelie Poulain"
    }])

Go

    documents := []map[string]interface{}{
        {
            "movie_id": "123sq178",
            "title":    "Amelie Poulain",
        },
    }
    client.Index("movies").AddDocuments(documents)

Rust

    // Define the type of our documents
    #[derive(Serialize, Deserialize, Debug)]
    struct IncompleteMovie {
        id: String,
        title: String
    }
    impl Document for IncompleteMovie {
        type UIDType = String;
        fn get_uid(&self) -> &Self::UIDType { &self.id }
    }
    // Add a document to our index
    let progress: Progress = movies.add_documents(&[
        IncompleteMovie {
            id: "123sq178".to_string(),
            title: "Amélie Poulain".to_string(),
        }
    ], None).await.unwrap();