Data types

This topic lists the various data types available in YugabyteDB’s Cassandra-compatible YCQL API.

JSONB

There are a number of different serialization formats for JSON data, one of the popular formats being JSONB to efficiently model document data. And just in case you were wondering, JSONB stands for JSON Better.

The YCQL API supports the JSONB data type to parse, store and query JSON documents natively. This data type is similar in query language syntax and functionality to the one supported by PostgreSQL. JSONB serialization allows for easy search and retrieval of attributes inside the document. This is achieved by storing all the JSON attributes in a sorted order, which allows for efficient binary search of keys. Similarly arrays are stored such that random access for a particular array index into the serialized json document is possible. DocDB, YugabyteDB’s underlying storage engine, is document-oriented in itself which makes storing the data of the JSON data type lot more simple than otherwise possible.

Let us take the example of an ecommerce app of an online bookstore. The database for such a bookstore needs to store details of various books, some of which may have custom attributes. Below is an example of a JSON document that captures the details of a particular book, Macbeth written by William Shakespeare.

  1. {
  2. "name":"Macbeth",
  3. "author":{
  4. "first_name":"William",
  5. "last_name":"Shakespeare"
  6. },
  7. "year":1623,
  8. "editors":[
  9. "John",
  10. "Elizabeth",
  11. "Jeff"
  12. ]
  13. }

Create a Table

The books table for this bookstore can be modelled as shown below. We assume that the id of each book is an int, but this could be a string or a uuid.

  1. cqlsh> CREATE KEYSPACE store;
  1. cqlsh> CREATE TABLE store.books ( id int PRIMARY KEY, details jsonb );

Insert Data

Next we insert some sample data for a few books into this store. You can copy and paste the following commands into the cqlsh shell for YugabyteDB to insert the data. Note that you would need a cqlsh that has the enhancement to work with YugabyteDB JSON documents, you can download it using the documentation here.

  1. INSERT INTO store.books (id, details) VALUES (1,
  2. '{ "name" : "Macbeth",
  3. "author" : {"first_name": "William", "last_name": "Shakespeare"},
  4. "year" : 1623,
  5. "editors": ["John", "Elizabeth", "Jeff"] }'
  6. );
  7. INSERT INTO store.books (id, details) VALUES (2,
  8. '{ "name" : "Hamlet",
  9. "author" : {"first_name": "William", "last_name": "Shakespeare"},
  10. "year" : 1603,
  11. "editors": ["Lysa", "Mark", "Robert"] }'
  12. );
  13. INSERT INTO store.books (id, details) VALUES (3,
  14. '{ "name" : "Oliver Twist",
  15. "author" : {"first_name": "Charles", "last_name": "Dickens"},
  16. "year" : 1838,
  17. "genre" : "novel",
  18. "editors": ["Mark", "Tony", "Britney"] }'
  19. );
  20. INSERT INTO store.books (id, details) VALUES (4,
  21. '{ "name" : "Great Expectations",
  22. "author" : {"first_name": "Charles", "last_name": "Dickens"},
  23. "year" : 1950,
  24. "genre" : "novel",
  25. "editors": ["Robert", "John", "Melisa"] }'
  26. );
  27. INSERT INTO store.books (id, details) VALUES (5,
  28. '{ "name" : "A Brief History of Time",
  29. "author" : {"first_name": "Stephen", "last_name": "Hawking"},
  30. "year" : 1988,
  31. "genre" : "science",
  32. "editors": ["Melisa", "Mark", "John"] }'
  33. );

Note the following interesting points about the book details above:- The year attribute for each of the books is interpreted as an integer.- The first two books do not have a genre attribute, which the others do.- The author attribute is a map.- The editors attribute is an array.

Retrieve a Subset of Attributes

Running the following default select query will return all attributes of each book.

  1. cqlsh> SELECT * FROM store.books;

But a number of times we may want to query just a subset of attributes from YugabyteDB database. Below is an example of a query that retrieves just the id and name for all the books.

  1. cqlsh> SELECT id, details->>'name' as book_title FROM store.books;
  1. id | book_title
  2. ----+-------------------------
  3. 5 | A Brief History of Time
  4. 1 | Macbeth
  5. 4 | Great Expectations
  6. 2 | Hamlet
  7. 3 | Oliver Twist

Query by Attribute Values - String

The name attribute is a string in the book details JSON document. Let us query all the details of book named Hamlet.

  1. cqlsh> SELECT * FROM store.books WHERE details->>'name'='Hamlet';
  1. id | details
  2. ----+---------------------------------------------------------------
  3. 2 | {"author":{"first_name":"William","last_name":"Shakespeare"},
  4. "editors":["Lysa","Mark","Robert"],
  5. "name":"Hamlet","year":1603}

Note that we can query by attributes that exist only in some of the documents. For example, we can query for all books that have a genre of novel. Recall from before that all books do not have a genre attribute defined.

  1. cqlsh> SELECT id, details->>'name' as title,
  2. details->>'genre' as genre
  3. FROM store.books
  4. WHERE details->>'genre'='novel';
  1. id | title | genre
  2. ----+--------------------+-------
  3. 4 | Great Expectations | novel
  4. 3 | Oliver Twist | novel

Query by Attribute Values - Integer

The year attribute is an integer in the book details JSON document. Let us query the id and name of books written after 1900.

  1. cqlsh> SELECT id, details->>'name' as title, details->>'year'
  2. FROM store.books
  3. WHERE CAST(details->>'year' AS integer) > 1900;
  1. id | title | expr
  2. ----+-------------------------+------
  3. 5 | A Brief History of Time | 1988
  4. 4 | Great Expectations | 1950

Query by Attribute Values - Map

The author attribute is a map, which in turn consists of the attributes first_name and last_name. Let us fetch the ids and titles of all books written by the author William Shakespeare.

  1. cqlsh> SELECT id, details->>'name' as title,
  2. details->>'author' as author
  3. FROM store.books
  4. WHERE details->'author'->>'first_name' = 'William' AND
  5. details->'author'->>'last_name' = 'Shakespeare';
  1. id | title | author
  2. ----+---------+----------------------------------------------------
  3. 1 | Macbeth | {"first_name":"William","last_name":"Shakespeare"}
  4. 2 | Hamlet | {"first_name":"William","last_name":"Shakespeare"}

Query by Attribute Values - Array

The editors attribute is an array consisting of the first names of the editors of each of the books. We can query for the book titles where Mark is the first entry in the editors list as follows.

  1. cqlsh> SELECT id, details->>'name' as title,
  2. details->>'editors' as editors FROM store.books
  3. WHERE details->'editors'->>0 = 'Mark';
  1. id | title | editors
  2. ----+--------------+---------------------------
  3. 3 | Oliver Twist | ["Mark","Tony","Britney"]