Apache Cassandra Compatibility in YCQL

Attention: This page documents an earlier version. Go to the latest (v2.1) version.

Do INSERTs do “upserts” by default? How do I insert data only if it is absent?

By default, inserts overwrite data on primary key collisions, so INSERTs do an upsert. This is an intended CQL feature. To insert data only if the primary key is not already present, add an “IF NOT EXISTS” clause to the INSERT statement. This causes the INSERT to fail if the row already exists, leaving the existing row unchanged.

Here is an example from CQL:

  INSERT INTO mycompany.users (id, lastname, firstname)
  VALUES (100, 'Smith', 'John')
  IF NOT EXISTS;

Can I have collection data types in the partition key? Will I be able to do partial matches on that collection data type?

Yes, you can have collection data types as primary keys as long as they are marked FROZEN. Collection types that are marked FROZEN do not support partial matches.
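
As a minimal sketch of the answer above, the following YCQL declares a frozen list as the partition key. The table and column names are hypothetical and chosen only for illustration.

```cql
-- Hypothetical table: a FROZEN list serves as the partition key.
CREATE TABLE mycompany.directory (
  path FROZEN<LIST<TEXT>> PRIMARY KEY,
  owner TEXT
);

INSERT INTO mycompany.directory (path, owner)
VALUES (['home', 'docs', 'report.txt'], 'John');

-- The lookup must supply the entire collection value.
SELECT * FROM mycompany.directory
WHERE path = ['home', 'docs', 'report.txt'];
```

Because the frozen value is treated as a single unit, a lookup that supplies only some of the elements, such as ['home'], does not match this row.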

What is the difference between a COUNTER type and INTEGER types?

Unlike in Apache Cassandra, the Yugabyte COUNTER type is almost the same as the INTEGER types. Increments in Yugabyte do not need lightweight transactions requiring 4 round trips; they are performed efficiently with just one round trip.
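
A counter increment can be sketched in YCQL as follows. The table and column names are hypothetical.

```cql
-- Hypothetical table with a single COUNTER column.
CREATE TABLE mycompany.page_views (
  page TEXT PRIMARY KEY,
  views COUNTER
);

-- Increment the counter in place.
UPDATE mycompany.page_views SET views = views + 1 WHERE page = 'home';

SELECT views FROM mycompany.page_views WHERE page = 'home';
```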

How is ‘USING TIMESTAMP’ different in Yugabyte?

In Apache Cassandra, the highest timestamp provided always wins. Example:

INSERT with timestamp far in the future.

  > INSERT INTO table (c1, c2, c3) VALUES (1, 2, 3) USING TIMESTAMP 1607681258727447;
  > SELECT * FROM table;
   c1 | c2 | c3
  ----+----+----
    1 |  2 |  3

An INSERT at the current timestamp does not overwrite the previous value, which was written at a higher timestamp.

  > INSERT INTO table (c1, c2, c3) VALUES (1, 2, 4);
  > SELECT * FROM table;
   c1 | c2 | c3
  ----+----+----
    1 |  2 |  3

In Yugabyte, on the other hand, INSERTs and UPDATEs without the USING TIMESTAMP clause always overwrite older values, for efficiency. When the USING TIMESTAMP clause is present, the appropriate timestamp ordering is performed. Example:

  > INSERT INTO table (c1, c2, c3) VALUES (1, 2, 3) USING TIMESTAMP 1000;
  > SELECT * FROM table;
   c1 | c2 | c3
  ----+----+----
    1 |  2 |  3

An INSERT with a timestamp far in the future overwrites the old value.

  > INSERT INTO table (c1, c2, c3) VALUES (1, 2, 4) USING TIMESTAMP 1607681258727447;
  > SELECT * FROM table;
   c1 | c2 | c3
  ----+----+----
    1 |  2 |  4

INSERT without ‘USING TIMESTAMP’ will always overwrite.

  > INSERT INTO table (c1, c2, c3) VALUES (1, 2, 5);
  > SELECT * FROM table;
   c1 | c2 | c3
  ----+----+----
    1 |  2 |  5