Apache HBase

Attention: This page documents an earlier version. Go to the latest (v2.1) version.

The following are the key areas of difference between YugabyteDB and Apache HBase.

1. Simpler Software Stack

HBase relies on HDFS (another complex piece of infrastructure, used for data replication) and on ZooKeeper for leader election, failure detection, and so on. Running HBase smoothly therefore carries significant ongoing operational overhead.

2. High Availability / Fast-Failover

In HBase, when a region server dies, the shards/regions on that server can be unavailable for on the order of 60 seconds or more. This is because the HBase master first needs to wait for the server’s ephemeral node in ZooKeeper to expire, followed by the time taken to split the transaction logs into per-shard recovery logs, and the time taken for a new server to replay the edits from the transaction log before the shard is available to take IO. In contrast, in YugabyteDB, the tablet-peers are hot standbys; within a few heartbeats (about a few seconds) they detect the failure of the leader and initiate leader election.
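The difference between the two recovery paths is mostly arithmetic: the HBase phases run one after another, so their durations add up, while a hot standby only has to notice missed heartbeats and win an election. The Python sketch below models that sum. The class names and every number in it are hypothetical, chosen only to match the rough magnitudes mentioned above ("60 seconds or more" versus "a few seconds"); they are not measurements or default settings of either system.

```python
from dataclasses import dataclass


@dataclass
class HBaseRecovery:
    """Sequential recovery phases after a region-server death, in seconds."""
    zk_session_expiry_s: float  # wait for the ephemeral node to expire
    log_split_s: float          # split transaction logs into per-shard logs
    log_replay_s: float         # replay edits on the new server

    def unavailability_s(self) -> float:
        # The phases run one after another, so their durations add up.
        return self.zk_session_expiry_s + self.log_split_s + self.log_replay_s


@dataclass
class HotStandbyFailover:
    """Failure detection by missed heartbeats, then a leader election."""
    heartbeat_interval_s: float
    missed_heartbeats: int
    election_s: float

    def unavailability_s(self) -> float:
        detection = self.heartbeat_interval_s * self.missed_heartbeats
        return detection + self.election_s


# Hypothetical values, for illustration only (not measured, not defaults).
hbase = HBaseRecovery(zk_session_expiry_s=40.0, log_split_s=10.0, log_replay_s=15.0)
standby = HotStandbyFailover(heartbeat_interval_s=0.5, missed_heartbeats=6, election_s=1.0)

print(f"HBase-style recovery:  {hbase.unavailability_s():.1f} s")
print(f"Hot-standby failover:  {standby.unavailability_s():.1f} s")
```

With these assumed inputs the sequential path totals 65 seconds and the hot-standby path 4 seconds. Changing any single phase shows which term dominates the window.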

3. C++ Implementation

Avoids GC tuning; can run better on large-memory machines.

Richer data model: YugabyteDB offers multi-model/multi-API access through CQL and Redis (and SQL in the future). Rather than dealing with just byte keys and values, YugabyteDB offers a rich set of scalar types (int, text, decimal, binary, timestamp, etc.) and composite types (such as collections, UDTs, etc.).
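To illustrate what "just byte keys and values" means for application code, the sketch below contrasts the two styles in plain Python. It does not use the HBase or YugabyteDB client APIs; the function names, the `Order` class, and the byte layout are all hypothetical. In the byte-oriented style the application owns the encoding of every field, while in the typed style each field carries its own type and a collection is a first-class value.

```python
import struct
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Tuple

HEADER = ">qH"  # hypothetical layout: 8-byte order id, 2-byte name length


def encode_order(order_id: int, customer: str, total: Decimal) -> bytes:
    """Byte-oriented style: the application packs every field by hand."""
    name = customer.encode("utf-8")
    amount = str(total).encode("ascii")
    return struct.pack(HEADER, order_id, len(name)) + name + amount


def decode_order(raw: bytes) -> Tuple[int, str, Decimal]:
    """The reader must know the exact layout to recover the fields."""
    order_id, name_len = struct.unpack_from(HEADER, raw, 0)
    offset = struct.calcsize(HEADER)
    customer = raw[offset:offset + name_len].decode("utf-8")
    total = Decimal(raw[offset + name_len:].decode("ascii"))
    return order_id, customer, total


@dataclass
class Order:
    """Typed style: scalar and collection fields are declared, not packed."""
    order_id: int
    customer: str
    total: Decimal
    tags: List[str]  # a collection field


raw = encode_order(42, "ada", Decimal("19.99"))
print(decode_order(raw))
print(Order(order_id=42, customer="ada", total=Decimal("19.99"), tags=["gift"]))
```

The hand-written codec works, but every reader and writer has to agree on the layout, which is the burden a typed data model removes.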

4. Multi-Datacenter Deployment

Flexible deployment choices across multiple DCs or availability zones. HBase provides strong consistency only within a single datacenter and offers only asynchronous replication as an alternative for cross-DC deployments.