Index Command Line Interface

Usage

The index executable will be located under the bin directory in the installation.

For example: <path to installation directory>/bin/index. The executable must be run from the bin directory because it uses relative paths by default.
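Because the tool resolves relative paths against the current directory, a small wrapper can make it callable from anywhere. The following is only a sketch: the function name run_index and the installation path passed to it are hypothetical and not part of the tool.

```shell
# Hypothetical helper (not part of the index tool): run ./index from its
# own bin directory so the default relative paths resolve correctly.
# $1 is the installation directory; remaining arguments go to ./index.
run_index() {
    install_dir="$1"
    shift
    # A subshell keeps the caller's working directory unchanged.
    (cd "$install_dir/bin" && ./index "$@")
}

# Example call (the installation path is an assumption):
#   run_index /path/to/installation -v -c ../etc --table hive.schema.table show
```

Note that relative arguments such as ../etc are still interpreted relative to the bin directory, not the directory the wrapper was called from.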

  Usage: index [-v] [--debug] [--disableLocking] --table=<table>
               [-c=<configDirPath>] [--column=<columns>[,<columns>...]]...
               [--partition=<partitions>[,<partitions>...]]...
               [--type=<indexTypes>[,<indexTypes>...]]...
               [-p=<plugins>[,<plugins>...]]... <command>

  Using this index tool, you can CREATE, SHOW and DELETE indexes.
  Supported index types: BITMAP, BLOOM, MINMAX
  Supported index stores: LOCAL, HDFS (must be configured in {--config}/config.properties)
  Supported data sources: HIVE using ORC files (must be configured in {--config}/catalog/catalog_name.properties)

    <command>
        command types, e.g. create, delete, show; Note: delete command works at column level only.
    --column=<columns>[,<columns>...]
        column, comma separated format for multiple columns
    --debug
        if enabled the original data for each split will also be written to a file alongside the index
    --disableLocking
        by default locking is enabled at the table level; if locking is disabled, the user must ensure that the same data is not indexed by multiple callers at the same time (indexing different columns or partitions in parallel is allowed)
    --partition=<partitions>[,<partitions>...]
        only create index for these partitions, comma separated format for multiple partitions
    --table=<table>
        fully qualified table name
    --type=<indexTypes>[,<indexTypes>...]
        index type, comma separated format for multiple types (supported types: BLOOM, BITMAP, MINMAX)
    -c, --config=<configDirPath>
        root folder of openLooKeng etc directory (default: ../etc)
    -p, --plugins=<plugins>[,<plugins>...]
        plugins dir or file (default: ./hetu-heuristic-index/plugins)
    -v
        verbose

Examples

Create index

  $ ./index -v -c ../etc --table hive.schema.table --column column1,column2 --type bloom,minmax,bitmap --partition p=part1 create

Show index

  $ ./index -v -c ../etc --table hive.schema.table show

Delete index

Note: an index can only be deleted at the table or column level, i.e. all index types for that table or column will be deleted.

  $ ./index -v -c ../etc --table hive.schema.table --column column1 delete

Notes on resource usage

Memory

By default, the JVM's default MaxHeapSize is used (to check its value, run java -XX:+PrintFlagsFinal -version | grep MaxHeapSize). For better performance, it is recommended to increase the MaxHeapSize by setting the -Xmx value:

  export JAVA_TOOL_OPTIONS="-Xmx100G"

In this example the MaxHeapSize will be set to 100G.
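If exporting the variable for the whole shell session is not desired, the heap setting can be limited to a single invocation. The sketch below uses the standard env utility; the helper name with_heap is hypothetical and the heap size is only an example value.

```shell
# Hypothetical helper: run one command with JAVA_TOOL_OPTIONS set to the
# given heap size, without exporting it into the current shell session.
# $1 is the heap size (for example 100G); the rest is the command to run.
with_heap() {
    heap="$1"
    shift
    env JAVA_TOOL_OPTIONS="-Xmx$heap" "$@"
}

# Example call, run from the bin directory:
#   with_heap 100G ./index -v -c ../etc --table hive.schema.table show
```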

Indexing in parallel

If creating the index for a large table is too slow on one machine, you can create the index for different partitions in parallel on different machines. This requires setting the --disableLocking flag and specifying the partition(s). For example:

On machine 1:

  $ ./index -v --disableLocking -c ../etc --table hive.schema.table --column column1,column2 --type bloom,minmax,bitmap --partition p=part1 create

On machine 2:

  $ ./index -v --disableLocking -c ../etc --table hive.schema.table --column column1,column2 --type bloom,minmax,bitmap --partition p=part2 create
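
With more than a few partitions, the per-machine commands can be generated instead of typed by hand. The following sketch only prints the commands and does not run them; the function name index_cmd_for_partition is hypothetical, and the table, columns, types and partition values are the placeholders used in the examples above.

```shell
# Hypothetical helper: print (not execute) the create command for one
# partition, using the flags documented in the usage text.
index_cmd_for_partition() {
    partition="$1"
    echo "./index -v --disableLocking -c ../etc --table hive.schema.table --column column1,column2 --type bloom,minmax,bitmap --partition $partition create"
}

# One command per partition; run each from the bin directory of a
# different machine.
for p in p=part1 p=part2; do
    index_cmd_for_partition "$p"
done
```

Since locking is disabled, it remains the user's responsibility to ensure that no two machines index the same partition and columns at the same time.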