CMake in ClickHouse

TL;DR: How to make ClickHouse compile and link faster?

For developers only! This command will likely cover most of your needs. Run it before calling ninja.

    cmake .. \
        -DCMAKE_C_COMPILER=/bin/clang-10 \
        -DCMAKE_CXX_COMPILER=/bin/clang++-10 \
        -DCMAKE_BUILD_TYPE=Debug \
        -DENABLE_CLICKHOUSE_ALL=OFF \
        -DENABLE_CLICKHOUSE_SERVER=ON \
        -DENABLE_CLICKHOUSE_CLIENT=ON \
        -DUSE_STATIC_LIBRARIES=OFF \
        -DSPLIT_SHARED_LIBRARIES=ON \
        -DENABLE_LIBRARIES=OFF \
        -DUSE_UNWIND=ON \
        -DENABLE_UTILS=OFF \
        -DENABLE_TESTS=OFF
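
The same settings can also be kept in an initial-cache script and loaded with CMake's -C option, which avoids retyping the long command line. The following is a minimal sketch, not part of the ClickHouse repository: the file name dev-cache.cmake is hypothetical, and the compiler paths simply mirror the command above and may differ on your system.

```cmake
# dev-cache.cmake (hypothetical file name): pre-populates the CMake cache.
# Load it from the build directory with:  cmake -C ../dev-cache.cmake ..
set(CMAKE_C_COMPILER   "/bin/clang-10"   CACHE FILEPATH "C compiler")
set(CMAKE_CXX_COMPILER "/bin/clang++-10" CACHE FILEPATH "C++ compiler")
set(CMAKE_BUILD_TYPE   "Debug"           CACHE STRING   "Build type")

# Build only the server and client modes.
set(ENABLE_CLICKHOUSE_ALL    OFF CACHE BOOL "")
set(ENABLE_CLICKHOUSE_SERVER ON  CACHE BOOL "")
set(ENABLE_CLICKHOUSE_CLIENT ON  CACHE BOOL "")

# Shared, split libraries link faster (developer builds only).
set(USE_STATIC_LIBRARIES   OFF CACHE BOOL "")
set(SPLIT_SHARED_LIBRARIES ON  CACHE BOOL "")

# Drop optional libraries, utilities and tests; keep libunwind.
set(ENABLE_LIBRARIES OFF CACHE BOOL "")
set(USE_UNWIND       ON  CACHE BOOL "")
set(ENABLE_UTILS     OFF CACHE BOOL "")
set(ENABLE_TESTS     OFF CACHE BOOL "")
```

Values passed with -D on the same command line can still be combined with such a file.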

CMake file types

  1. ClickHouse’s source CMake files (located in the root directory and in /src).
  2. Arch-dependent CMake files (located in /cmake/*os_name*).
  3. Library finders (search for contrib libraries, located in /cmake/find).
  4. Contrib build CMake files (used instead of the libraries’ own CMake files, located in /cmake/modules).

List of CMake flags

  • This list is auto-generated by a Python script.
  • The flag name is a link to its position in the code.
  • If an option’s default value is itself an option, it’s also a link to its position in this list.

ClickHouse modes

| Name | Default value | Description | Comment |
| --- | --- | --- | --- |
| ENABLE_CLICKHOUSE_ALL | ON | Enable all ClickHouse modes by default | The clickhouse binary is a multi purpose tool that contains multiple execution modes (client, server, etc.), each of them may be built and linked as a separate library. If you do not know what modes you need, turn this option OFF and enable SERVER and CLIENT only. |
| ENABLE_CLICKHOUSE_BENCHMARK | ENABLE_CLICKHOUSE_ALL | Queries benchmarking mode | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-benchmark/ |
| ENABLE_CLICKHOUSE_CLIENT | ENABLE_CLICKHOUSE_ALL | Client mode (interactive tui/shell that connects to the server) | |
| ENABLE_CLICKHOUSE_COMPRESSOR | ENABLE_CLICKHOUSE_ALL | Data compressor and decompressor | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-compressor/ |
| ENABLE_CLICKHOUSE_COPIER | ENABLE_CLICKHOUSE_ALL | Inter-cluster data copying mode | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-copier/ |
| ENABLE_CLICKHOUSE_EXTRACT_FROM_CONFIG | ENABLE_CLICKHOUSE_ALL | Configs processor (extract values etc.) | |
| ENABLE_CLICKHOUSE_FORMAT | ENABLE_CLICKHOUSE_ALL | Queries pretty-printer and formatter with syntax highlighting | |
| ENABLE_CLICKHOUSE_GIT_IMPORT | ENABLE_CLICKHOUSE_ALL | A tool to analyze Git repositories | https://presentations.clickhouse.tech/matemarketing_2020/ |
| ENABLE_CLICKHOUSE_INSTALL | OFF | Install ClickHouse without .deb/.rpm/.tgz packages (having the binary only) | |
| ENABLE_CLICKHOUSE_LOCAL | ENABLE_CLICKHOUSE_ALL | Local files fast processing mode | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-local/ |
| ENABLE_CLICKHOUSE_OBFUSCATOR | ENABLE_CLICKHOUSE_ALL | Table data obfuscator (convert real data to benchmark-ready one) | https://clickhouse.tech/docs/en/operations/utilities/clickhouse-obfuscator/ |
| ENABLE_CLICKHOUSE_ODBC_BRIDGE | ENABLE_CLICKHOUSE_ALL | HTTP-server working like a proxy to ODBC driver | https://clickhouse.tech/docs/en/operations/utilities/odbc-bridge/ |
| ENABLE_CLICKHOUSE_SERVER | ENABLE_CLICKHOUSE_ALL | Server mode (main mode) | |
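
A default value that is itself an option can be expressed by passing the umbrella option as the third argument of option(). The fragment below is only a sketch of that pattern using option names and descriptions from the table; it is not copied from the ClickHouse sources.

```cmake
# Sketch: one umbrella option supplies the default for each mode option.
option(ENABLE_CLICKHOUSE_ALL "Enable all ClickHouse modes by default" ON)

# Each mode defaults to whatever ENABLE_CLICKHOUSE_ALL is set to.
option(ENABLE_CLICKHOUSE_SERVER "Server mode (main mode)" ${ENABLE_CLICKHOUSE_ALL})
option(ENABLE_CLICKHOUSE_CLIENT
    "Client mode (interactive tui/shell that connects to the server)"
    ${ENABLE_CLICKHOUSE_ALL})

message(STATUS "server=${ENABLE_CLICKHOUSE_SERVER} client=${ENABLE_CLICKHOUSE_CLIENT}")
```

Because option() leaves an already cached value untouched, a value given with -D on the command line takes precedence over the default. For the same reason, the default is only consulted when the cache entry is first created, so changing the umbrella option in an existing build directory does not flip modes that are already cached.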

External libraries

Note that ClickHouse uses forks of these libraries; see https://github.com/ClickHouse-Extras.

| Name | Default value | Description | Comment |
| --- | --- | --- | --- |
| ENABLE_AMQPCPP | ENABLE_LIBRARIES | Enable AMQP-CPP | |
| ENABLE_AVRO | ENABLE_LIBRARIES | Enable Avro | Needed when using Apache Avro serialization format |
| ENABLE_BASE64 | ENABLE_LIBRARIES | Enable base64 | |
| ENABLE_BROTLI | ENABLE_LIBRARIES | Enable brotli | |
| ENABLE_CAPNP | ENABLE_LIBRARIES | Enable Cap’n Proto | |
| ENABLE_CASSANDRA | ENABLE_LIBRARIES | Enable Cassandra | |
| ENABLE_CCACHE | ENABLE_CCACHE_BY_DEFAULT | Speed up recompilation using ccache (external tool) | https://ccache.dev/ |
| ENABLE_CLANG_TIDY | OFF | Use clang-tidy static analyzer | https://clang.llvm.org/extra/clang-tidy/ |
| ENABLE_CURL | ENABLE_LIBRARIES | Enable curl | |
| ENABLE_EMBEDDED_COMPILER | ENABLE_LIBRARIES | Set to TRUE to enable support for ‘compile_expressions’ option for query execution | |
| ENABLE_FASTOPS | ENABLE_LIBRARIES | Enable fast vectorized mathematical functions library by Mikhail Parakhin | |
| ENABLE_GPERF | ENABLE_LIBRARIES | Use gperf function hash generator tool | |
| ENABLE_GRPC | ENABLE_GRPC_DEFAULT | Use gRPC | |
| ENABLE_GSASL_LIBRARY | ENABLE_LIBRARIES | Enable gsasl library | |
| ENABLE_H3 | ENABLE_LIBRARIES | Enable H3 | |
| ENABLE_HDFS | ENABLE_LIBRARIES | Enable HDFS | |
| ENABLE_ICU | ENABLE_LIBRARIES | Enable ICU | |
| ENABLE_LDAP | ENABLE_LIBRARIES | Enable LDAP | |
| ENABLE_LIBPQXX | ENABLE_LIBRARIES | Enable libpqxx | |
| ENABLE_MSGPACK | ENABLE_LIBRARIES | Enable msgpack library | |
| ENABLE_MYSQL | ENABLE_LIBRARIES | Enable MySQL | |
| ENABLE_NURAFT | ENABLE_LIBRARIES | Enable NuRaft | |
| ENABLE_ODBC | ENABLE_LIBRARIES | Enable ODBC library | |
| ENABLE_ORC | ENABLE_LIBRARIES | Enable ORC | |
| ENABLE_PARQUET | ENABLE_LIBRARIES | Enable parquet | |
| ENABLE_PROTOBUF | ENABLE_LIBRARIES | Enable protobuf | |
| ENABLE_RAPIDJSON | ENABLE_LIBRARIES | Use rapidjson | |
| ENABLE_RDKAFKA | ENABLE_LIBRARIES | Enable kafka | |
| ENABLE_ROCKSDB | ENABLE_LIBRARIES | Enable ROCKSDB | |
| ENABLE_S3 | ENABLE_LIBRARIES | Enable S3 | |
| ENABLE_SSL | ENABLE_LIBRARIES | Enable ssl | Needed when securely connecting to an external server, e.g. clickhouse-client --host … --secure |
| ENABLE_STATS | ENABLE_LIBRARIES | Enable StatsLib library | |

External libraries system/bundled mode

| Name | Default value | Description | Comment |
| --- | --- | --- | --- |
| USE_INTERNAL_AVRO_LIBRARY | ON | Set to FALSE to use system avro library instead of bundled | |
| USE_INTERNAL_AWS_S3_LIBRARY | ON | Set to FALSE to use system S3 instead of bundled (experimental; set to OFF at your own risk) | |
| USE_INTERNAL_BROTLI_LIBRARY | USE_STATIC_LIBRARIES | Set to FALSE to use system libbrotli library instead of bundled | Many systems ship only dynamic brotli libraries, so we fall back to bundled by default |
| USE_INTERNAL_CAPNP_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system capnproto library instead of bundled | |
| USE_INTERNAL_CURL | NOT_UNBUNDLED | Use internal curl library | |
| USE_INTERNAL_GRPC_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system gRPC library instead of bundled (experimental; set to OFF at your own risk) | Normally we use the internal gRPC framework. You can set USE_INTERNAL_GRPC_LIBRARY to OFF to force using the external gRPC framework, which should be installed in the system in this case. The external gRPC framework can be installed in the system by running sudo apt-get install libgrpc++-dev protobuf-compiler-grpc |
| USE_INTERNAL_GTEST_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system Google Test instead of bundled | |
| USE_INTERNAL_H3_LIBRARY | ON | Set to FALSE to use system h3 library instead of bundled | |
| USE_INTERNAL_HDFS3_LIBRARY | ON | Set to FALSE to use system HDFS3 instead of bundled (experimental; set to OFF at your own risk) | |
| USE_INTERNAL_ICU_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system ICU library instead of bundled | |
| USE_INTERNAL_LDAP_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system LDAP library instead of bundled | |
| USE_INTERNAL_LIBCXX_LIBRARY | USE_INTERNAL_LIBCXX_LIBRARY_DEFAULT | Disable to use system libcxx and libcxxabi libraries instead of bundled | |
| USE_INTERNAL_LIBGSASL_LIBRARY | USE_STATIC_LIBRARIES | Set to FALSE to use system libgsasl library instead of bundled | When USE_STATIC_LIBRARIES is set, we usually need to pick up a lot of dependencies for libgsasl |
| USE_INTERNAL_LIBXML2_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system libxml2 library instead of bundled | |
| USE_INTERNAL_LLVM_LIBRARY | NOT_UNBUNDLED | Use bundled or system LLVM library. | |
| USE_INTERNAL_MSGPACK_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system msgpack library instead of bundled | |
| USE_INTERNAL_MYSQL_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system mysqlclient library instead of bundled | |
| USE_INTERNAL_ODBC_LIBRARY | NOT_UNBUNDLED | Use internal ODBC library | |
| USE_INTERNAL_ORC_LIBRARY | ON | Set to FALSE to use system ORC instead of bundled (experimental; set to OFF at your own risk) | |
| USE_INTERNAL_PARQUET_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system parquet library instead of bundled | |
| USE_INTERNAL_POCO_LIBRARY | ON | Use internal Poco library | |
| USE_INTERNAL_PROTOBUF_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system protobuf instead of bundled (experimental; set to OFF at your own risk) | Normally we use the internal protobuf library. You can set USE_INTERNAL_PROTOBUF_LIBRARY to OFF to force using the external protobuf library, which should be installed in the system in this case. The external protobuf library can be installed in the system by running sudo apt-get install libprotobuf-dev protobuf-compiler libprotoc-dev |
| USE_INTERNAL_RAPIDJSON_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system rapidjson library instead of bundled | |
| USE_INTERNAL_RDKAFKA_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system librdkafka instead of the bundled | |
| USE_INTERNAL_RE2_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system re2 library instead of bundled [slower] | |
| USE_INTERNAL_ROCKSDB_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system ROCKSDB library instead of bundled | |
| USE_INTERNAL_SNAPPY_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system snappy library instead of bundled | |
| USE_INTERNAL_SPARSEHASH_LIBRARY | ON | Set to FALSE to use system sparsehash library instead of bundled | |
| USE_INTERNAL_SSL_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system ssl library instead of bundled | |
| USE_INTERNAL_ZLIB_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system zlib library instead of bundled | |
| USE_INTERNAL_ZSTD_LIBRARY | NOT_UNBUNDLED | Set to FALSE to use system zstd library instead of bundled | |
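
The relationship between UNBUNDLED, NOT_UNBUNDLED and a USE_INTERNAL_* option can be sketched as follows. This is an illustration under assumptions, not the project's actual finder code: in particular, the fallback to the bundled copy when the system library is missing is a hypothetical choice made for the example.

```cmake
option(UNBUNDLED "Use system libraries instead of ones in contrib/" OFF)

# NOT_UNBUNDLED is simply the inverse of UNBUNDLED, usable as a default value.
if (UNBUNDLED)
    set(NOT_UNBUNDLED OFF)
else ()
    set(NOT_UNBUNDLED ON)
endif ()

option(USE_INTERNAL_ZLIB_LIBRARY
    "Set to FALSE to use system zlib library instead of bundled"
    ${NOT_UNBUNDLED})

if (NOT USE_INTERNAL_ZLIB_LIBRARY)
    # FindZLIB is a module shipped with CMake; it sets ZLIB_FOUND.
    find_package(ZLIB)
    if (NOT ZLIB_FOUND)
        # Assumed behaviour for this sketch: fall back to the bundled copy.
        message(WARNING "System zlib not found, falling back to the bundled copy")
        set(USE_INTERNAL_ZLIB_LIBRARY ON)
    endif ()
endif ()
```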

Other flags

| Name | Default value | Description | Comment |
| --- | --- | --- | --- |
| ADD_GDB_INDEX_FOR_GOLD | OFF | Add .gdb-index to resulting binaries for gold linker. | Ignored if lld is used |
| ARCH_NATIVE | OFF | Add -march=native compiler flag | |
| CLICKHOUSE_SPLIT_BINARY | OFF | Make several binaries (clickhouse-server, clickhouse-client etc.) instead of one bundled | |
| COMPILER_PIPE | ON | -pipe compiler option | Less /tmp usage, more RAM usage. |
| ENABLE_CHECK_HEAVY_BUILDS | OFF | Don’t allow C++ translation units to compile too long or to take too much memory while compiling | |
| ENABLE_FUZZING | OFF | Fuzz testing using libFuzzer | Implies WITH_COVERAGE |
| ENABLE_LIBRARIES | ON | Enable all external libraries by default | Turns on all external libs like s3, kafka, ODBC, … |
| ENABLE_MULTITARGET_CODE | ON | Enable platform-dependent code | ClickHouse developers may use platform-dependent code under some macro (e.g. ifdef ENABLE_MULTITARGET). If turned ON, this option defines such a macro. See src/Functions/TargetSpecific.h |
| ENABLE_TESTS | ON | Provide unit_test_dbms target with Google.Test unit tests | If turned ON, assumes the user has either the system GTest library or the bundled one. |
| ENABLE_THINLTO | ON | Clang-specific link time optimization | https://clang.llvm.org/docs/ThinLTO.html Applies to clang only. Disabled when building with tests or sanitizers. |
| FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION | ON | Stop/Fail CMake configuration if some ENABLE_XXX option is defined (either ON or OFF) but is not possible to satisfy | If turned off: e.g. when ENABLE_FOO is ON, but the FOO tool was not found, CMake will continue. |
| GLIBC_COMPATIBILITY | ON | Enable compatibility with older glibc libraries. | Only for Linux, x86_64. Implies ENABLE_FASTMEMCPY |
| LINKER_NAME | OFF | Linker name or full path | Example values: lld-10, gold. |
| LLVM_HAS_RTTI | ON | Enable if LLVM was built with RTTI enabled | |
| MAKE_STATIC_LIBRARIES | USE_STATIC_LIBRARIES | Disable to make shared libraries | |
| PARALLEL_COMPILE_JOBS | "" | Maximum number of concurrent compilation jobs | 1 if not set |
| PARALLEL_LINK_JOBS | "" | Maximum number of concurrent link jobs | 1 if not set |
| SANITIZE | "" | Enable one of the code sanitizers | Possible values: address (ASan), memory (MSan), thread (TSan), undefined (UBSan), "" (no sanitizing) |
| SPLIT_SHARED_LIBRARIES | OFF | Keep all internal libraries as separate .so files | DEVELOPER ONLY. Faster linking if turned on. |
| STRIP_DEBUG_SYMBOLS_FUNCTIONS | STRIP_DSF_DEFAULT | Do not generate debugger info for ClickHouse functions | Provides faster linking and lower binary size. Tradeoff is the inability to debug some source files with e.g. gdb (empty stack frames and no local variables). |
| UNBUNDLED | OFF | Use system libraries instead of ones in contrib/ | We recommend avoiding this mode for production builds because we can’t guarantee all needed libraries exist in your system. This mode exists for enthusiastic developers who are searching for trouble. Useful for maintainers of OS packages. |
| USE_INCLUDE_WHAT_YOU_USE | OFF | Automatically reduce unneeded includes in source code (external tool) | https://github.com/include-what-you-use/include-what-you-use |
| USE_LIBCXX | NOT_UNBUNDLED | Use libc++ and libc++abi instead of libstdc++ | |
| USE_SENTRY | ENABLE_LIBRARIES | Use Sentry | |
| USE_SIMDJSON | ENABLE_LIBRARIES | Use simdjson | |
| USE_SNAPPY | ENABLE_LIBRARIES | Enable snappy library | |
| USE_STATIC_LIBRARIES | ON | Disable to use shared libraries | |
| USE_UNWIND | ENABLE_LIBRARIES | Enable libunwind (better stacktraces) | |
| WERROR | OFF | Enable -Werror compiler option | Using system libs can cause a lot of warnings in includes (on macro expansion). |
| WEVERYTHING | ON | Enable -Weverything option with some exceptions. | Adds some warnings that are not available even with -Wall -Wextra -Wpedantic. Intended for exploration of new compiler warnings that may be found useful. Applies to clang only |
| WITH_COVERAGE | OFF | Profile the resulting binary/binaries | Compiler-specific coverage flags, e.g. -fcoverage-mapping for clang |
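
The effect of FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION can be illustrated by choosing the message level once and reusing it wherever an ENABLE_XXX option cannot be satisfied. The sketch below is an assumption about how such a switch might be wired; the variable UNSATISFIED_LEVEL, the option ENABLE_FOO and the foo tool are hypothetical names used only for illustration.

```cmake
option(FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION
    "Stop/Fail CMake configuration if some ENABLE_XXX option is defined but is not possible to satisfy"
    ON)

# Pick the severity once (hypothetical variable name).
if (FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION)
    set(UNSATISFIED_LEVEL FATAL_ERROR)
else ()
    set(UNSATISFIED_LEVEL WARNING)
endif ()

option(ENABLE_FOO "Use the (hypothetical) foo tool" ON)

if (ENABLE_FOO)
    find_program(FOO_EXECUTABLE foo)
    if (NOT FOO_EXECUTABLE)
        # FATAL_ERROR stops configuration here; WARNING lets it continue.
        message(${UNSATISFIED_LEVEL} "ENABLE_FOO is ON, but the foo tool was not found")
        set(ENABLE_FOO OFF)
    endif ()
endif ()
```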

Developer’s guide for adding new CMake options

Don’t be obvious. Be informative.

Bad:

    option (ENABLE_TESTS "Enables testing" OFF)

This description is quite useless, as it neither gives the reader any additional information nor explains the option’s purpose.

Better:

    option(ENABLE_TESTS "Provide unit_test_dbms target with Google.Test unit tests" OFF)

If the option’s purpose can’t be guessed from its name, or the guess may be misleading, or the option has some
preconditions, leave a comment above the option() line and explain what it does.
Ideally, link to the docs page (if it exists).
The comment is parsed into a separate column (see below).

Even better:

    # implies ${TESTS_ARE_ENABLED}
    # see tests/CMakeLists.txt for implementation detail.
    option(ENABLE_TESTS "Provide unit_test_dbms target with Google.Test unit tests" OFF)

If the option’s state could produce an unwanted (or unusual) result, explicitly warn the user.

Suppose you have an option that may strip debug symbols from part of ClickHouse.
This can speed up linking, but produces a binary that cannot be debugged.
In that case, prefer explicitly raising a warning that tells developers they may be doing something wrong.
Also, such options should be disabled by default where applicable.

Bad:

    option(STRIP_DEBUG_SYMBOLS_FUNCTIONS
        "Do not generate debugger info for ClickHouse functions."
        ${STRIP_DSF_DEFAULT})
    if (STRIP_DEBUG_SYMBOLS_FUNCTIONS)
        target_compile_options(clickhouse_functions PRIVATE "-g0")
    endif()

Better:

    # Provides faster linking and lower binary size.
    # Tradeoff is the inability to debug some source files with e.g. gdb
    # (empty stack frames and no local variables).
    option(STRIP_DEBUG_SYMBOLS_FUNCTIONS
        "Do not generate debugger info for ClickHouse functions."
        ${STRIP_DSF_DEFAULT})
    if (STRIP_DEBUG_SYMBOLS_FUNCTIONS)
        message(WARNING "Not generating debugger info for ClickHouse functions")
        target_compile_options(clickhouse_functions PRIVATE "-g0")
    endif()

In the option’s description, explain WHAT the option does rather than WHY it does something.

The WHY explanation should be placed in the comment.
You may find that the option’s name is self-descriptive.

Bad:

    option(ENABLE_THINLTO "Enable Thin LTO. Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON)

Better:

    # Only applicable for clang.
    # Turned off when building with tests or sanitizers.
    option(ENABLE_THINLTO "Clang-specific link time optimisation" ON)

Don’t assume other developers know as much as you do.

ClickHouse uses many tools that an ordinary developer may not know. If you are in doubt, give a link to
the tool’s docs. It won’t take much of your time.

Bad:

    option(ENABLE_THINLTO "Enable Thin LTO. Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON)

Better (combined with the above hint):

    # https://clang.llvm.org/docs/ThinLTO.html
    # Only applicable for clang.
    # Turned off when building with tests or sanitizers.
    option(ENABLE_THINLTO "Clang-specific link time optimisation" ON)

Another example, bad:

    option (USE_INCLUDE_WHAT_YOU_USE "Use 'include-what-you-use' tool" OFF)

Better:

    # https://github.com/include-what-you-use/include-what-you-use
    option (USE_INCLUDE_WHAT_YOU_USE "Reduce unneeded #includes (external tool)" OFF)

Prefer consistent default values.

CMake accepts a plethora of values representing boolean true/false, e.g. 1, ON, YES, and so on.
Prefer the ON/OFF values where possible.
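
To see why a single spelling is worth agreeing on, the short script below checks how CMake's if() command interprets several boolean constants. It is a minimal sketch meant to be run in script mode (cmake -P); the file name truthiness.cmake is hypothetical.

```cmake
# truthiness.cmake (hypothetical file name); run with: cmake -P truthiness.cmake
cmake_minimum_required(VERSION 3.10)

# Each value is expanded and evaluated by if() as a boolean constant.
foreach (value 1 ON YES TRUE Y 0 OFF NO FALSE N)
    if (${value})
        message(STATUS "${value} is treated as true")
    else ()
        message(STATUS "${value} is treated as false")
    endif ()
endforeach ()
```

All of these spellings are accepted, which is exactly why mixing them makes option defaults harder to scan; writing every default as ON or OFF keeps the generated flag list uniform.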