Hudi External Table of Doris

Hudi External Table of Doris lets Doris access Hudi external tables directly, eliminating the need for cumbersome data import and applying Doris’ own OLAP capabilities to the analysis of Hudi table data.

  1. Support Hudi data sources in Doris
  2. Support joint queries between Doris tables and Hudi data source tables to perform more complex analysis

This document describes how to use this feature and what to watch out for.

Glossary

Terms in Doris

  • FE: Frontend, the front-end node of Doris, responsible for metadata management and request access
  • BE: Backend, the backend node of Doris, responsible for query execution and data storage

How to use

Create Hudi External Table

Hudi tables can be created in Doris with or without a schema. You do not need to declare the column definitions when creating the external table; Doris resolves them from the Hive Metastore when the table is queried.

  1. Create a separate external table to mount the Hudi table.
    The syntax can be viewed in HELP CREATE TABLE.

    -- Syntax
    CREATE [EXTERNAL] TABLE table_name
    [(column_definition1[, column_definition2, ...])]
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );

    -- Example: Mount hudi_table_in_hive_metastore under hudi_db_in_hive_metastore in Hive MetaStore
    CREATE TABLE `t_hudi`
    ENGINE = HUDI
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );

    -- Example: Mount the Hudi table with a schema
    CREATE TABLE `t_hudi` (
    `id` int NOT NULL COMMENT "id number",
    `name` varchar(10) NOT NULL COMMENT "user name"
    ) ENGINE = HUDI
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );

Parameter Description

  • column_definition
    • When the Hudi table is created without a schema (recommended), Doris resolves the columns from the Hive Metastore at query time.
    • When the Hudi table is created with a schema, the columns must exist in the corresponding table in the Hive Metastore.
  • ENGINE needs to be specified as HUDI
  • PROPERTIES:
    • hudi.hive.metastore.uris: the Hive Metastore service address
    • hudi.database: the name of the Hudi database in the Hive Metastore to mount
    • hudi.table: the name of the Hudi table in the Hive Metastore to mount; not required when mounting a Hudi database.

Show table structure

Use SHOW CREATE TABLE to view the table structure. The syntax can be viewed in HELP SHOW CREATE TABLE.
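
As a minimal sketch, assuming the `t_hudi` table from the examples above has already been created, the statement could be run like this:

```sql
-- Assumes the external table t_hudi from the CREATE TABLE examples exists.
SHOW CREATE TABLE `t_hudi`;
```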

Data Type Matching

The following table shows how the supported Hudi column types map to Doris types.

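| Hudi      | Doris    | Description                                    |
|-----------|----------|------------------------------------------------|
| BOOLEAN   | BOOLEAN  |                                                |
| INTEGER   | INT      |                                                |
| LONG      | BIGINT   |                                                |
| FLOAT     | FLOAT    |                                                |
| DOUBLE    | DOUBLE   |                                                |
| DATE      | DATE     |                                                |
| TIMESTAMP | DATETIME | Timestamp to Datetime with loss of precision   |
| STRING    | STRING   |                                                |
| UUID      | VARCHAR  | Use VARCHAR instead                            |
| DECIMAL   | DECIMAL  |                                                |
| TIME      | -        | not supported                                  |
| FIXED     | -        | not supported                                  |
| BINARY    | -        | not supported                                  |
| STRUCT    | -        | not supported                                  |
| LIST      | -        | not supported                                  |
| MAP       | -        | not supported                                  |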
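
To make the mapping concrete, here is a sketch of an external table declared with a schema that uses several of the mapped types. The table name `t_hudi_typed` and all column names are hypothetical; as with any table created with a schema, the columns are assumed to exist in the corresponding Hive Metastore table with the Hudi types noted in the comments.

```sql
-- Hypothetical table and columns, shown only to illustrate the type mapping.
CREATE TABLE `t_hudi_typed` (
`id` bigint NOT NULL COMMENT "LONG in Hudi",
`price` decimal(10, 2) COMMENT "DECIMAL in Hudi",
`created_at` datetime COMMENT "TIMESTAMP in Hudi, read with loss of precision",
`uid` varchar(36) COMMENT "UUID in Hudi, declared as VARCHAR"
) ENGINE = HUDI
PROPERTIES (
"hudi.database" = "hudi_db_in_hive_metastore",
"hudi.table" = "hudi_table_in_hive_metastore",
"hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
);
```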

Note:

  • The default supported version of Hudi is currently 0.10.0; other versions have not been tested. More versions will be supported in the future.

Query Usage

Once you have finished creating the Hudi external table in Doris, you can query it like a normal Doris OLAP table, except that you cannot use the Doris data models (rollup, preaggregation, materialized views, etc.).

    select * from t_hudi where k1 > 1000 and k3 = 'term' or k4 like '%doris';
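
To illustrate the joint query mentioned at the start of this document, here is a sketch that joins the `t_hudi` external table with a Doris OLAP table. The Doris table `doris_orders` and its columns `user_id` and `amount` are hypothetical, and `t_hudi` is assumed to have the `id` and `name` columns from the schema example.

```sql
-- Hypothetical: doris_orders is a regular Doris OLAP table with user_id and amount columns.
SELECT h.`name`, SUM(o.amount) AS total_amount
FROM t_hudi h
JOIN doris_orders o ON o.user_id = h.id
GROUP BY h.`name`;
```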