Hive Dialect

Starting from 1.11.0, Flink allows users to write SQL statements in Hive syntax when the Hive dialect is used. By providing compatibility with Hive syntax, we aim to improve interoperability with Hive and reduce the cases in which users need to switch between Flink and Hive to execute different statements.

Use Hive Dialect

Flink currently supports two SQL dialects: default and hive. You need to switch to the Hive dialect before you can write in Hive syntax. The following describes how to set the dialect with the SQL Client and the Table API. Note that you can switch dialects dynamically for each statement you execute; there is no need to restart a session to use a different dialect.

SQL Client

The SQL dialect can be specified via the table.sql-dialect property, so you can set the initial dialect in the configuration section of the YAML file for your SQL Client.

    execution:
      planner: blink
      type: batch
      result-mode: table

    configuration:
      table.sql-dialect: hive

You can also set the dialect after the SQL Client has launched.

    Flink SQL> set table.sql-dialect=hive; -- to use hive dialect
    [INFO] Session property has been set.
    Flink SQL> set table.sql-dialect=default; -- to use default dialect
    [INFO] Session property has been set.
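
Because the dialect can be changed per statement, one session can mix Hive-syntax DDL with queries run in the default dialect. The following is a minimal sketch of such a session, assuming a HiveCatalog is already the current catalog; the table name page_views and its columns are hypothetical.

```sql
-- Switch to the Hive dialect to use Hive-only DDL syntax such as STORED AS.
set table.sql-dialect=hive;
CREATE TABLE IF NOT EXISTS page_views (user_id STRING, url STRING) STORED AS ORC;

-- Switch back to the default dialect before running queries.
set table.sql-dialect=default;
SELECT user_id, COUNT(*) AS views FROM page_views GROUP BY user_id;
```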

Table API

You can set the dialect for your TableEnvironment through the Table API.

    EnvironmentSettings settings = EnvironmentSettings.newInstance().useBlinkPlanner()...build();
    TableEnvironment tableEnv = TableEnvironment.create(settings);
    // to use hive dialect
    tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
    // to use default dialect
    tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);

DDL

This section lists the DDL statements supported by the Hive dialect. We mainly focus on the syntax here; refer to the Hive documentation for the semantics of each DDL statement.

DATABASE

Show

    SHOW DATABASES;

Create

    CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
      [COMMENT database_comment]
      [LOCATION fs_path]
      [WITH DBPROPERTIES (property_name=property_value, ...)];
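
As an illustrative instance of the syntax above, creating a database might look like the following. The database name, path, and property are hypothetical, and property names and values are written as quoted strings, as is usual in Hive.

```sql
CREATE DATABASE IF NOT EXISTS sales_db
COMMENT 'hypothetical database for sales data'
LOCATION '/warehouse/sales_db'
WITH DBPROPERTIES ('owner_team'='analytics');
```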

Alter

Update Properties
    ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);
Update Owner
    ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;
Update Location
    ALTER (DATABASE|SCHEMA) database_name SET LOCATION fs_path;
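
A short sketch of the three ALTER forms, using a hypothetical database sales_db, a hypothetical user alice, and a hypothetical path. Whether SET LOCATION works depends on the Hive version in use.

```sql
-- Update properties
ALTER DATABASE sales_db SET DBPROPERTIES ('owner_team'='finance');

-- Update owner
ALTER DATABASE sales_db SET OWNER USER alice;

-- Update location
ALTER DATABASE sales_db SET LOCATION '/warehouse/sales_db_v2';
```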

Drop

    DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

Use

    USE database_name;

TABLE

Show

    SHOW TABLES;

Create

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
      [(col_name data_type [column_constraint] [COMMENT col_comment], ... [table_constraint])]
      [COMMENT table_comment]
      [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
      [
        [ROW FORMAT row_format]
        [STORED AS file_format]
      ]
      [LOCATION fs_path]
      [TBLPROPERTIES (property_name=property_value, ...)]

    row_format:
      : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
          [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
          [NULL DEFINED AS char]
      | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, ...)]

    file_format:
      : SEQUENCEFILE
      | TEXTFILE
      | RCFILE
      | ORC
      | PARQUET
      | AVRO
      | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname

    column_constraint:
      : NOT NULL [[ENABLE|DISABLE] [VALIDATE|NOVALIDATE] [RELY|NORELY]]

    table_constraint:
      : [CONSTRAINT constraint_name] PRIMARY KEY (col_name, ...) [[ENABLE|DISABLE] [VALIDATE|NOVALIDATE] [RELY|NORELY]]
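
To make the grammar above more concrete, here is a sketch of a partitioned, delimited text table. The table name, columns, path, and property are all hypothetical, and constraints are left out for brevity.

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS orders (
  order_id BIGINT COMMENT 'unique order id',
  amount DECIMAL(10, 2)
)
COMMENT 'hypothetical orders table'
PARTITIONED BY (dt STRING, country STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/warehouse/sales_db/orders'
TBLPROPERTIES ('created_by'='flink');
```

The partition columns dt and country appear only in the PARTITIONED BY clause and not in the regular column list, as is usual in Hive.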

Alter

Rename
    ALTER TABLE table_name RENAME TO new_table_name;
Update Properties
    ALTER TABLE table_name SET TBLPROPERTIES (property_name = property_value, property_name = property_value, ... );
Update Location
    ALTER TABLE table_name [PARTITION partition_spec] SET LOCATION fs_path;

The partition_spec, if present, must be a full spec, i.e. it must specify values for all partition columns. When it is present, the operation is applied to the corresponding partition instead of the table.

Update File Format
    ALTER TABLE table_name [PARTITION partition_spec] SET FILEFORMAT file_format;

The partition_spec, if present, must be a full spec, i.e. it must specify values for all partition columns. When it is present, the operation is applied to the corresponding partition instead of the table.

Update SerDe Properties
    ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];
    ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties;

    serde_properties:
      : (property_name = property_value, property_name = property_value, ... )

The partition_spec, if present, must be a full spec, i.e. it must specify values for all partition columns. When it is present, the operation is applied to the corresponding partition instead of the table.
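
The difference between table-level and partition-level changes can be sketched as follows. The example assumes a hypothetical table orders partitioned by (dt, country), so a full partition_spec names both columns; the paths and property values are illustrative only.

```sql
-- Table level: no partition_spec, so the table itself is changed.
ALTER TABLE orders SET TBLPROPERTIES ('reviewed' = 'true');
ALTER TABLE orders SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
  WITH SERDEPROPERTIES ('separatorChar' = ',');

-- Partition level: a full spec with values for both dt and country.
ALTER TABLE orders PARTITION (dt='2020-07-01', country='US')
  SET LOCATION '/warehouse/archive/orders/2020-07-01/US';
ALTER TABLE orders PARTITION (dt='2020-07-01', country='US')
  SET FILEFORMAT ORC;
```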

Add Partitions
    ALTER TABLE table_name ADD [IF NOT EXISTS] (PARTITION partition_spec [LOCATION fs_path])+;
Drop Partitions
    ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec[, PARTITION partition_spec, ...];
Add/Replace Columns
    ALTER TABLE table_name
      ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)
      [CASCADE|RESTRICT]
Change Column
    ALTER TABLE table_name CHANGE [COLUMN] col_old_name col_new_name column_type
      [COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT];
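
The partition and column statements above might be used as in the following sketch. It assumes a hypothetical table orders with columns (order_id, amount) and partition columns (dt, country); the new column, the renamed column, and the paths are illustrative. Note that ADD takes a whitespace-separated list of partitions, while DROP takes a comma-separated one.

```sql
-- Add two partitions in one statement; only the first has an explicit location.
ALTER TABLE orders ADD IF NOT EXISTS
  PARTITION (dt='2020-07-01', country='US') LOCATION '/warehouse/sales_db/orders/2020-07-01/US'
  PARTITION (dt='2020-07-01', country='DE');

-- Drop both partitions again.
ALTER TABLE orders DROP IF EXISTS
  PARTITION (dt='2020-07-01', country='US'),
  PARTITION (dt='2020-07-01', country='DE');

-- Append a column to the table schema.
ALTER TABLE orders ADD COLUMNS (channel STRING COMMENT 'sales channel');

-- Rename a column and keep its type.
ALTER TABLE orders CHANGE COLUMN amount total_amount DECIMAL(10, 2) COMMENT 'order total';
```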

Drop

    DROP TABLE [IF EXISTS] table_name;

VIEW

Create

    CREATE VIEW [IF NOT EXISTS] view_name [(column_name, ...) ]
      [COMMENT view_comment]
      [TBLPROPERTIES (property_name = property_value, ...)]
      AS SELECT ...;
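
For illustration, a view over a hypothetical orders table (with columns order_id, amount, and country) could be defined like this; the view name, comment, and property are likewise hypothetical.

```sql
CREATE VIEW IF NOT EXISTS us_orders (order_id, amount)
COMMENT 'hypothetical view over US orders'
TBLPROPERTIES ('created_by' = 'flink')
AS SELECT order_id, amount FROM orders WHERE country = 'US';
```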

Alter

NOTE: Altering a view only works with the Table API; it is not supported via the SQL Client.

Rename
    ALTER VIEW view_name RENAME TO new_view_name;
Update Properties
    ALTER VIEW view_name SET TBLPROPERTIES (property_name = property_value, ... );
Update As Select
    ALTER VIEW view_name AS select_statement;
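
A sketch of the three statements applied to a hypothetical view us_orders over a hypothetical orders table. Given the note above, these statements would be submitted through the Table API rather than typed into the SQL Client.

```sql
-- Rename
ALTER VIEW us_orders RENAME TO us_orders_v2;

-- Update properties
ALTER VIEW us_orders_v2 SET TBLPROPERTIES ('reviewed' = 'true');

-- Replace the view's defining query
ALTER VIEW us_orders_v2 AS SELECT order_id, amount FROM orders WHERE country = 'US' AND amount > 0;
```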

Drop

    DROP VIEW [IF EXISTS] view_name;

FUNCTION

Show

    SHOW FUNCTIONS;

Create

    CREATE FUNCTION function_name AS class_name;

Drop

    DROP FUNCTION [IF EXISTS] function_name;
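
A minimal sketch of the function lifecycle. The function name my_abs is hypothetical; the class is Hive's built-in GenericUDFAbs, used here only as a stand-in for your own UDF class, and the class name is given as a quoted string on the assumption that it is available on the classpath.

```sql
-- Register a function backed by a Hive UDF class.
CREATE FUNCTION my_abs AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs';

-- List functions, then remove the one created above.
SHOW FUNCTIONS;
DROP FUNCTION IF EXISTS my_abs;
```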

DML

INSERT

    INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;

The partition_spec, if present, can be either a full spec or a partial spec. If the partition_spec is a partial spec, the dynamic partition column names can be omitted.
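
The difference between a full and a partial spec can be sketched as follows. The example assumes a hypothetical target table orders with columns (order_id, amount) partitioned by (dt, country), and a hypothetical source table staging_orders. It also assumes the usual Hive behavior in which values for dynamic partition columns are taken from the trailing columns of the SELECT.

```sql
-- Full spec: both partition columns are static.
INSERT INTO TABLE orders PARTITION (dt='2020-07-01', country='US')
SELECT order_id, amount FROM staging_orders WHERE country = 'US';

-- Partial spec: dt is static, country is dynamic and its name is omitted from the spec.
INSERT OVERWRITE TABLE orders PARTITION (dt='2020-07-01')
SELECT order_id, amount, country FROM staging_orders;
```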

DQL

At the moment, the Hive dialect supports the same syntax as Flink SQL for DQL. Refer to Flink SQL queries for more details. It is recommended to switch to the default dialect to execute DQL statements.

Notice

The following are some precautions for using the Hive dialect.

  • The Hive dialect should only be used to manipulate Hive tables, not generic tables, and it should be used together with a HiveCatalog.
  • While all Hive versions support the same syntax, whether a specific feature is available still depends on the Hive version you use. For example, updating the database location is only supported in Hive-2.4.0 or later.
  • Hive and Calcite have different sets of reserved keywords. For example, default is a reserved keyword in Calcite and a non-reserved keyword in Hive. Even with the Hive dialect, you have to quote such keywords with backticks ( ` ) in order to use them as identifiers.
  • Due to expanded query incompatibility, views created in Flink cannot be queried in Hive.
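
As a small illustration of the reserved-keyword point above, the database named default has to be quoted when it is used as an identifier. The table name kv and its columns are hypothetical.

```sql
-- default is reserved in Calcite, so it is quoted with backticks even in the Hive dialect.
USE `default`;
CREATE TABLE IF NOT EXISTS `default`.kv (k STRING, v STRING);
```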