5.6. Hive Security Configuration

Authorization

You can enable authorization checks for the Hive connector by setting the hive.security property in the Hive catalog properties file. This property must be one of the following values:

Property Value          Description
legacy (default value)  Few authorization checks are enforced, thus allowing most operations. The config properties hive.allow-drop-table, hive.allow-rename-table, hive.allow-add-column, hive.allow-drop-column and hive.allow-rename-column are used.
read-only               Operations that read data or metadata, such as SELECT, are permitted, but none of the operations that write data or metadata, such as CREATE, INSERT or DELETE, are allowed.
file                    Authorization checks are enforced using a config file specified by the Hive configuration property security.config-file. See File Based Authorization for details.
sql-standard            Users are permitted to perform the operations as long as they have the required privileges as per the SQL standard. In this mode, Presto enforces the authorization checks for queries based on the privileges defined in Hive metastore. To alter these privileges, use the GRANT and REVOKE commands. See SQL Standard Based Authorization for details.
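
As a rough illustration of the values listed above, the following Python sketch reads catalog properties text and checks that hive.security is one of the four documented values, falling back to legacy when the property is absent. The helper names parse_properties and resolve_hive_security are hypothetical, and the parser handles only simple key=value lines; it is not how Presto itself loads the file.

```python
# Hypothetical helpers; a simplified key=value parser, not Presto's own loader.
ALLOWED_HIVE_SECURITY = ("legacy", "read-only", "file", "sql-standard")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def resolve_hive_security(props):
    """Return the effective hive.security value, defaulting to legacy."""
    value = props.get("hive.security", "legacy")
    if value not in ALLOWED_HIVE_SECURITY:
        raise ValueError("unsupported hive.security value: " + value)
    return value


if __name__ == "__main__":
    sample = "# example catalog properties\nhive.security=sql-standard\n"
    print(resolve_hive_security(parse_properties(sample)))
```

Rejecting unknown values early, as the sketch does, is only a convenience for checking a file before deployment.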

SQL Standard Based Authorization

When sql-standard security is enabled, Presto enforces the same SQL standard based authorization as Hive does.

Because Presto’s ROLE syntax support matches the SQL standard and Hive does not follow the SQL standard exactly, the following limitations and differences apply:

  • CREATE ROLE role WITH ADMIN is not supported.
  • The admin role must be enabled to execute CREATE ROLE or DROP ROLE.
  • GRANT role TO user GRANTED BY someone is not supported.
  • REVOKE role FROM user GRANTED BY someone is not supported.
  • By default, all of a user’s roles except admin are enabled in a new user session.
  • One particular role can be selected by executing SET ROLE role.
  • SET ROLE ALL enables all of a user’s roles except admin.
  • The admin role must be enabled explicitly by executing SET ROLE admin.
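
The session behavior in the list above can be modeled with a short Python sketch. The function enabled_roles and its arguments are hypothetical and only mirror the rules stated here; they are not part of Presto. Raising an error for a role that has not been granted is an assumption of the sketch.

```python
def enabled_roles(granted, set_role=None):
    """Return the set of roles enabled in a session.

    granted:  all roles granted to the user
    set_role: None for a new session, "ALL", or a single role name
    """
    granted = set(granted)
    if set_role is None or set_role == "ALL":
        # New sessions and SET ROLE ALL enable every role except admin.
        return granted - {"admin"}
    if set_role not in granted:
        # Assumption: selecting a role that was never granted is an error.
        raise PermissionError("role not granted: " + set_role)
    # SET ROLE role selects exactly that role; this is the only way to enable admin.
    return {set_role}
```

For a user granted admin and analyst, the model enables only analyst until the session explicitly selects admin.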

Authentication

The default security configuration of the Hive connector does not use authentication when connecting to a Hadoop cluster. All queries are executed as the user who runs the Presto process, regardless of which user submits the query.

The Hive connector provides additional security options to support Hadoop clusters that have been configured to use Kerberos.

When accessing HDFS, Presto can impersonate the end user who is running the query. This can be used with HDFS permissions and ACLs to provide additional security for data.

Warning

Access to the Presto coordinator should be secured using Kerberos when using Kerberos authentication to Hadoop services. Failure to secure access to the Presto coordinator could result in unauthorized access to sensitive data on the Hadoop cluster.

See Coordinator Kerberos Authentication and CLI Kerberos Authentication for information on setting up Kerberos authentication.

Kerberos Support

To use the Hive connector with a Hadoop cluster that uses Kerberos authentication, you need to configure the connector to work with two services on the Hadoop cluster:

  • The Hive metastore Thrift service
  • The Hadoop Distributed File System (HDFS)

Access to these services by the Hive connector is configured in the properties file that contains the general Hive connector configuration.

Note

If your krb5.conf location is different from /etc/krb5.conf, you must set it explicitly using the java.security.krb5.conf JVM property in the jvm.config file.

Example: -Djava.security.krb5.conf=/example/path/krb5.conf.

Hive Metastore Thrift Service Authentication

In a Kerberized Hadoop cluster, Presto connects to the Hive metastore Thrift service using SASL and authenticates using Kerberos. Kerberos authentication for the metastore is configured in the connector’s properties file using the following properties:

Property Name                       Description
hive.metastore.authentication.type  Hive metastore authentication type.
hive.metastore.service.principal    The Kerberos principal of the Hive metastore service.
hive.metastore.client.principal     The Kerberos principal that Presto will use when connecting to the Hive metastore service.
hive.metastore.client.keytab        Hive metastore client keytab location.

hive.metastore.authentication.type

One of NONE or KERBEROS. When using the default value of NONE, Kerberos authentication is disabled and no other properties need to be configured.

When set to KERBEROS, the Hive connector will connect to the Hive metastore Thrift service using SASL and authenticate using Kerberos.

This property is optional; the default is NONE.

hive.metastore.service.principal

The Kerberos principal of the Hive metastore service. The Presto coordinator will use this to authenticate the Hive metastore.

The _HOST placeholder can be used in this property value. When connecting to the Hive metastore, the Hive connector will substitute in the hostname of the metastore server it is connecting to. This is useful if the metastore runs on multiple hosts.

Example: hive/hive-server-host@EXAMPLE.COM or hive/_HOST@EXAMPLE.COM.

This property is optional; no default value.
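
The _HOST substitution described above amounts to a simple string replacement, sketched below in Python. The function expand_principal is hypothetical, and the sketch assumes plain substitution with no hostname normalization, which may differ from what the connector actually does.

```python
def expand_principal(principal, hostname):
    """Replace the _HOST placeholder with a concrete hostname."""
    return principal.replace("_HOST", hostname)


if __name__ == "__main__":
    # For the service principal, the hostname is that of the metastore server
    # being contacted, so one configured value can cover several metastore hosts.
    for host in ("metastore-1.example.com", "metastore-2.example.com"):
        print(expand_principal("hive/_HOST@EXAMPLE.COM", host))
```

The hostnames in the usage lines are made up for illustration.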

hive.metastore.client.principal

The Kerberos principal that Presto will use when connecting to the Hive metastore.

The _HOST placeholder can be used in this property value. When connecting to the Hive metastore, the Hive connector will substitute in the hostname of the worker node Presto is running on. This is useful if each worker node has its own Kerberos principal.

Example: presto/presto-server-node@EXAMPLE.COM or presto/_HOST@EXAMPLE.COM.

This property is optional; no default value.

Warning

The principal specified by hive.metastore.client.principal must have sufficient privileges to remove files and directories within the hive/warehouse directory. If the principal does not, only the metadata will be removed, and the data will continue to consume disk space.

This occurs because the Hive metastore is responsible for deleting the internal table data. When the metastore is configured to use Kerberos authentication, all of the HDFS operations performed by the metastore are impersonated. Errors deleting data are silently ignored.

hive.metastore.client.keytab

The path to the keytab file that contains a key for the principal specified by hive.metastore.client.principal. This file must be readable by the operating system user running Presto.

This property is optional; no default value.

Example configuration with NONE authentication

    hive.metastore.authentication.type=NONE

The default authentication type for the Hive metastore is NONE. When the authentication type is NONE, Presto connects to an unsecured Hive metastore. Kerberos is not used.

Example configuration with KERBEROS authentication

    hive.metastore.authentication.type=KERBEROS
    hive.metastore.service.principal=hive/hive-metastore-host.example.com@EXAMPLE.COM
    hive.metastore.client.principal=presto@EXAMPLE.COM
    hive.metastore.client.keytab=/etc/presto/hive.keytab

When the authentication type for the Hive metastore Thrift service is KERBEROS, Presto will connect as the Kerberos principal specified by the property hive.metastore.client.principal. Presto will authenticate this principal using the keytab specified by the hive.metastore.client.keytab property, and will verify that the identity of the metastore matches hive.metastore.service.principal.

Keytab files must be distributed to every node in the cluster that runs Presto.

See Additional Information About Keytab Files.

HDFS Authentication

In a Kerberized Hadoop cluster, Presto authenticates to HDFS using Kerberos. Kerberos authentication for HDFS is configured in the connector’s properties file using the following properties:

Property Name                    Description
hive.hdfs.authentication.type    HDFS authentication type. Possible values are NONE or KERBEROS.
hive.hdfs.impersonation.enabled  Enable HDFS end-user impersonation.
hive.hdfs.presto.principal       The Kerberos principal that Presto will use when connecting to HDFS.
hive.hdfs.presto.keytab          HDFS client keytab location.

hive.hdfs.authentication.type

One of NONE or KERBEROS. When using the default value of NONE, Kerberos authentication is disabled and no other properties need to be configured.

When set to KERBEROS, the Hive connector authenticates to HDFS using Kerberos.

This property is optional; the default is NONE.

hive.hdfs.impersonation.enabled

Enable end-user HDFS impersonation.

The section End User Impersonation gives an in-depth explanation of HDFS impersonation.

This property is optional; the default is false.

hive.hdfs.presto.principal

The Kerberos principal that Presto will use when connecting to HDFS.

The _HOST placeholder can be used in this property value. When connecting to HDFS, the Hive connector will substitute in the hostname of the worker node Presto is running on. This is useful if each worker node has its own Kerberos principal.

Example: presto-hdfs-superuser/presto-server-node@EXAMPLE.COM or presto-hdfs-superuser/_HOST@EXAMPLE.COM.

This property is optional; no default value.

hive.hdfs.presto.keytab

The path to the keytab file that contains a key for the principal specified by hive.hdfs.presto.principal. This file must be readable by the operating system user running Presto.

This property is optional; no default value.

Example configuration with NONE authentication

    hive.hdfs.authentication.type=NONE

The default authentication type for HDFS is NONE. When the authentication type is NONE, Presto connects to HDFS using Hadoop’s simple authentication mechanism. Kerberos is not used.

Example configuration with KERBEROS authentication

    hive.hdfs.authentication.type=KERBEROS
    hive.hdfs.presto.principal=hdfs@EXAMPLE.COM
    hive.hdfs.presto.keytab=/etc/presto/hdfs.keytab

When the authentication type is KERBEROS, Presto accesses HDFS as the principal specified by the hive.hdfs.presto.principal property. Presto will authenticate this principal using the keytab specified by the hive.hdfs.presto.keytab property.

Keytab files must be distributed to every node in the cluster that runs Presto.

See Additional Information About Keytab Files.

End User Impersonation

Impersonation Accessing HDFS

Presto can impersonate the end user who is running a query. In the case of a user running a query from the command line interface, the end user is the username associated with the Presto CLI process or the argument to the optional --user option. Impersonating the end user can provide additional security when accessing HDFS if HDFS permissions or ACLs are used.

HDFS Permissions and ACLs are explained in the HDFS Permissions Guide.

NONE authentication with HDFS impersonation

    hive.hdfs.authentication.type=NONE
    hive.hdfs.impersonation.enabled=true

When using NONE authentication with impersonation, Presto impersonates the user who is running the query when accessing HDFS. The user Presto is running as must be allowed to impersonate this user, as discussed in the section Impersonation in Hadoop. Kerberos is not used.

KERBEROS Authentication With HDFS Impersonation

    hive.hdfs.authentication.type=KERBEROS
    hive.hdfs.impersonation.enabled=true
    hive.hdfs.presto.principal=presto@EXAMPLE.COM
    hive.hdfs.presto.keytab=/etc/presto/hdfs.keytab

When using KERBEROS authentication with impersonation, Presto impersonates the user who is running the query when accessing HDFS. The principal specified by the hive.hdfs.presto.principal property must be allowed to impersonate this user, as discussed in the section Impersonation in Hadoop. Presto authenticates hive.hdfs.presto.principal using the keytab specified by hive.hdfs.presto.keytab.

Keytab files must be distributed to every node in the cluster that runs Presto.

See Additional Information About Keytab Files.

Impersonation Accessing the Hive Metastore

Presto does not currently support impersonating the end user when accessing theHive metastore.

Impersonation in Hadoop

To use NONE authentication with HDFS impersonation or KERBEROS Authentication With HDFS Impersonation, the Hadoop cluster must be configured to allow the user or principal that Presto is running as to impersonate the users who log in to Presto. Impersonation in Hadoop is configured in the file core-site.xml. A complete description of the configuration options can be found in the Hadoop documentation.
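
To make the core-site.xml settings more concrete, the Python sketch below builds the property elements for Hadoop's hadoop.proxyuser.<name>.hosts and hadoop.proxyuser.<name>.groups settings. The function name, the presto user, the host name and the group name are all hypothetical; consult the Hadoop documentation for the values appropriate to your cluster.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(superuser, hosts, groups):
    """Build core-site.xml <property> elements for Hadoop proxy-user settings."""
    settings = {
        "hadoop.proxyuser.%s.hosts" % superuser: hosts,
        "hadoop.proxyuser.%s.groups" % superuser: groups,
    }
    elements = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        elements.append(prop)
    return elements


if __name__ == "__main__":
    # Hypothetical values: allow the presto user to impersonate members of the
    # analysts group when connecting from the named host.
    for prop in proxyuser_properties("presto", "presto-node.example.com", "analysts"):
        print(ET.tostring(prop, encoding="unicode"))
```

The printed elements would be placed inside the configuration element of core-site.xml.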

Additional Information About Keytab Files

Keytab files contain encryption keys that are used to authenticate principals to the Kerberos KDC. These encryption keys must be stored securely; you should take the same precautions to protect them that you would to protect SSH private keys.

In particular, access to keytab files should be limited to the accounts that actually need to use them to authenticate. In practice, this is the user that the Presto process runs as. The ownership and permissions on keytab files should be set to prevent other users from reading or modifying the files.

Keytab files need to be distributed to every node running Presto. Under common deployment situations, the Hive connector configuration will be the same on all nodes. This means that the keytab needs to be in the same location on every node.

You should ensure that the keytab files have the correct permissions on every node after distributing them.
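
One way to check permissions after distribution is a small script run on each node. The Python sketch below is a minimal example for POSIX systems; the function keytab_problems is hypothetical, and treating any group or other permission bit as a problem is an assumption that may be stricter than your site requires.

```python
import os
import stat


def keytab_problems(path):
    """Return a list of permission problems found for a keytab file."""
    problems = []
    mode = os.stat(path).st_mode
    if not stat.S_ISREG(mode):
        problems.append("not a regular file")
    # Any group or other permission bit exposes the file to other accounts.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("accessible by group or others")
    # The user running this check (ideally the Presto user) must be able to read it.
    if not os.access(path, os.R_OK):
        problems.append("not readable by the current user")
    return problems
```

Running the check as the operating system user that runs Presto gives the most meaningful result, because the readability test applies to the current user.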

File Based Authorization

The config file is specified using JSON and is composed of three sections, each of which is a list of rules that are matched in the order specified in the config file. The user is granted the privileges from the first matching rule. All regexes default to .* if not specified.

Schema Rules

These rules govern who is considered an owner of a schema.

  • user (optional): regex to match against user name.
  • schema (optional): regex to match against schema name.
  • owner (required): boolean indicating ownership.

Table Rules

These rules govern the privileges granted on specific tables.

  • user (optional): regex to match against user name.
  • schema (optional): regex to match against schema name.
  • table (optional): regex to match against table name.
  • privileges (required): zero or more of SELECT, INSERT, DELETE, OWNERSHIP, GRANT_SELECT.

Session Property Rules

These rules govern who may set session properties.

  • user (optional): regex to match against user name.
  • property (optional): regex to match against session property name.
  • allow (required): boolean indicating whether this session property may be set.

See below for an example.

    {
      "schemas": [
        {
          "user": "admin",
          "schema": ".*",
          "owner": true
        },
        {
          "user": "guest",
          "owner": false
        },
        {
          "schema": "default",
          "owner": true
        }
      ],
      "tables": [
        {
          "user": "admin",
          "privileges": ["SELECT", "INSERT", "DELETE", "OWNERSHIP"]
        },
        {
          "user": "banned_user",
          "privileges": []
        },
        {
          "schema": "default",
          "table": ".*",
          "privileges": ["SELECT"]
        }
      ],
      "sessionProperties": [
        {
          "property": "force_local_scheduling",
          "allow": true
        },
        {
          "user": "admin",
          "property": "max_split_size",
          "allow": true
        }
      ]
    }
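
The first-match evaluation described in this section can be sketched in Python using the table rules from the example above. The function table_privileges is hypothetical and is not Presto's implementation; it assumes each regex must match the whole name and that a user matching no rule receives no privileges.

```python
import re

# Table rules copied from the example configuration above.
TABLE_RULES = [
    {"user": "admin", "privileges": ["SELECT", "INSERT", "DELETE", "OWNERSHIP"]},
    {"user": "banned_user", "privileges": []},
    {"schema": "default", "table": ".*", "privileges": ["SELECT"]},
]


def table_privileges(rules, user, schema, table):
    """Return the privileges from the first rule matching user, schema and table."""
    for rule in rules:
        # A regex that is not specified defaults to ".*", which matches any name.
        if (re.fullmatch(rule.get("user", ".*"), user)
                and re.fullmatch(rule.get("schema", ".*"), schema)
                and re.fullmatch(rule.get("table", ".*"), table)):
            return set(rule["privileges"])
    # Assumption: no matching rule means no privileges.
    return set()
```

Because evaluation stops at the first match, banned_user receives no privileges on tables in the default schema even though the later rule would grant SELECT, which is why rule order matters.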

HDFS Wire Encryption

In a Kerberized Hadoop cluster with HDFS wire encryption enabled, you can enable Presto to access HDFS by using the property below.

Property Name                      Description
hive.hdfs.wire-encryption.enabled  Enables HDFS wire encryption. Possible values are true or false.

Note

Depending on the Presto installation configuration, using wire encryption may impact query execution performance.