Running Presto with Alluxio

Slack Docker Pulls GitHub edit source

Presto is an open source distributed SQL query engine for running interactive analytic queries on data at a large scale. This guide describes how to run Presto to query Alluxio as a distributed cache layer, where the data sources can be AWS S3, Azure blob store, HDFS and many others. With this setup, Alluxio will help Presto access data regardless of the data source and transparently cache the data frequently accessed (e.g., tables commonly used) into Alluxio distributed storage. Co-locating Alluxio workers with Presto workers can benefit data locality and reduce the I/O access latency especially when data is remote or network is slow or congested.

Prerequisites

  • Setup Java for Java 8 Update 161 or higher (8u161+), 64-bit.
  • Deploy Presto. This guide is tested with presto-315.
  • Alluxio has been set up and is running.
  • Make sure that the Alluxio client jar is available. This Alluxio client jar file can be found at /<PATH_TO_ALLUXIO>/client/alluxio-2.0.1-client.jar in the tarball downloaded from Alluxio download page.
  • Make sure that Hive metastore is running to serve metadata information of Hive tables.

Basic Setup

Configure Presto to connect to Hive Metastore

Presto gets the database and table metadata information, as well as the file system location of table data from Hive Metastore. Edit the Presto configuration ${PRESTO_HOME}/etc/catalog/hive.properties:

  1. connector.name=hive-hadoop2
  2. hive.metastore.uri=thrift://localhost:9083

Distribute the Alluxio client jar to all Presto servers

Put Alluxio client jar /<PATH_TO_ALLUXIO>/client/alluxio-2.0.1-client.jar into directory ${PRESTO_HOME}/plugin/hive-hadoop2/ (this directory may differ across versions) on all Presto servers. Restart Presto service:

  1. $ ${PRESTO_HOME}/bin/launcher restart

After completing the basic configuration, Presto should be able to access data in Alluxio. To configure more advanced features for Presto (e.g., connect to Alluxio with HA), please follow the instructions at Advanced Setup.

Examples: Use Presto to Query Tables on Alluxio

Create a Hive table on Alluxio

Here is an example to create an internal table in Hive backed by files in Alluxio. You can download a data file (e.g., ml-100k.zip) from http://grouplens.org/datasets/movielens/. Unzip this file and upload the file u.user into /ml-100k/ on Alluxio:

  1. $ ./bin/alluxio fs mkdir /ml-100k
  2. $ ./bin/alluxio fs copyFromLocal /path/to/ml-100k/u.user alluxio:///ml-100k

Create an external Hive table from existing files in Alluxio.

  1. hive> CREATE TABLE u_user (
  2. userid INT,
  3. age INT,
  4. gender CHAR(1),
  5. occupation STRING,
  6. zipcode STRING)
  7. ROW FORMAT DELIMITED
  8. FIELDS TERMINATED BY '|'
  9. STORED AS TEXTFILE
  10. LOCATION 'alluxio://master_hostname:port/ml-100k';

View Alluxio WebUI at http://master_hostname:19999 and you can see the directory and file Hive creates:

HiveTableInAlluxio

Start Hive metastore

Ensure your Hive metastore service is running. Hive metastore listens on port 9083 by default. If it is not running,

  1. $ ${HIVE_HOME}/bin/hive --service metastore

Start Presto server

Start your Presto server. Presto server runs on port 8080 by default (set by http-server.http.port in ${PRESTO_HOME}/etc/config.properties ):

  1. $ ${PRESTO_HOME}/bin/launcher run

Query tables using Presto

Follow Presto CLI guidence to download the presto-cli-<PRESTO_VERSION>-executable.jar, rename it to presto, and make it executable with chmod +x (sometimes the executable presto exists in ${PRESTO_HOME}/bin/presto and you can use it directly).

Run a single query (replace localhost:8080 with your actual Presto server hostname and port):

  1. $ ./presto --server localhost:8080 --execute "use default;select * from u_user limit 10;" \
  2. --catalog hive --debug

And you can see the query results from console:

PrestoQueryResult

Presto Server log:

PrestoQueryLog

Advanced Setup

Customize Alluxio User Properties

To configure additional Alluxio properties, you can append the conf path (i.e. ${ALLUXIO_HOME}/conf) containing alluxio-site.properties to Presto’s JVM config at etc/jvm.config under Presto folder. The advantage of this approach is to have all the Alluxio properties set within the same file of alluxio-site.properties.

  1. ...
  2. -Xbootclasspath/p:<path-to-alluxio-conf>

Alternatively, one can add them to the Hadoop conf files (core-site.xml, hdfs-site.xml), and use Presto property hive.config.resources in file ${PRESTO_HOME}/etc/catalog/hive.properties to point to the file’s location for every Presto worker.

  1. hive.config.resources=/<PATH_TO_CONF>/core-site.xml,/<PATH_TO_CONF>/hdfs-site.xml

Example: connect to Alluxio with HA

If the Alluxio HA cluster uses internal leader election, set the Alluxio cluster property appropriately in the alluxio-site.properties file which is on the classpath.

  1. alluxio.master.rpc.addresses=master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998

Alternatively you can add the property to the Hadoop core-site.xml configuration which is contained by hive.config.resources.

  1. <configuration>
  2. <property>
  3. <name>alluxio.master.rpc.addresses</name>
  4. <value>master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998</value>
  5. </property>
  6. </configuration>

For information about how to connect to Alluxio HA cluster using Zookeeper-based leader election, please refer to HA mode client configuration parameters.

Example: change default Alluxio write type

For example, change alluxio.user.file.writetype.default from default ASYNC_THROUGH to CACHE_THROUGH.

One can specify the property in alluxio-site.properties and distribute this file to the classpath of each Hive node:

  1. alluxio.user.file.writetype.default=CACHE_THROUGH

Alternatively, modify conf/hive-site.xml to have:

  1. <property>
  2. <name>alluxio.user.file.writetype.default</name>
  3. <value>CACHE_THROUGH</value>
  4. </property>

Enable data locality

It is recommended to co-locate Presto workers with Alluxio workers so that Presto workers can read data locally. An important option to enable in Presto is hive.force-local-scheduling, which forces splits to be scheduled on the same node as the Alluxio worker serving the split data. By default, hive.force-local-scheduling in Presto is set to false, and Presto will not attempt to schedule the work on the same machine as the Alluxio worker node. In the mean time, node-scheduler.network-topology needs to be set to flat for splits to be scheduled with locality awareness.

Increase parallelism

Presto’s Hive integration uses the config hive.max-split-size to control the parallelism of the query. For Alluxio 1.6 or earlier, it is recommended to set this size no less than Alluxio’s block size to avoid the read contention within the same block. For later Alluxio versions, this is no more an issue due to async cache on Alluxio workers.

Avoid Presto timeout reading large files

It is recommended to increase alluxio.user.network.data.timeout to a bigger value (e.g 10min) to avoid a timeout failure when reading large files from remote worker.

Troubleshooting

Error message “No FileSystem for scheme: alluxio” on queries

When you see error messages like the following, it is likely that Alluxio client jar is not put into the classpath of Presto worker. Please follow instructions to fix this issue.

  1. Query 20180907_063430_00001_cm7xe failed: No FileSystem for scheme: alluxio
  2. com.facebook.presto.spi.PrestoException: No FileSystem for scheme: alluxio
  3. at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:189)
  4. at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
  5. at com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
  6. at com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
  7. at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
  8. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  9. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  10. at java.lang.Thread.run(Thread.java:748)