You can configure the Presto Pulsar connector and deploy a cluster by following the instructions below.

Configure the Presto Pulsar connector

You can configure the Presto Pulsar connector in the ${project.root}/conf/presto/catalog/pulsar.properties properties file. The connector settings and their default values are as follows.

  # name of the connector to be displayed in the catalog
  connector.name=pulsar
  # URL of the Pulsar broker service
  pulsar.broker-service-url=http://localhost:8080
  # URI of the ZooKeeper cluster
  pulsar.zookeeper-uri=localhost:2181
  # minimum number of entries to read at a single time
  pulsar.entry-read-batch-size=100
  # default number of splits to use per query
  pulsar.target-num-splits=4
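
Before starting a worker, it can help to confirm which values a pulsar.properties file actually sets. The following is a minimal sketch, not part of Pulsar, of a simplified reader for `key=value` lines. The `parse_properties` helper and the `SAMPLE` text are hypothetical, and the parser deliberately ignores the richer rules of the Java properties format (escapes, line continuations, `:` separators).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Sample content mirroring the defaults listed in this section.
SAMPLE = """\
# name of the connector to be displayed in the catalog
connector.name=pulsar
pulsar.broker-service-url=http://localhost:8080
pulsar.zookeeper-uri=localhost:2181
pulsar.entry-read-batch-size=100
pulsar.target-num-splits=4
"""

props = parse_properties(SAMPLE)
print(props["pulsar.broker-service-url"])
print(int(props["pulsar.target-num-splits"]))
```

To check a real file, read its text with `open(path).read()` and pass it to the same helper.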

You can connect Presto to a Pulsar cluster with multiple hosts. To configure multiple hosts for brokers, add multiple URLs to pulsar.broker-service-url. To configure multiple hosts for ZooKeeper, add multiple URIs to pulsar.zookeeper-uri. The following is an example.

  pulsar.broker-service-url=http://localhost:8080,localhost:8081,localhost:8082
  pulsar.zookeeper-uri=localhost1,localhost2:2181
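
When the host list comes from an inventory rather than being typed by hand, the two values can be assembled programmatically. This is a small sketch under the assumption that the format is exactly what the example above shows: a comma-separated host list, with the scheme written once at the front of the broker value. The `join_hosts` helper is hypothetical.

```python
def join_hosts(hosts, scheme=None):
    """Join host[:port] entries with commas; prefix the scheme once if given."""
    if not hosts:
        raise ValueError("at least one host is required")
    joined = ",".join(hosts)
    return f"{scheme}://{joined}" if scheme else joined


broker_line = "pulsar.broker-service-url=" + join_hosts(
    ["localhost:8080", "localhost:8081", "localhost:8082"], scheme="http"
)
zk_line = "pulsar.zookeeper-uri=" + join_hosts(["localhost1", "localhost2:2181"])

print(broker_line)
print(zk_line)
```

With these inputs the two printed lines reproduce the example configuration above.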

Query data from an existing Presto cluster

If you already have a Presto cluster, you can copy the Presto Pulsar connector plugin to your existing cluster. Download the archived plugin package with the following command.

  $ wget https://archive.apache.org/dist/pulsar/pulsar-2.6.1/apache-pulsar-2.6.1-bin.tar.gz

Deploy a new cluster

Since Pulsar SQL is powered by Presto, the configuration for deployment is the same for the Pulsar SQL worker.

Note: To set up a standalone single-node environment, refer to Query data.

You can use the same CLI arguments as the Presto launcher.

  $ ./bin/pulsar sql-worker --help
  Usage: launcher [options] command
  Commands: run, start, stop, restart, kill, status
  Options:
    -h, --help            show this help message and exit
    -v, --verbose         Run verbosely
    --etc-dir=DIR         Defaults to INSTALL_PATH/etc
    --launcher-config=FILE
                          Defaults to INSTALL_PATH/bin/launcher.properties
    --node-config=FILE    Defaults to ETC_DIR/node.properties
    --jvm-config=FILE     Defaults to ETC_DIR/jvm.config
    --config=FILE         Defaults to ETC_DIR/config.properties
    --log-levels-file=FILE
                          Defaults to ETC_DIR/log.properties
    --data-dir=DIR        Defaults to INSTALL_PATH
    --pid-file=FILE       Defaults to DATA_DIR/var/run/launcher.pid
    --launcher-log-file=FILE
                          Defaults to DATA_DIR/var/log/launcher.log (only in
                          daemon mode)
    --server-log-file=FILE
                          Defaults to DATA_DIR/var/log/server.log (only in
                          daemon mode)
    -D NAME=VALUE         Set a Java system property

The default configuration for the cluster is located in ${project.root}/conf/presto. You can customize your deployment by modifying the default configuration.

You can set the worker to read its configuration from a different directory, or to write its data to a different directory.

  $ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
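
If several workers need their own configuration and data directories, the command lines can be generated rather than typed. The sketch below only builds and prints the commands; it uses the `--etc-dir` and `--data-dir` options listed in the help output above. The `worker_command` helper and the `/tmp/presto-conf-N` directories are hypothetical. Workers that share one machine would presumably also need distinct `http-server.http.port` values in their own config.properties.

```python
import shlex


def worker_command(etc_dir, data_dir, action="run"):
    """Return the argv list for one sql-worker with its own config and data dirs."""
    return [
        "./bin/pulsar", "sql-worker", action,
        "--etc-dir", etc_dir,
        "--data-dir", data_dir,
    ]


# Hypothetical per-worker directories; adjust to your own layout.
for i in (1, 2):
    argv = worker_command(f"/tmp/presto-conf-{i}", f"/tmp/presto-{i}")
    print(shlex.join(argv))
```

Returning an argv list instead of a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.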

You can start the worker as a daemon process.

  $ ./bin/pulsar sql-worker start

Deploy a cluster on multiple nodes

You can deploy a Pulsar SQL cluster or Presto cluster on multiple nodes. The following example shows how to deploy a three-node cluster.

  1. Copy the Pulsar binary distribution to the three nodes.

The first node runs as Presto coordinator. The minimal configuration requirement in the ${project.root}/conf/presto/config.properties file is as follows.

  coordinator=true
  node-scheduler.include-coordinator=true
  http-server.http.port=8080
  query.max-memory=50GB
  query.max-memory-per-node=1GB
  discovery-server.enabled=true
  discovery.uri=<coordinator-url>

The other two nodes serve as worker nodes. You can use the following configuration for them.

  coordinator=false
  http-server.http.port=8080
  query.max-memory=50GB
  query.max-memory-per-node=1GB
  discovery.uri=<coordinator-url>

  2. Modify pulsar.broker-service-url and pulsar.zookeeper-uri in the ${project.root}/conf/presto/catalog/pulsar.properties file accordingly on each of the three nodes.

  3. Start the coordinator node.

  $ ./bin/pulsar sql-worker run

  4. Start the worker nodes.

  $ ./bin/pulsar sql-worker run

  5. Start the SQL CLI and connect it to the cluster.

  $ ./bin/pulsar sql --server <coordinator-url>

  6. Check the status of the nodes.

  presto> SELECT * FROM system.runtime.nodes;
   node_id |        http_uri         | node_version | coordinator | state
  ---------+-------------------------+--------------+-------------+--------
   1       | http://192.168.2.1:8081 | testversion  | true        | active
   3       | http://192.168.2.2:8081 | testversion  | false       | active
   2       | http://192.168.2.3:8081 | testversion  | false       | active
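
The coordinator and worker configurations in this section differ in only a few lines, so both files can be rendered from one template to keep the nodes consistent. The following is a rough sketch rather than an official tool. The `presto_config` helper and the coordinator address are hypothetical, and the property names and default values are copied from the listings above.

```python
def presto_config(coordinator, discovery_uri, port=8080,
                  max_memory="50GB", max_memory_per_node="1GB"):
    """Render config.properties text for a coordinator or a worker node."""
    lines = [f"coordinator={'true' if coordinator else 'false'}"]
    if coordinator:
        # Allow the coordinator to schedule work on itself as well.
        lines.append("node-scheduler.include-coordinator=true")
    lines += [
        f"http-server.http.port={port}",
        f"query.max-memory={max_memory}",
        f"query.max-memory-per-node={max_memory_per_node}",
    ]
    if coordinator:
        # Only the coordinator runs the embedded discovery server.
        lines.append("discovery-server.enabled=true")
    lines.append(f"discovery.uri={discovery_uri}")
    return "\n".join(lines) + "\n"


uri = "http://192.168.2.1:8080"  # hypothetical coordinator address
print(presto_config(True, uri))
print(presto_config(False, uri))
```

Write the first result to conf/presto/config.properties on the coordinator and the second to the same path on each worker.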

For more information about deployment in Presto, refer to Presto deployment.