Platform Requirements

This topic describes the Greenplum Database 6 platform and operating system software requirements.

Important: Pivotal Support does not provide support for open source versions of Greenplum Database. Only Pivotal Greenplum Database is supported by Pivotal Support.

Operating Systems

Greenplum 6 runs on the following operating system platforms:

  • Red Hat Enterprise Linux 64-bit 7.x (See the following Note.)
  • Red Hat Enterprise Linux 64-bit 6.x
  • CentOS 64-bit 7.x
  • CentOS 64-bit 6.x
  • Ubuntu 18.04 LTS
  • Oracle Linux 64-bit 7, using the Red Hat Compatible Kernel (RHCK)

Important: Significant Greenplum Database performance degradation has been observed when enabling resource group-based workload management on Red Hat 6.x and CentOS 6.x systems. This issue is caused by a Linux cgroup kernel bug. This kernel bug has been fixed in CentOS 7.x and Red Hat 7.x systems.

If you use Red Hat 6 and the performance with resource groups is acceptable for your use case, upgrade your kernel to version 2.6.32-696 or higher to benefit from other fixes to the cgroups implementation.
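
The kernel version threshold above can be checked with a short script. The following is a minimal sketch, not part of the Greenplum tooling: the function names are hypothetical, and it assumes the kernel release string follows the usual Red Hat pattern, such as 2.6.32-696.el6.x86_64.

```python
import platform
import re

# Minimum from the text above: kernel 2.6.32-696 (relevant to Red Hat 6 / CentOS 6).
MIN_KERNEL = (2, 6, 32, 696)


def parse_kernel_release(release):
    """Turn a release string such as '2.6.32-696.el6.x86_64' into (2, 6, 32, 696)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return tuple(int(part) for part in match.groups())


def kernel_meets_minimum(release, minimum=MIN_KERNEL):
    """Return True if the release is at or above the minimum (tuple comparison)."""
    return parse_kernel_release(release) >= minimum


def running_kernel_meets_minimum():
    """Check the host this runs on; platform.release() reports the same value as uname -r."""
    return kernel_meets_minimum(platform.release())
```

Comparing tuples of integers avoids the pitfalls of comparing version strings, where "2.6.32-1000" would sort before "2.6.32-696".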

Note: On Red Hat Enterprise Linux 7.x or CentOS 7.x releases prior to 7.3, an operating system issue might cause Greenplum Database to hang when it runs large workloads. The issue is caused by Linux kernel bugs.

RHEL 7.3 and CentOS 7.3 resolve the issue.

Greenplum Database server supports TLS version 1.2 on RHEL/CentOS systems, and TLS version 1.3 on Ubuntu systems.

Software Dependencies

On RHEL/CentOS 6/7 systems, Greenplum Database 6 requires the following software packages, which are installed automatically as dependencies when you install the Greenplum Database RPM package:

  • apr
  • apr-util
  • bash
  • bzip2
  • curl
  • krb5
  • libcurl
  • libevent (or libevent2 on RHEL/CentOS 6)
  • libxml2
  • libyaml
  • zlib
  • openldap
  • openssh
  • openssl
  • openssl-libs (RHEL 7/CentOS 7)
  • perl
  • readline
  • rsync
  • R
  • sed (used by gpinitsystem)
  • tar
  • zip

On Ubuntu systems, Greenplum Database 6 requires the following software packages, which are installed automatically as dependencies when you install Greenplum Database with the Debian package installer:

  • libapr1
  • libaprutil1
  • bash
  • bzip2
  • krb5-multidev
  • libcurl3-gnutls
  • libcurl4
  • libevent-2.1-6
  • libxml2
  • libyaml-0-2
  • zlib1g
  • libldap-2.4-2
  • openssh-client
  • openssl
  • perl
  • readline
  • rsync
  • sed
  • tar
  • zip
  • net-tools
  • less
  • iproute2

Greenplum Database 6 uses Python 2.7.12, which is included with the product installation (and not installed as a package dependency).

Important: SSL can be used only on the Greenplum Database master host system. It cannot be used on the segment host systems.

Important: For all Greenplum Database host systems, SELinux must be disabled. You should also disable firewall software, although firewall software can be enabled if it is required for security purposes. See Disabling SELinux and Firewall Software.
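
As a quick way to confirm the SELinux setting on a host, the sketch below reads the configured mode from the text of an SELinux configuration file. It assumes the conventional KEY=value layout of /etc/selinux/config on RHEL/CentOS systems; the function name is hypothetical and the script is not part of Greenplum Database.

```python
def selinux_config_mode(config_text):
    """Return the value of the SELINUX= setting (lowercased), or None if it is absent."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Match the SELINUX key exactly so that SELINUXTYPE is not picked up.
        if sep and key.strip() == "SELINUX":
            return value.strip().lower()
    return None
```

A host would satisfy the requirement above when `selinux_config_mode(open("/etc/selinux/config").read())` returns "disabled". Note that this reflects the configured mode, which takes effect at boot, rather than the mode currently in force.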

Java

Greenplum 6 can use these Java versions for PL/Java and PXF:

  • OpenJDK 8 or OpenJDK 11, available from AdoptOpenJDK
  • Oracle JDK 8 or Oracle JDK 11

Hardware and Network

The following table lists minimum recommended specifications for hardware servers intended to support Greenplum Database on Linux systems. All host servers in your Greenplum Database system must have the same hardware and software configuration.

Table 1. Minimum Hardware Requirements
Minimum CPU: Any x86_64 compatible CPU
Minimum Memory: 16 GB RAM per server
Disk Space Requirements:
  • 150 MB per host for Greenplum installation
  • Approximately 300 MB per segment instance for metadata
  • Appropriate free space for data with disks at no more than 70% capacity
Network Requirements:
  • 10 Gigabit Ethernet within the array
  • NIC bonding is recommended when multiple interfaces are present
  • Pivotal Greenplum can use either IPv4 or IPv6 protocols
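
The disk space figures in Table 1 lend themselves to a small calculation. The sketch below is illustrative only: the function names are hypothetical, the per-host and per-segment sizes are the approximate values from the table, and the 70% limit is applied to the file system that holds a given path.

```python
import shutil

INSTALL_MB_PER_HOST = 150       # Greenplum installation, per host (Table 1)
METADATA_MB_PER_SEGMENT = 300   # approximate metadata size per segment instance (Table 1)
MAX_USED_PERCENT = 70           # disks should be at no more than 70% capacity (Table 1)


def overhead_mb(segments_per_host):
    """Approximate non-data disk space needed on one host, in MB."""
    return INSTALL_MB_PER_HOST + METADATA_MB_PER_SEGMENT * segments_per_host


def within_capacity(total_bytes, used_bytes, max_used_percent=MAX_USED_PERCENT):
    """Return True if used space is at or below the limit (integer math, no rounding)."""
    return used_bytes * 100 <= total_bytes * max_used_percent


def path_within_capacity(path):
    """Apply the capacity check to the file system that contains path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return within_capacity(usage.total, usage.used)
```

For example, a host with 8 segment instances needs roughly 150 + 8 × 300 = 2550 MB before any user data is loaded.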

Storage

You should run Greenplum Database on an XFS file system.

Greenplum Database can run on network or shared storage if the shared storage is presented as a block device to the servers running Greenplum Database and the XFS file system is mounted on the block device. Network file systems are not recommended. When using network or shared storage, Greenplum Database mirroring must be used in the same way as with local storage, and no modifications should be made to the mirroring scheme or the recovery scheme of the segments.

Other features of the shared storage such as de-duplication and/or replication can be used with Greenplum Database as long as they do not interfere with the expected operation of Greenplum Database.

Greenplum Database can be deployed to virtualized systems only if the storage is presented as block devices and the XFS file system is mounted for the storage of the segment directories.
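
To verify that a segment directory actually sits on XFS, one option is to look up its mount point in the Linux mount table. The following is a rough sketch under stated assumptions: it parses text in the /proc/mounts format (device, mount point, file system type, options), the function name is hypothetical, and it does not handle mount points that contain escaped spaces.

```python
import posixpath


def filesystem_type(path, mounts_text):
    """Return the file system type of the longest mount point that contains path."""
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        contains_path = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point.rstrip("/") + "/")
        )
        # Prefer the deepest mount point; on ties, later entries shadow earlier ones.
        if contains_path and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On a running host, `filesystem_type("/data/primary", open("/proc/mounts").read()) == "xfs"` would confirm the recommendation for that directory, where /data/primary stands in for your own segment data directory.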

Greenplum Database can run on Amazon Web Services (AWS) servers using either Amazon instance store (Amazon uses the volume names ephemeral[0-20]) or Amazon Elastic Block Store (Amazon EBS) storage. If you use Amazon EBS storage, the storage should be a RAID of Amazon EBS volumes mounted with the XFS file system.

Hadoop Distributions

Greenplum Database provides access to HDFS with the Greenplum Platform Extension Framework (PXF). PXF v5.14.0 is integrated with Greenplum Database 6, and provides access to Hadoop, object store, and SQL external data stores. Refer to Accessing External Data with PXF in the Greenplum Database Administrator Guide for PXF configuration and usage information.

PXF can use Cloudera, Hortonworks Data Platform, MapR, and generic Apache Hadoop distributions. PXF bundles all of the JAR files on which it depends, including the following Hadoop libraries:

Table 2. PXF Hadoop Supported Platforms
PXF Version | Hadoop Version | Hive Server Version | HBase Server Version
5.14.0, 5.13.0, 5.12.0, 5.11.1, 5.10.1 | 2.x, 3.1+ | 1.x, 2.x, 3.1+ | 1.3.2
5.8.2 | 2.x | 1.x | 1.3.2
5.8.1 | 2.x | 1.x | 1.3.2

Note: If you plan to access JSON format data stored in a Cloudera Hadoop cluster, PXF requires a Cloudera version 5.8 or later Hadoop distribution.