HDFS Readable and Writable External Table Examples (Deprecated)

Note: The gphdfs external table protocol is deprecated and will be removed in the next major release of Greenplum Database.

The following code defines a readable external table for an HDFS file named filename.txt on port 8081.

  =# CREATE EXTERNAL TABLE ext_expenses (
        name text,
        date date,
        amount float4,
        category text,
        desc1 text )
     LOCATION ('gphdfs://hdfshost-1:8081/data/filename.txt')
     FORMAT 'TEXT' (DELIMITER ',');
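After the table is defined, you can query it like any other readable external table. The following is a minimal sketch, assuming filename.txt contains comma-delimited rows that match the column definitions.

  =# SELECT category, sum(amount) AS total
       FROM ext_expenses
      GROUP BY category;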

The following code defines a readable external table that reads a set of custom-format files located in the same HDFS directory on port 8081.

  =# CREATE EXTERNAL TABLE ext_expenses (
        name text,
        date date,
        amount float4,
        category text,
        desc1 text )
     LOCATION ('gphdfs://hdfshost-1:8081/data/custdat*.dat')
     FORMAT 'custom' (formatter='gphdfs_import');

The following code defines a writable external table that writes to an HDFS directory on port 8081, with all compression options specified explicitly.

  =# CREATE WRITABLE EXTERNAL TABLE ext_expenses (
        name text,
        date date,
        amount float4,
        category text,
        desc1 text )
     LOCATION ('gphdfs://hdfshost-1:8081/data/?compress=true&compression_type=RECORD&codec=org.apache.hadoop.io.compress.DefaultCodec')
     FORMAT 'custom' (formatter='gphdfs_export');
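You load a writable external table with INSERT. The following is a minimal sketch that assumes a regular table named expenses_staging with the same column definitions; the source table name is illustrative only.

  =# INSERT INTO ext_expenses              -- writes rows to the HDFS directory
        SELECT name, date, amount, category, desc1
        FROM expenses_staging;             -- hypothetical source table with matching columns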

Because the previous example specifies the default values for compression_type and codec, the following command is equivalent.

  =# CREATE WRITABLE EXTERNAL TABLE ext_expenses (
        name text,
        date date,
        amount float4,
        category text,
        desc1 text )
     LOCATION ('gphdfs://hdfshost-1:8081/data?compress=true')
     FORMAT 'custom' (formatter='gphdfs_export');
