Monitoring a HAWQ System

You can monitor a HAWQ system using a variety of tools included with the system or available as add-ons.

Observing the day-to-day performance of a HAWQ system helps administrators understand system behavior, plan workflow, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.

Also, be sure to review Recommended Monitoring and Maintenance Tasks for monitoring activities you can script to quickly detect problems in the system.

Using hawq_toolkit

Use HAWQ’s administrative schema hawq_toolkit to query the system catalogs, log files, and operating environment for system status information. The hawq_toolkit schema contains several views that you can access using SQL commands. The schema is accessible to all database users, although some objects require superuser permissions. Use commands similar to the following to add the hawq_toolkit schema to your schema search path:

  => SET ROLE 'gpadmin';
  =# SET search_path TO myschema, hawq_toolkit;

Monitoring System State

As a HAWQ administrator, you must monitor the system for problem events such as a segment going down or a segment host running out of disk space. The following topics describe how to monitor the health of a HAWQ system and examine its state information.

Checking System State

A HAWQ system consists of multiple PostgreSQL instances (the master and segments) spanning multiple machines. To monitor a HAWQ system, you need information about the system as a whole as well as the status of the individual instances. The hawq state utility provides status information about a HAWQ system.

Viewing Master and Segment Status and Configuration

The default hawq state action is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your HAWQ system:

  $ hawq state -b

You can also display information about the HAWQ master data directory by invoking hawq state with the -d option:

  $ hawq state -d <master_data_dir>
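
For scripted checks, the two invocations above can be wrapped in a small helper. The following is a minimal Python sketch, not part of HAWQ: the function names hawq_state_command and run_hawq_state are hypothetical, and only the -b and -d options shown above are used. It assumes the hawq utility is on the PATH of the user running the script.

```python
import subprocess
from typing import List, Optional


def hawq_state_command(brief: bool = True,
                       master_data_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for the `hawq state` invocations shown above.

    Hypothetical helper: it only knows the -b and -d options from this section.
    """
    cmd = ["hawq", "state"]
    if master_data_dir is not None:
        # Equivalent to: hawq state -d <master_data_dir>
        cmd += ["-d", master_data_dir]
    elif brief:
        # Equivalent to: hawq state -b
        cmd.append("-b")
    return cmd


def run_hawq_state(cmd: List[str], timeout: int = 60) -> str:
    """Run the command and return its standard output as text.

    subprocess.run raises CalledProcessError if the process exits non-zero
    and TimeoutExpired if it runs longer than `timeout` seconds.
    """
    result = subprocess.run(cmd, capture_output=True, text=True,
                            check=True, timeout=timeout)
    return result.stdout
```

A monitoring script could call run_hawq_state(hawq_state_command()) on a schedule and raise an alert when the call raises an exception.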

Checking Disk Space Usage

Checking Sizing of Distributed Databases and Tables

The hawq_toolkit administrative schema contains several views that you can use to determine the disk space usage for a distributed HAWQ database, schema, table, or index.

Viewing Disk Space Usage for a Database

To see the total size of a database (in bytes), use the hawq_size_of_database view in the hawq_toolkit administrative schema. For example:

  => SELECT * FROM hawq_toolkit.hawq_size_of_database
     ORDER BY sodddatname;
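
Because the view reports sizes in bytes, a report script may want to convert them to larger units. The following Python sketch shows one way to do that. It assumes the query results have already been fetched as (database name, size in bytes) pairs; the helper names and the sample values are hypothetical, and binary units (1 KiB = 1024 bytes) are a choice made here, not something the view prescribes.

```python
from typing import Iterable, List, Tuple

UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def format_bytes(num_bytes: int) -> str:
    """Render a byte count using binary units (1 KiB = 1024 bytes)."""
    size = float(num_bytes)
    index = 0
    # Divide by 1024 until the value fits the current unit or units run out.
    while size >= 1024 and index < len(UNITS) - 1:
        size /= 1024
        index += 1
    if index == 0:
        return f"{num_bytes} B"
    return f"{size:.1f} {UNITS[index]}"


def summarize_database_sizes(rows: Iterable[Tuple[str, int]]) -> List[str]:
    """Return one 'name: size' line per database, ordered by database name."""
    return [f"{name}: {format_bytes(size)}" for name, size in sorted(rows)]


if __name__ == "__main__":
    # Hypothetical sample rows, not output from a real system.
    sample = [("sales", 5368709120), ("postgres", 1536)]
    for line in summarize_database_sizes(sample):
        print(line)
```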

Viewing Disk Space Usage for a Table

The hawq_toolkit administrative schema contains several views for checking the size of a table. The table sizing views list tables by object ID (not by name). To check the size of a table by name, you must look up the relation name (relname) in the pg_class table. For example:

  => SELECT relname AS name, sotdsize AS size, sotdtoastsize AS toast,
       sotdadditionalsize AS other
     FROM hawq_toolkit.hawq_size_of_table_disk AS sotd, pg_class
     WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;

Viewing Disk Space Usage for Indexes

The hawq_toolkit administrative schema contains a number of views for checking index sizes. To see the total size of all indexes on a table, use the hawq_size_of_all_table_indexes view. To see the size of a particular index, use the hawq_size_of_index view. The index sizing views list tables and indexes by object ID (not by name). To check the size of an index by name, you must look up the relation name (relname) in the pg_class table. For example:

  => SELECT soisize, relname AS indexname
     FROM pg_class, hawq_toolkit.hawq_size_of_index
     WHERE pg_class.oid=hawq_size_of_index.soioid
     AND pg_class.relkind='i';

Viewing Metadata Information about Database Objects

HAWQ uses its system catalogs to track metadata about the objects stored in a database (tables, views, indexes, and so on), as well as global objects such as roles and tablespaces.

Viewing the Last Operation Performed

You can use the system views pg_stat_operations and pg_stat_partition_operations to look up actions performed on a database object. For example, to view when the cust table was created and when it was last analyzed:

  => SELECT schemaname AS schema, objname AS table,
       usename AS role, actionname AS action,
       subtype AS type, statime AS time
     FROM pg_stat_operations
     WHERE objname='cust';

   schema | table | role | action  | type  | time
  --------+-------+------+---------+-------+-------------------------------
   sales  | cust  | main | CREATE  | TABLE | 2010-02-09 18:10:07.867977-08
   sales  | cust  | main | VACUUM  |       | 2010-02-10 13:32:39.068219-08
   sales  | cust  | main | ANALYZE |       | 2010-02-25 16:07:01.157168-08
  (3 rows)
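
A maintenance script might reduce rows like these to the most recent time each action was performed, for example to find tables that have not been analyzed recently. The following Python sketch assumes the rows have already been fetched as (action, timestamp) pairs. The function name and the sample data are hypothetical; the sample mirrors the output above with the time zone offset dropped for brevity.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple


def latest_per_action(rows: Iterable[Tuple[str, datetime]]) -> Dict[str, datetime]:
    """Map each action name to the most recent timestamp seen for it."""
    latest: Dict[str, datetime] = {}
    for action, when in rows:
        # Keep the newer timestamp when an action appears more than once.
        if action not in latest or when > latest[action]:
            latest[action] = when
    return latest


if __name__ == "__main__":
    # Hypothetical rows modeled on the sample output for the cust table.
    rows = [
        ("CREATE", datetime(2010, 2, 9, 18, 10, 7)),
        ("VACUUM", datetime(2010, 2, 10, 13, 32, 39)),
        ("ANALYZE", datetime(2010, 2, 25, 16, 7, 1)),
    ]
    for action, when in latest_per_action(rows).items():
        print(action, when.isoformat())
```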

Viewing the Definition of an Object

You can use the psql \d meta-command to display the definition of an object, such as a table or view. For example, to see the definition of a table named sales:

  => \d sales

       Append-Only Table "public.sales"
   Column |  Type   | Modifiers
  --------+---------+-----------
   id     | integer |
   year   | integer |
   qtr    | integer |
   day    | integer |
   region | text    |
  Compression Type: None
  Compression Level: 0
  Block Size: 32768
  Checksum: f
  Distributed by: (id)

Viewing Query Workfile Usage Information

The HAWQ administrative schema hawq_toolkit contains views that display information about HAWQ workfiles. HAWQ creates workfiles on disk when it does not have sufficient memory to execute a query entirely in memory. You can use this information to troubleshoot and tune queries, and to choose values for the HAWQ configuration parameters hawq_workfile_limit_per_query and hawq_workfile_limit_per_segment.

Views in the hawq_toolkit schema include:

  • hawq_workfile_entries - one row for each operator currently using disk space for workfiles on a segment
  • hawq_workfile_usage_per_query - one row for each running query currently using disk space for workfiles on a segment
  • hawq_workfile_usage_per_segment - one row for each segment where each row displays the total amount of disk space currently in use for workfiles on the segment

HAWQ Error Codes

The following section describes SQL error codes for certain database events.

SQL Standard Error Codes

The following table lists all the defined error codes. Some are not used, but are defined by the SQL standard. The error classes are also shown. For each error class there is a standard error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.

The PL/pgSQL condition name for each error code is the same as the phrase shown in the table, with underscores substituted for spaces. For example, code 22012, DIVISION BY ZERO, has condition name DIVISION_BY_ZERO. Condition names can be written in either upper or lower case.
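
The naming rule above is mechanical, so it can be expressed in a few lines. The following Python sketch derives a condition name from the phrase in the Meaning column and compares two names without regard to case. The function names are hypothetical and exist only for this illustration.

```python
def condition_name(meaning: str) -> str:
    """Replace the spaces in a Meaning phrase with underscores.

    The case of the input is preserved, since condition names can be
    written in either upper or lower case.
    """
    return "_".join(meaning.split())


def same_condition(name_a: str, name_b: str) -> bool:
    """Compare two condition names without regard to case."""
    return name_a.lower() == name_b.lower()


if __name__ == "__main__":
    print(condition_name("DIVISION BY ZERO"))
```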

Note: PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.
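
The class structure described above can be checked programmatically. The following Python sketch takes the first two characters of a code as its class, builds the class-level code ending in 000, and treats classes 00, 01, and 02 as non-error classes, as the note states. The function names are hypothetical, and the validation assumes that a code is exactly five characters drawn from digits and upper-case letters.

```python
import re

# Classes that report success or warnings rather than errors (see the note above).
NON_ERROR_CLASSES = {"00", "01", "02"}

_SQLSTATE_PATTERN = re.compile(r"[0-9A-Z]{5}")


def sqlstate_class(code: str) -> str:
    """Return the two-character class of a five-character error code."""
    if not _SQLSTATE_PATTERN.fullmatch(code):
        raise ValueError(f"not a five-character error code: {code!r}")
    return code[:2]


def class_default_code(code: str) -> str:
    """Return the class-level code, which ends in 000."""
    return sqlstate_class(code) + "000"


def is_error(code: str) -> bool:
    """Return True unless the code belongs to class 00, 01, or 02."""
    return sqlstate_class(code) not in NON_ERROR_CLASSES


if __name__ == "__main__":
    for code in ("22012", "01000", "57014"):
        print(code, sqlstate_class(code), class_default_code(code), is_error(code))
```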

Error Code | Meaning | Constant
Class 00 — Successful Completion
00000 | SUCCESSFUL COMPLETION | successful_completion
Class 01 — Warning
01000 | WARNING | warning
0100C | DYNAMIC RESULT SETS RETURNED | dynamic_result_sets_returned
01008 | IMPLICIT ZERO BIT PADDING | implicit_zero_bit_padding
01003 | NULL VALUE ELIMINATED IN SET FUNCTION | null_value_eliminated_in_set_function
01007 | PRIVILEGE NOT GRANTED | privilege_not_granted
01006 | PRIVILEGE NOT REVOKED | privilege_not_revoked
01004 | STRING DATA RIGHT TRUNCATION | string_data_right_truncation
01P01 | DEPRECATED FEATURE | deprecated_feature
Class 02 — No Data (this is also a warning class per the SQL standard)
02000 | NO DATA | no_data
02001 | NO ADDITIONAL DYNAMIC RESULT SETS RETURNED | no_additional_dynamic_result_sets_returned
Class 03 — SQL Statement Not Yet Complete
03000 | SQL STATEMENT NOT YET COMPLETE | sql_statement_not_yet_complete
Class 08 — Connection Exception
08000 | CONNECTION EXCEPTION | connection_exception
08003 | CONNECTION DOES NOT EXIST | connection_does_not_exist
08006 | CONNECTION FAILURE | connection_failure
08001 | SQLCLIENT UNABLE TO ESTABLISH SQLCONNECTION | sqlclient_unable_to_establish_sqlconnection
08004 | SQLSERVER REJECTED ESTABLISHMENT OF SQLCONNECTION | sqlserver_rejected_establishment_of_sqlconnection
08007 | TRANSACTION RESOLUTION UNKNOWN | transaction_resolution_unknown
08P01 | PROTOCOL VIOLATION | protocol_violation
Class 09 — Triggered Action Exception
09000 | TRIGGERED ACTION EXCEPTION | triggered_action_exception
Class 0A — Feature Not Supported
0A000 | FEATURE NOT SUPPORTED | feature_not_supported
Class 0B — Invalid Transaction Initiation
0B000 | INVALID TRANSACTION INITIATION | invalid_transaction_initiation
Class 0F — Locator Exception
0F000 | LOCATOR EXCEPTION | locator_exception
0F001 | INVALID LOCATOR SPECIFICATION | invalid_locator_specification
Class 0L — Invalid Grantor
0L000 | INVALID GRANTOR | invalid_grantor
0LP01 | INVALID GRANT OPERATION | invalid_grant_operation
Class 0P — Invalid Role Specification
0P000 | INVALID ROLE SPECIFICATION | invalid_role_specification
Class 21 — Cardinality Violation
21000 | CARDINALITY VIOLATION | cardinality_violation
Class 22 — Data Exception
22000 | DATA EXCEPTION | data_exception
2202E | ARRAY SUBSCRIPT ERROR | array_subscript_error
22021 | CHARACTER NOT IN REPERTOIRE | character_not_in_repertoire
22008 | DATETIME FIELD OVERFLOW | datetime_field_overflow
22012 | DIVISION BY ZERO | division_by_zero
22005 | ERROR IN ASSIGNMENT | error_in_assignment
2200B | ESCAPE CHARACTER CONFLICT | escape_character_conflict
22022 | INDICATOR OVERFLOW | indicator_overflow
22015 | INTERVAL FIELD OVERFLOW | interval_field_overflow
2201E | INVALID ARGUMENT FOR LOGARITHM | invalid_argument_for_logarithm
2201F | INVALID ARGUMENT FOR POWER FUNCTION | invalid_argument_for_power_function
2201G | INVALID ARGUMENT FOR WIDTH BUCKET FUNCTION | invalid_argument_for_width_bucket_function
22018 | INVALID CHARACTER VALUE FOR CAST | invalid_character_value_for_cast
22007 | INVALID DATETIME FORMAT | invalid_datetime_format
22019 | INVALID ESCAPE CHARACTER | invalid_escape_character
2200D | INVALID ESCAPE OCTET | invalid_escape_octet
22025 | INVALID ESCAPE SEQUENCE | invalid_escape_sequence
22P06 | NONSTANDARD USE OF ESCAPE CHARACTER | nonstandard_use_of_escape_character
22010 | INVALID INDICATOR PARAMETER VALUE | invalid_indicator_parameter_value
22020 | INVALID LIMIT VALUE | invalid_limit_value
22023 | INVALID PARAMETER VALUE | invalid_parameter_value
2201B | INVALID REGULAR EXPRESSION | invalid_regular_expression
22009 | INVALID TIME ZONE DISPLACEMENT VALUE | invalid_time_zone_displacement_value
2200C | INVALID USE OF ESCAPE CHARACTER | invalid_use_of_escape_character
2200G | MOST SPECIFIC TYPE MISMATCH | most_specific_type_mismatch
22004 | NULL VALUE NOT ALLOWED | null_value_not_allowed
22002 | NULL VALUE NO INDICATOR PARAMETER | null_value_no_indicator_parameter
22003 | NUMERIC VALUE OUT OF RANGE | numeric_value_out_of_range
22026 | STRING DATA LENGTH MISMATCH | string_data_length_mismatch
22001 | STRING DATA RIGHT TRUNCATION | string_data_right_truncation
22011 | SUBSTRING ERROR | substring_error
22027 | TRIM ERROR | trim_error
22024 | UNTERMINATED C STRING | unterminated_c_string
2200F | ZERO LENGTH CHARACTER STRING | zero_length_character_string
22P01 | FLOATING POINT EXCEPTION | floating_point_exception
22P02 | INVALID TEXT REPRESENTATION | invalid_text_representation
22P03 | INVALID BINARY REPRESENTATION | invalid_binary_representation
22P04 | BAD COPY FILE FORMAT | bad_copy_file_format
22P05 | UNTRANSLATABLE CHARACTER | untranslatable_character
Class 23 — Integrity Constraint Violation
23000 | INTEGRITY CONSTRAINT VIOLATION | integrity_constraint_violation
23001 | RESTRICT VIOLATION | restrict_violation
23502 | NOT NULL VIOLATION | not_null_violation
23503 | FOREIGN KEY VIOLATION | foreign_key_violation
23505 | UNIQUE VIOLATION | unique_violation
23514 | CHECK VIOLATION | check_violation
Class 24 — Invalid Cursor State
24000 | INVALID CURSOR STATE | invalid_cursor_state
Class 25 — Invalid Transaction State
25000 | INVALID TRANSACTION STATE | invalid_transaction_state
25001 | ACTIVE SQL TRANSACTION | active_sql_transaction
25002 | BRANCH TRANSACTION ALREADY ACTIVE | branch_transaction_already_active
25008 | HELD CURSOR REQUIRES SAME ISOLATION LEVEL | held_cursor_requires_same_isolation_level
25003 | INAPPROPRIATE ACCESS MODE FOR BRANCH TRANSACTION | inappropriate_access_mode_for_branch_transaction
25004 | INAPPROPRIATE ISOLATION LEVEL FOR BRANCH TRANSACTION | inappropriate_isolation_level_for_branch_transaction
25005 | NO ACTIVE SQL TRANSACTION FOR BRANCH TRANSACTION | no_active_sql_transaction_for_branch_transaction
25006 | READ ONLY SQL TRANSACTION | read_only_sql_transaction
25007 | SCHEMA AND DATA STATEMENT MIXING NOT SUPPORTED | schema_and_data_statement_mixing_not_supported
25P01 | NO ACTIVE SQL TRANSACTION | no_active_sql_transaction
25P02 | IN FAILED SQL TRANSACTION | in_failed_sql_transaction
Class 26 — Invalid SQL Statement Name
26000 | INVALID SQL STATEMENT NAME | invalid_sql_statement_name
Class 27 — Triggered Data Change Violation
27000 | TRIGGERED DATA CHANGE VIOLATION | triggered_data_change_violation
Class 28 — Invalid Authorization Specification
28000 | INVALID AUTHORIZATION SPECIFICATION | invalid_authorization_specification
Class 2B — Dependent Privilege Descriptors Still Exist
2B000 | DEPENDENT PRIVILEGE DESCRIPTORS STILL EXIST | dependent_privilege_descriptors_still_exist
2BP01 | DEPENDENT OBJECTS STILL EXIST | dependent_objects_still_exist
Class 2D — Invalid Transaction Termination
2D000 | INVALID TRANSACTION TERMINATION | invalid_transaction_termination
Class 2F — SQL Routine Exception
2F000 | SQL ROUTINE EXCEPTION | sql_routine_exception
2F005 | FUNCTION EXECUTED NO RETURN STATEMENT | function_executed_no_return_statement
2F002 | MODIFYING SQL DATA NOT PERMITTED | modifying_sql_data_not_permitted
2F003 | PROHIBITED SQL STATEMENT ATTEMPTED | prohibited_sql_statement_attempted
2F004 | READING SQL DATA NOT PERMITTED | reading_sql_data_not_permitted
Class 34 — Invalid Cursor Name
34000 | INVALID CURSOR NAME | invalid_cursor_name
Class 38 — External Routine Exception
38000 | EXTERNAL ROUTINE EXCEPTION | external_routine_exception
38001 | CONTAINING SQL NOT PERMITTED | containing_sql_not_permitted
38002 | MODIFYING SQL DATA NOT PERMITTED | modifying_sql_data_not_permitted
38003 | PROHIBITED SQL STATEMENT ATTEMPTED | prohibited_sql_statement_attempted
38004 | READING SQL DATA NOT PERMITTED | reading_sql_data_not_permitted
Class 39 — External Routine Invocation Exception
39000 | EXTERNAL ROUTINE INVOCATION EXCEPTION | external_routine_invocation_exception
39001 | INVALID SQLSTATE RETURNED | invalid_sqlstate_returned
39004 | NULL VALUE NOT ALLOWED | null_value_not_allowed
39P01 | TRIGGER PROTOCOL VIOLATED | trigger_protocol_violated
39P02 | SRF PROTOCOL VIOLATED | srf_protocol_violated
Class 3B — Savepoint Exception
3B000 | SAVEPOINT EXCEPTION | savepoint_exception
3B001 | INVALID SAVEPOINT SPECIFICATION | invalid_savepoint_specification
Class 3D — Invalid Catalog Name
3D000 | INVALID CATALOG NAME | invalid_catalog_name
Class 3F — Invalid Schema Name
3F000 | INVALID SCHEMA NAME | invalid_schema_name
Class 40 — Transaction Rollback
40000 | TRANSACTION ROLLBACK | transaction_rollback
40002 | TRANSACTION INTEGRITY CONSTRAINT VIOLATION | transaction_integrity_constraint_violation
40001 | SERIALIZATION FAILURE | serialization_failure
40003 | STATEMENT COMPLETION UNKNOWN | statement_completion_unknown
40P01 | DEADLOCK DETECTED | deadlock_detected
Class 42 — Syntax Error or Access Rule Violation
42000 | SYNTAX ERROR OR ACCESS RULE VIOLATION | syntax_error_or_access_rule_violation
42601 | SYNTAX ERROR | syntax_error
42501 | INSUFFICIENT PRIVILEGE | insufficient_privilege
42846 | CANNOT COERCE | cannot_coerce
42803 | GROUPING ERROR | grouping_error
42830 | INVALID FOREIGN KEY | invalid_foreign_key
42602 | INVALID NAME | invalid_name
42622 | NAME TOO LONG | name_too_long
42939 | RESERVED NAME | reserved_name
42804 | DATATYPE MISMATCH | datatype_mismatch
42P18 | INDETERMINATE DATATYPE | indeterminate_datatype
42809 | WRONG OBJECT TYPE | wrong_object_type
42703 | UNDEFINED COLUMN | undefined_column
42883 | UNDEFINED FUNCTION | undefined_function
42P01 | UNDEFINED TABLE | undefined_table
42P02 | UNDEFINED PARAMETER | undefined_parameter
42704 | UNDEFINED OBJECT | undefined_object
42701 | DUPLICATE COLUMN | duplicate_column
42P03 | DUPLICATE CURSOR | duplicate_cursor
42P04 | DUPLICATE DATABASE | duplicate_database
42723 | DUPLICATE FUNCTION | duplicate_function
42P05 | DUPLICATE PREPARED STATEMENT | duplicate_prepared_statement
42P06 | DUPLICATE SCHEMA | duplicate_schema
42P07 | DUPLICATE TABLE | duplicate_table
42712 | DUPLICATE ALIAS | duplicate_alias
42710 | DUPLICATE OBJECT | duplicate_object
42702 | AMBIGUOUS COLUMN | ambiguous_column
42725 | AMBIGUOUS FUNCTION | ambiguous_function
42P08 | AMBIGUOUS PARAMETER | ambiguous_parameter
42P09 | AMBIGUOUS ALIAS | ambiguous_alias
42P10 | INVALID COLUMN REFERENCE | invalid_column_reference
42611 | INVALID COLUMN DEFINITION | invalid_column_definition
42P11 | INVALID CURSOR DEFINITION | invalid_cursor_definition
42P12 | INVALID DATABASE DEFINITION | invalid_database_definition
42P13 | INVALID FUNCTION DEFINITION | invalid_function_definition
42P14 | INVALID PREPARED STATEMENT DEFINITION | invalid_prepared_statement_definition
42P15 | INVALID SCHEMA DEFINITION | invalid_schema_definition
42P16 | INVALID TABLE DEFINITION | invalid_table_definition
42P17 | INVALID OBJECT DEFINITION | invalid_object_definition
Class 44 — WITH CHECK OPTION Violation
44000 | WITH CHECK OPTION VIOLATION | with_check_option_violation
Class 53 — Insufficient Resources
53000 | INSUFFICIENT RESOURCES | insufficient_resources
53100 | DISK FULL | disk_full
53200 | OUT OF MEMORY | out_of_memory
53300 | TOO MANY CONNECTIONS | too_many_connections
Class 54 — Program Limit Exceeded
54000 | PROGRAM LIMIT EXCEEDED | program_limit_exceeded
54001 | STATEMENT TOO COMPLEX | statement_too_complex
54011 | TOO MANY COLUMNS | too_many_columns
54023 | TOO MANY ARGUMENTS | too_many_arguments
Class 55 — Object Not In Prerequisite State
55000 | OBJECT NOT IN PREREQUISITE STATE | object_not_in_prerequisite_state
55006 | OBJECT IN USE | object_in_use
55P02 | CANT CHANGE RUNTIME PARAM | cant_change_runtime_param
55P03 | LOCK NOT AVAILABLE | lock_not_available
Class 57 — Operator Intervention
57000 | OPERATOR INTERVENTION | operator_intervention
57014 | QUERY CANCELED | query_canceled
57P01 | ADMIN SHUTDOWN | admin_shutdown
57P02 | CRASH SHUTDOWN | crash_shutdown
57P03 | CANNOT CONNECT NOW | cannot_connect_now
Class 58 — System Error (errors external to HAWQ)
58030 | IO ERROR | io_error
58P01 | UNDEFINED FILE | undefined_file
58P02 | DUPLICATE FILE | duplicate_file
Class F0 — Configuration File Error
F0000 | CONFIG FILE ERROR | config_file_error
F0001 | LOCK FILE EXISTS | lock_file_exists
Class P0 — PL/pgSQL Error
P0000 | PLPGSQL ERROR | plpgsql_error
P0001 | RAISE EXCEPTION | raise_exception
P0002 | NO DATA FOUND | no_data_found
P0003 | TOO MANY ROWS | too_many_rows
Class XX — Internal Error
XX000 | INTERNAL ERROR | internal_error
XX001 | DATA CORRUPTED | data_corrupted
XX002 | INDEX CORRUPTED | index_corrupted