Analyzing Resource Manager Status

You can use several queries to force the resource manager to dump more details about active resource context status, current resource queue status, and HAWQ segment status.

Connection Track Status

Any query execution requiring resource allocation from HAWQ resource manager has one connection track instance tracking the whole resource usage lifecycle. You can find all resource requests and allocated resources in this dump.

The following is an example query to obtain connection track status:

  1. postgres=# SELECT * FROM dump_resource_manager_status(1);
  1. dump_resource_manager_status
  2. ----------------------------------------------------------------------------------------
  3. Dump resource manager connection track status to /tmp/resource_manager_conntrack_status
  4. (1 row)

The following output is an example of resource context (connection track) status.

  1. Number of free connection ids : 65535
  2. Number of connection tracks having requests to handle : 0
  3. Number of connection tracks having responses to send : 0SOCK(client=192.0.2.0:37396:time=2015-11-15-20:54:35.379006),
  4. CONN(id=44:user=role_2:queue=queue2:prog=3:time=2015-11-15-20:54:35.378631:lastact=2015-11-15-20:54:35.378631:
  5. headqueue=2015-11-15-20:54:35.378631),ALLOC(session=89:resource=(1024 MB, 0.250000 CORE)x(1:min=1:act=-1):
  6. slicesize=5:io bytes size=3905568:vseg limit per seg=8:vseg limit per query=1000:fixsegsize=1:reqtime=2015-11-15-20:54:35.379144:
  7. alloctime=2015-11-15-20:54:35.379144:stmt=128 MB x 0),LOC(size=3:host(sdw3:3905568):host(sdw2:3905568):
  8. host(sdw1:3905568)),RESOURCE(hostsize=0),MSG(id=259:size=96:contsize=96:recvtime=1969-12-31-16:00:00.0,
  9. client=192.0.2.0:37396),COMMSTAT(fd=5:readbuffer=0:writebuffer=0
  10. buffers:toclose=false:forceclose=false)
Output FieldDescription
Number of free connection idsProvides connection track id resource. HAWQ resource manager supports maximum 65536 live connection track instances.
Number of connection tracks having requests to handleCounts the number of requests accepted by resource manager but not processed yet.
Number of connection tracks having responses to sendCounts the number of responses generated by resource manager but not sent out yet.
SOCKProvides the request socket connection information.
CONNProvides the information about the role name, target queue, current status of the request:

prog=1 means the connection is established

prog=2 means the connection is registered by role id

prog=3 means the connection is waiting for resource in the target queue

prog=4 means the resource has been allocated to this connection

prog>5 means some failure or abnormal statuses
ALLOCProvides session id information, resource expectation, session level resource limits, statement level resource settings, estimated query workload by slice number, and so on.
LOCProvides query scan HDFS data locality information.
RESOURCEProvides information on the already allocated resource.
MSGProvides the latest received message information.
COMMSTATShows current socket communication buffer status.

Resource Queue Status

You can get more details of the status of resource queues.

Besides the information provided in pg_resqueue_status, you can also get YARN resource queue maximum capacity report, total number of HAWQ resource queues, and HAWQ resource queues’ derived resource capacities.

The following is a query to obtain resource queue status:

  1. postgres=# SELECT * FROM dump_resource_manager_status(2);
  1. dump_resource_manager_status
  2. -------------------------------------------------------------------------------------
  3. Dump resource manager resource queue status to /tmp/resource_manager_resqueue_status
  4. (1 row)

The possible output of resource queue status is shown as below.

  1. Maximum capacity of queue in global resource manager cluster 1.000000
  2. Number of resource queues : 4
  3. QUEUE(name=pg_root:parent=NULL:children=3:busy=0:paused=0),
  4. REQ(conn=0:request=0:running=0),
  5. SEGCAP(ratio=4096:ratioidx=-1:segmem=128MB:segcore=0.031250:segnum=1536:segnummax=1536),
  6. QUECAP(memmax=196608:coremax=48.000000:memper=100.000000:mempermax=100.000000:coreper=100.000000:corepermax=100.000000),
  7. QUEUSE(alloc=(0 MB,0.000000 CORE):request=(0 MB,0.000000 CORE):inuse=(0 MB,0.000000 CORE))
  8. QUEUE(name=pg_default:parent=pg_root:children=0:busy=0:paused=0),
  9. REQ(conn=0:request=0:running=0),
  10. SEGCAP(ratio=4096:ratioidx=-1:segmem=1024MB:segcore=0.250000:segnum=38:segnummax=76),
  11. QUECAP(memmax=78643:coremax=19.000000:memper=20.000000:mempermax=40.000000:coreper=20.000000:corepermax=40.000000),
  12. QUEUSE(alloc=(0 MB,0.000000 CORE):request=(0 MB,0.000000 CORE):inuse=(0 MB,0.000000 CORE))
Output FieldDescription
Maximum capacity of queue in global resource manager clusterYARN maximum capacity report for the resource queue.
Number of resource queuesTotal number of HAWQ resource queues.
QUEUEProvides basic structural information about the resource queue and whether it is busy dispatching resources to some queries.
REQProvides concurrency counter and the status of waiting queues.
SEGCAPProvides the virtual segment resource quota and dispatchable number of virtual segments.
QUECAPProvides derived resource queue capacity and actual percentage of the cluster resource a queue can use.
QUEUSEProvides information about queue resource usage.

HAWQ Segment Status

Use the following query to obtain the status of a HAWQ segment.

  1. postgres=# SELECT * FROM dump_resource_manager_status(3);
  1. dump_resource_manager_status
  2. -----------------------------------------------------------------------------------
  3. Dump resource manager resource pool status to /tmp/resource_manager_respool_status
  4. (1 row)

The following output shows the status of a HAWQ segment status. This example describes a host named sdw1 having resource capacity 64GB memory and 16 vcore. It now has 64GB available resource ready for use and 16 containers are held.

  1. HOST_ID(id=0:hostname:sdw1)
  2. HOST_INFO(FTSTotalMemoryMB=65536:FTSTotalCore=16:GRMTotalMemoryMB=0:GRMTotalCore=0)
  3. HOST_AVAILABLITY(HAWQAvailable=true:GLOBAvailable=false)
  4. HOST_RESOURCE(AllocatedMemory=65536:AllocatedCores=16.000000:AvailableMemory=65536:
  5. AvailableCores=16.000000:IOBytesWorkload=0:SliceWorkload=0:LastUpdateTime=1447661681125637:
  6. RUAlivePending=false)
  7. HOST_RESOURCE_CONTAINERSET(ratio=4096:AllocatedMemory=65536:AvailableMemory=65536:
  8. AllocatedCore=16.000000:AvailableCore:16.000000)
  9. RESOURCE_CONTAINER(ID=0:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  10. RESOURCE_CONTAINER(ID=1:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  11. RESOURCE_CONTAINER(ID=2:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  12. RESOURCE_CONTAINER(ID=3:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  13. RESOURCE_CONTAINER(ID=4:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  14. RESOURCE_CONTAINER(ID=5:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  15. RESOURCE_CONTAINER(ID=6:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  16. RESOURCE_CONTAINER(ID=7:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  17. RESOURCE_CONTAINER(ID=8:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  18. RESOURCE_CONTAINER(ID=9:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  19. RESOURCE_CONTAINER(ID=10:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  20. RESOURCE_CONTAINER(ID=11:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  21. RESOURCE_CONTAINER(ID=12:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  22. RESOURCE_CONTAINER(ID=13:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  23. RESOURCE_CONTAINER(ID=14:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
  24. RESOURCE_CONTAINER(ID=15:MemoryMB=4096:Core=1:Life=0:HostName=sdw1)
Output FieldDescription
HOST_IDProvides the recognized segment name and internal id.
HOST_INFOProvides the configured segment resource capacities. GRMTotalMemoryMB and GRMTotalCore shows the limits reported by YARN, FTSTotalMemoryMB and FTSTotalCore show the limits configured in HAWQ.
HOST_AVAILABILITYShows if the segment is available from HAWQ fault tolerance service (FTS) view or YARN view.
HOST_RESOURCEShows current allocated and available resource. Estimated workload counters are also shown here.
HOST_RESOURCE_CONTAINERSETShows each held containers.