4.5. Resource Groups

Resource groups place limits on resource usage, and can enforce queueing policies on queries that run within them or divide their resources among sub-groups. A query belongs to a single resource group, and consumes resources from that group (and its ancestors). Except for the limit on queued queries, when a resource group runs out of a resource it does not cause running queries to fail; instead new queries become queued. A resource group may have sub-groups or may accept queries, but may not do both.
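
The admission behaviour described above can be illustrated with a small sketch. It is not Presto's implementation: the Group class and submit function are hypothetical, only the concurrency and queue limits (hardConcurrencyLimit and maxQueued, described under Resource Group Properties) are modelled, and it assumes that queued queries count against every ancestor as well as the group itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    """Hypothetical in-memory model of one resource group (not a Presto class)."""
    name: str
    hard_concurrency_limit: int
    max_queued: int
    parent: Optional["Group"] = None
    running: int = 0
    queued: int = 0

    def lineage(self):
        """Yield this group, then each ancestor up to the root."""
        node = self
        while node is not None:
            yield node
            node = node.parent


def submit(group: Group) -> str:
    """Return 'RUNNING', 'QUEUED', or 'REJECTED' for a new query in `group`."""
    # A query consumes resources from its group and all ancestors,
    # so a slot must be free at every level before it can start.
    if all(g.running < g.hard_concurrency_limit for g in group.lineage()):
        for g in group.lineage():
            g.running += 1
        return "RUNNING"
    # Out of slots: running queries are untouched and the new query waits,
    # unless a queue limit has already been reached (assumed to apply
    # at every level of the lineage).
    if any(g.queued >= g.max_queued for g in group.lineage()):
        return "REJECTED"
    for g in group.lineage():
        g.queued += 1
    return "QUEUED"
```

With a leaf group limited to one running and one queued query, three consecutive submissions would be started, queued, and rejected, in that order.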

The resource groups and associated selection rules are configured by a pluggable manager. Add an etc/resource-groups.properties file with the following contents to enable the built-in manager that reads a JSON config file:

    resource-groups.configuration-manager=file
    resource-groups.config-file=etc/resource_groups.json

Change the value of resource-groups.config-file to point to a JSON config file, which can be an absolute path, or a path relative to the Presto data directory.

Resource Group Properties

  • name (required): name of the group. May be a template (see below).

  • maxQueued (required): maximum number of queued queries. Once this limit is reached, new queries will be rejected.

  • hardConcurrencyLimit (required): maximum number of running queries.

  • softMemoryLimit (required): maximum amount of distributed memory this group may use before new queries become queued. May be specified as an absolute value (e.g. 1GB) or as a percentage (e.g. 10%) of the cluster’s memory.

  • softCpuLimit (optional): maximum amount of CPU time this group may use in a period (see cpuQuotaPeriod) before a penalty will be applied to the maximum number of running queries. hardCpuLimit must also be specified.

  • hardCpuLimit (optional): maximum amount of CPU time this group may use in a period.

  • schedulingPolicy (optional): specifies how queued queries are selected to run, and how sub-groups become eligible to start their queries. May be one of four values:

    • fair (default): queued queries are processed first-in-first-out, and sub-groups must take turns starting new queries (if they have any queued).
    • weighted_fair: sub-groups are selected based on their schedulingWeight and the number of queries they are already running concurrently. The expected share of running queries for a sub-group is computed based on the weights for all currently eligible sub-groups. The sub-group with the least concurrency relative to its share is selected to start the next query.
    • weighted: queued queries are selected stochastically in proportion to their priority (specified via the query_priority session property). Sub-groups are selected to start new queries in proportion to their schedulingWeight.
    • query_priority: all sub-groups must also be configured with query_priority. Queued queries will be selected strictly according to their priority.

  • schedulingWeight (optional): weight of this sub-group. See above. Defaults to 1.

  • jmxExport (optional): If true, group statistics are exported to JMX for monitoring. Defaults to false.

  • subGroups (optional): list of sub-groups.
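
As an illustration of the weighted_fair policy described above, the following sketch picks the sub-group that should start the next query. The exact formula Presto uses is not given here, so the sketch assumes that a sub-group's expected share is its schedulingWeight divided by the total weight of all eligible sub-groups, and that a sub-group is eligible when it has queued queries and a free concurrency slot. The function name and the "running" and "queued" keys are hypothetical.

```python
def pick_next_subgroup(subgroups):
    """Return the name of the sub-group that starts the next query, or None.

    Each entry is a dict with 'name', 'running', 'queued',
    'hardConcurrencyLimit' and, optionally, 'schedulingWeight'.
    """
    # Assumption: eligible means "has queued work and a free slot".
    eligible = [
        g for g in subgroups
        if g["queued"] > 0 and g["running"] < g["hardConcurrencyLimit"]
    ]
    if not eligible:
        return None

    # schedulingWeight defaults to 1 when omitted.
    total_weight = sum(g.get("schedulingWeight", 1) for g in eligible)

    def concurrency_relative_to_share(g):
        share = g.get("schedulingWeight", 1) / total_weight
        return g["running"] / share

    # The sub-group furthest below its expected share goes next.
    return min(eligible, key=concurrency_relative_to_share)["name"]
```

For two eligible sub-groups with weights 1 and 3 that are running 1 and 2 queries respectively, the second one is selected: it holds three quarters of the weight but only two thirds of the running queries.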

Selector Rules

  • user (optional): regex to match against user name.

  • source (optional): regex to match against source string.

  • queryType (optional): string to match against the type of the query submitted:

    • DATA_DEFINITION: Queries that alter/create/drop the metadata of schemas/tables/views, and that manage prepared statements, privileges, sessions, and transactions.
    • DELETE: DELETE queries.
    • DESCRIBE: DESCRIBE, DESCRIBE INPUT, DESCRIBE OUTPUT, and SHOW queries.
    • EXPLAIN: EXPLAIN queries.
    • INSERT: INSERT and CREATE TABLE AS queries.
    • SELECT: SELECT queries.

  • clientTags (optional): list of tags. To match, every tag in this list must be in the list of client-provided tags associated with the query.

  • group (required): the group these queries will run in.

Global Properties

  • cpuQuotaPeriod (optional): the period in which CPU quotas are enforced.

Selectors are processed sequentially and the first one that matches will be used.
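
The selector rules can be sketched in a few lines of Python. This is an illustration, not Presto's code: the function names and the shape of the query dict are hypothetical, and the sketch assumes that the user and source regular expressions must match the whole value. Because Python's re module does not accept Java-style named groups such as (?<name>...), the sketch rewrites them to (?P<name>...) first.

```python
import re


def to_python_regex(pattern):
    """Rewrite Java-style named groups (?<name>...) as Python's (?P<name>...)."""
    return re.sub(r"\(\?<([A-Za-z][A-Za-z0-9_]*)>", r"(?P<\1>", pattern)


def match_selector(selector, query):
    """Return the captured named variables if `selector` matches, else None."""
    variables = {}
    for field in ("user", "source"):
        if field in selector:
            value = query.get(field)
            if value is None:
                return None
            # Assumption: the regex must match the entire value.
            m = re.fullmatch(to_python_regex(selector[field]), value)
            if m is None:
                return None
            variables.update(m.groupdict())
    if "queryType" in selector and selector["queryType"] != query.get("queryType"):
        return None
    # Every tag listed in the selector must be among the client-provided tags.
    if not set(selector.get("clientTags", [])) <= set(query.get("clientTags", [])):
        return None
    return variables


def select_group(selectors, query):
    """Process selectors in order; the first match wins."""
    for selector in selectors:
        variables = match_selector(selector, query)
        if variables is not None:
            return selector["group"], variables
    return None


selectors = [
    {"user": "bob", "group": "admin"},
    {"source": "jdbc#(?<tool_name>.*)", "clientTags": ["hipri"],
     "group": "global.adhoc.bi-${tool_name}.${USER}"},
    {"group": "global.adhoc.other.${USER}"},
]
query = {"user": "kayla", "source": "jdbc#powerfulbi",
         "clientTags": ["hipri", "fast"], "queryType": "SELECT"}
result = select_group(selectors, query)
```

Here result holds the unexpanded group template of the second selector together with the captured tool_name variable. A selector with no conditions, such as the last one, matches every query and therefore works as a catch-all.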

Providing Selector Properties

The source name can be set as follows:

  • CLI: use the --source option.
  • JDBC: set the ApplicationName client info property on the Connection instance.

Client tags can be set as follows:

  • CLI: use the --client-tags option.
  • JDBC: set the ClientTags client info property on the Connection instance.

Example

In the example configuration below, there are several resource groups, some of which are templates. Templates allow administrators to construct resource group trees dynamically. For example, in the pipeline_${USER} group, ${USER} will be expanded to the name of the user that submitted the query. ${SOURCE} is also supported, which will be expanded to the source that submitted the query. You may also use custom named variables in the source and user regular expressions.
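
Template expansion can be mimicked with Python's string.Template, which happens to use the same ${NAME} placeholder syntax. This is only a sketch of the substitution step; expand_group and its parameters are hypothetical names, and the custom variables are assumed to have been captured already from the source or user regular expression.

```python
from string import Template


def expand_group(template, user, source="", variables=None):
    """Expand ${USER}, ${SOURCE} and any custom named variables in a group name."""
    values = {"USER": user, "SOURCE": source}
    values.update(variables or {})
    # substitute() raises KeyError if the template uses an unknown variable.
    return Template(template).substitute(values)


per_user = expand_group("pipeline_${USER}", user="alice")
per_tool = expand_group(
    "global.adhoc.bi-${tool_name}.${USER}",
    user="kayla",
    source="jdbc#powerfulbi",
    variables={"tool_name": "powerfulbi"},
)
```

In this sketch per_user becomes pipeline_alice and per_tool becomes global.adhoc.bi-powerfulbi.kayla.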

There are five selectors that define which queries run in which resource group:

  • The first selector matches queries from bob and places them in the admin group.
  • The second selector matches all data definition (DDL) queries from a source name that includes “pipeline” and places them in the global.data_definition group. This could help reduce queue times for this class of queries, since they are expected to be fast.
  • The third selector matches queries from a source name that includes “pipeline”, and places them in a dynamically-created per-user pipeline group under the global.pipeline group.
  • The fourth selector matches queries that come from BI tools (which have a source matching the regular expression "jdbc#(?<tool_name>.*)"), and have client-provided tags that are a superset of “hipri”. These are placed in a dynamically-created sub-group under the global.adhoc group. The dynamic sub-group will be created based on the named variable tool_name, which is extracted from the regular expression for source. Consider a query with a source “jdbc#powerfulbi”, user “kayla”, and client tags “hipri” and “fast”. This query would be routed to the global.adhoc.bi-powerfulbi.kayla resource group.
  • The last selector is a catch-all, which places all queries that have not yet been matched into a per-user adhoc group.

Together, these selectors implement the following policy:

  • The user “bob” is an admin and can run up to 50 concurrent queries. Queries will be run based on user-provided priority.

For the remaining users:

  • No more than 100 total queries may run concurrently.

  • Up to 5 concurrent DDL queries with a source “pipeline” can run. Queries are run in FIFO order.
  • Non-DDL queries will run under the global.pipeline group, with a total concurrency of 45, and a per-user concurrency of 5. Queries are run in FIFO order.
  • For BI tools, each tool can run up to 10 concurrent queries, and each user can run up to 3. If the total demand exceeds the limit of 10, the user with the fewest running queries will get the next concurrency slot. This policy results in fairness when under contention.
  • All remaining queries are placed into a per-user group under global.adhoc.other that behaves similarly.

    {
      "rootGroups": [
        {
          "name": "global",
          "softMemoryLimit": "80%",
          "hardConcurrencyLimit": 100,
          "maxQueued": 1000,
          "schedulingPolicy": "weighted",
          "jmxExport": true,
          "subGroups": [
            {
              "name": "data_definition",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 5,
              "maxQueued": 100,
              "schedulingWeight": 1
            },
            {
              "name": "adhoc",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 50,
              "maxQueued": 1,
              "schedulingWeight": 10,
              "subGroups": [
                {
                  "name": "other",
                  "softMemoryLimit": "10%",
                  "hardConcurrencyLimit": 2,
                  "maxQueued": 1,
                  "schedulingWeight": 10,
                  "schedulingPolicy": "weighted_fair",
                  "subGroups": [
                    {
                      "name": "${USER}",
                      "softMemoryLimit": "10%",
                      "hardConcurrencyLimit": 1,
                      "maxQueued": 100
                    }
                  ]
                },
                {
                  "name": "bi-${tool_name}",
                  "softMemoryLimit": "10%",
                  "hardConcurrencyLimit": 10,
                  "maxQueued": 100,
                  "schedulingWeight": 10,
                  "schedulingPolicy": "weighted_fair",
                  "subGroups": [
                    {
                      "name": "${USER}",
                      "softMemoryLimit": "10%",
                      "hardConcurrencyLimit": 3,
                      "maxQueued": 10
                    }
                  ]
                }
              ]
            },
            {
              "name": "pipeline",
              "softMemoryLimit": "80%",
              "hardConcurrencyLimit": 45,
              "maxQueued": 100,
              "schedulingWeight": 1,
              "jmxExport": true,
              "subGroups": [
                {
                  "name": "pipeline_${USER}",
                  "softMemoryLimit": "50%",
                  "hardConcurrencyLimit": 5,
                  "maxQueued": 100
                }
              ]
            }
          ]
        },
        {
          "name": "admin",
          "softMemoryLimit": "100%",
          "hardConcurrencyLimit": 50,
          "maxQueued": 100,
          "schedulingPolicy": "query_priority",
          "jmxExport": true
        }
      ],
      "selectors": [
        {
          "user": "bob",
          "group": "admin"
        },
        {
          "source": ".*pipeline.*",
          "queryType": "DATA_DEFINITION",
          "group": "global.data_definition"
        },
        {
          "source": ".*pipeline.*",
          "group": "global.pipeline.pipeline_${USER}"
        },
        {
          "source": "jdbc#(?<tool_name>.*)",
          "clientTags": ["hipri"],
          "group": "global.adhoc.bi-${tool_name}.${USER}"
        },
        {
          "group": "global.adhoc.other.${USER}"
        }
      ],
      "cpuQuotaPeriod": "1h"
    }
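
Before deploying a configuration like the one above, it can be useful to check that every selector points at a group that exists and can accept queries (that is, a group with no sub-groups). The sketch below is a hypothetical helper, not part of Presto; it assumes that each dot-separated segment of a selector's group path must equal the name of a configured group literally, templates included.

```python
import json


def load_config(path):
    """Read a resource group JSON config file from disk."""
    with open(path) as f:
        return json.load(f)


def find_group(groups, segments):
    """Follow `segments` down through nested subGroups; return the group or None."""
    node = None
    for segment in segments:
        node = next((g for g in groups if g["name"] == segment), None)
        if node is None:
            return None
        groups = node.get("subGroups", [])
    return node


def check_selectors(config):
    """Return a list of human-readable problems found in the selector targets."""
    problems = []
    for selector in config["selectors"]:
        target = selector["group"]
        group = find_group(config["rootGroups"], target.split("."))
        if group is None:
            problems.append(f"{target}: no such group")
        elif group.get("subGroups"):
            # A group may have sub-groups or accept queries, but not both.
            problems.append(f"{target}: has sub-groups, cannot accept queries")
    return problems
```

Calling check_selectors(load_config("etc/resource_groups.json")) on the example configuration should return an empty list, since all five selectors target leaf groups.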