5.18. Thrift Connector

The Thrift connector makes it possible to integrate with external storage systems without a custom Presto connector implementation.

In order to use the Thrift connector with an external system, you need to implement the PrestoThriftService interface, found below. Next, you configure the Thrift connector to point to a set of machines, called Thrift servers, that implement the interface. As part of the interface implementation, the Thrift servers will provide metadata, splits and data. The Thrift server instances are assumed to be stateless and independent from each other.

Configuration

To configure the Thrift connector, create a catalog properties file etc/catalog/thrift.properties with the following content, replacing the properties as appropriate:

  connector.name=presto-thrift
  presto.thrift.client.addresses=host:port,host:port

Multiple Thrift Systems

You can have as many catalogs as you need, so if you have additional Thrift systems to connect to, simply add another properties file to etc/catalog with a different name (making sure it ends in .properties).
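
As an illustration, the following Python sketch renders one properties file per Thrift system. The catalog names (thrift_orders, thrift_events) and the server addresses are hypothetical; only the two property keys come from the configuration shown earlier. Presto names each catalog after its file, without the .properties suffix.

```python
def render_catalog(addresses):
    """Return the text of a Thrift catalog properties file."""
    lines = [
        "connector.name=presto-thrift",
        "presto.thrift.client.addresses=" + ",".join(addresses),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical catalog names and server addresses, for illustration only.
catalogs = {
    "thrift_orders": ["192.0.2.3:7777", "192.0.2.4:7779"],
    "thrift_events": ["192.0.2.10:7777"],
}

# One file per catalog; the file name determines the catalog name.
files = {
    f"etc/catalog/{name}.properties": render_catalog(addresses)
    for name, addresses in catalogs.items()
}

for path, text in files.items():
    print(f"# {path}")
    print(text)
```

With these two files in place, tables would be addressed through the thrift_orders and thrift_events catalogs respectively.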

Configuration Properties

The following configuration properties are available:

Property Name                              Description
presto.thrift.client.addresses             Location of Thrift servers
presto-thrift.max-response-size            Maximum size of data returned from Thrift server
presto-thrift.metadata-refresh-threads     Number of refresh threads for metadata cache
presto.thrift.client.max-retries           Maximum number of retries for failed Thrift requests
presto.thrift.client.max-backoff-delay     Maximum interval between retry attempts
presto.thrift.client.min-backoff-delay     Minimum interval between retry attempts
presto.thrift.client.max-retry-time        Maximum duration across all attempts of a Thrift request
presto.thrift.client.backoff-scale-factor  Scale factor for exponential back off
presto.thrift.client.connect-timeout       Connect timeout
presto.thrift.client.request-timeout       Request timeout
presto.thrift.client.socks-proxy           SOCKS proxy address
presto.thrift.client.max-frame-size        Maximum size of a raw Thrift response
presto.thrift.client.transport             Thrift transport type (UNFRAMED, FRAMED, HEADER)
presto.thrift.client.protocol              Thrift protocol type (BINARY, COMPACT, FB_COMPACT)
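
To make the interplay of the retry properties more concrete, here is a minimal Python sketch of a generic exponential backoff schedule. It assumes the delay starts at the minimum, is multiplied by the scale factor after each attempt, and is capped at the maximum. That formula is an assumption for illustration, not a description of the connector's actual code, and the values used are hypothetical rather than the connector's defaults. The max-retry-time limit is not modeled.

```python
def backoff_delays(min_delay_ms, max_delay_ms, scale_factor, max_retries):
    """Return the wait in milliseconds before each retry attempt."""
    delays = []
    delay = min_delay_ms
    for _ in range(max_retries):
        # Never wait longer than the configured maximum.
        delays.append(min(delay, max_delay_ms))
        delay *= scale_factor
    return delays


# Hypothetical settings, chosen only to show the shape of the schedule.
print(backoff_delays(min_delay_ms=10, max_delay_ms=100, scale_factor=2, max_retries=5))
# → [10, 20, 40, 80, 100]
```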

presto.thrift.client.addresses

Comma-separated list of Thrift servers in the form of host:port. For example:

  presto.thrift.client.addresses=192.0.2.3:7777,192.0.2.4:7779

This property is required; there is no default.
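
When generating or checking this value in deployment tooling, a small validation step can catch malformed entries early. The Python sketch below is a hypothetical helper (parse_addresses is not part of Presto) that splits the list and verifies each entry has the host:port shape described above.

```python
def parse_addresses(value):
    """Split a comma-separated host:port list into (host, port) tuples."""
    addresses = []
    for entry in value.split(","):
        # Split on the last colon so the port is always the final component.
        host, sep, port = entry.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        addresses.append((host, int(port)))
    return addresses


print(parse_addresses("192.0.2.3:7777,192.0.2.4:7779"))
# → [('192.0.2.3', 7777), ('192.0.2.4', 7779)]
```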

presto-thrift.max-response-size

Maximum size of a data response that the connector accepts. This value is sent by the connector to the Thrift server when requesting data, allowing it to size the response appropriately.

This property is optional; the default is 16MB.

presto-thrift.metadata-refresh-threads

Number of refresh threads for metadata cache.

This property is optional; the default is 1.

Thrift IDL File

The following IDL describes the PrestoThriftService that must be implemented:

  enum PrestoThriftBound {
    BELOW = 1;
    EXACTLY = 2;
    ABOVE = 3;
  }

  exception PrestoThriftServiceException {
    1: string message;
    2: bool retryable;
  }

  struct PrestoThriftNullableSchemaName {
    1: optional string schemaName;
  }

  struct PrestoThriftSchemaTableName {
    1: string schemaName;
    2: string tableName;
  }

  struct PrestoThriftTableMetadata {
    1: PrestoThriftSchemaTableName schemaTableName;
    2: list<PrestoThriftColumnMetadata> columns;
    3: optional string comment;

    /**
     * Returns a list of key sets which can be used for index lookups.
     * The list is expected to have only unique key sets.
     * {@code set<set<string>>} is not used here because some languages (like php) don't support it.
     */
    4: optional list<set<string>> indexableKeys;
  }

  struct PrestoThriftColumnMetadata {
    1: string name;
    2: string type;
    3: optional string comment;
    4: bool hidden;
  }

  struct PrestoThriftNullableColumnSet {
    1: optional set<string> columns;
  }

  struct PrestoThriftTupleDomain {
    /**
     * Return a map of column names to constraints.
     */
    1: optional map<string, PrestoThriftDomain> domains;
  }

  /**
   * Set that either includes all values, or excludes all values.
   */
  struct PrestoThriftAllOrNoneValueSet {
    1: bool all;
  }

  /**
   * A set containing values that are uniquely identifiable.
   * Assumes an infinite number of possible values. The values may be collectively included (aka whitelist)
   * or collectively excluded (aka !whitelist).
   * This structure is used with comparable, but not orderable types like "json", "map".
   */
  struct PrestoThriftEquatableValueSet {
    1: bool whiteList;
    2: list<PrestoThriftBlock> values;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Elements of {@code ints} array are values for each row. If row is null then value is ignored.
   */
  struct PrestoThriftInteger {
    1: optional list<bool> nulls;
    2: optional list<i32> ints;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Elements of {@code longs} array are values for each row. If row is null then value is ignored.
   */
  struct PrestoThriftBigint {
    1: optional list<bool> nulls;
    2: optional list<i64> longs;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Elements of {@code doubles} array are values for each row. If row is null then value is ignored.
   */
  struct PrestoThriftDouble {
    1: optional list<bool> nulls;
    2: optional list<double> doubles;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Each element of {@code sizes} array contains the length in bytes for the corresponding element.
   * If row is null then the corresponding element in {@code sizes} is ignored.
   * {@code bytes} array contains UTF-8 encoded byte values.
   * Values for all rows are written to {@code bytes} array one after another.
   * The total number of bytes must be equal to the sum of all sizes.
   */
  struct PrestoThriftVarchar {
    1: optional list<bool> nulls;
    2: optional list<i32> sizes;
    3: optional binary bytes;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Elements of {@code booleans} array are values for each row. If row is null then value is ignored.
   */
  struct PrestoThriftBoolean {
    1: optional list<bool> nulls;
    2: optional list<bool> booleans;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Elements of {@code dates} array are date values for each row represented as the number
   * of days passed since 1970-01-01.
   * If row is null then value is ignored.
   */
  struct PrestoThriftDate {
    1: optional list<bool> nulls;
    2: optional list<i32> dates;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Elements of {@code timestamps} array are values for each row represented as the number
   * of milliseconds passed since 1970-01-01T00:00:00 UTC.
   * If row is null then value is ignored.
   */
  struct PrestoThriftTimestamp {
    1: optional list<bool> nulls;
    2: optional list<i64> timestamps;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Each element of {@code sizes} array contains the length in bytes for the corresponding element.
   * If row is null then the corresponding element in {@code sizes} is ignored.
   * {@code bytes} array contains UTF-8 encoded byte values for string representation of json.
   * Values for all rows are written to {@code bytes} array one after another.
   * The total number of bytes must be equal to the sum of all sizes.
   */
  struct PrestoThriftJson {
    1: optional list<bool> nulls;
    2: optional list<i32> sizes;
    3: optional binary bytes;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Each element of {@code sizes} array contains the length in bytes for the corresponding element.
   * If row is null then the corresponding element in {@code sizes} is ignored.
   * {@code bytes} array contains encoded byte values for HyperLogLog representation as defined in
   * Airlift specification: https://github.com/airlift/airlift/blob/master/stats/docs/hll.md
   * Values for all rows are written to {@code bytes} array one after another.
   * The total number of bytes must be equal to the sum of all sizes.
   */
  struct PrestoThriftHyperLogLog {
    1: optional list<bool> nulls;
    2: optional list<i32> sizes;
    3: optional binary bytes;
  }

  /**
   * Elements of {@code nulls} array determine if a value for a corresponding row is null.
   * Each element of {@code sizes} array contains the number of elements in the corresponding values array.
   * If row is null then the corresponding element in {@code sizes} is ignored.
   * {@code values} is a bigint block containing array elements one after another for all rows.
   * The total number of elements in bigint block must be equal to the sum of all sizes.
   */
  struct PrestoThriftBigintArray {
    1: optional list<bool> nulls;
    2: optional list<i32> sizes;
    3: optional PrestoThriftBigint values;
  }

  /**
   * A set containing zero or more Ranges of the same type over a continuous space of possible values.
   * Ranges are coalesced into the most compact representation of non-overlapping Ranges.
   * This structure is used with comparable and orderable types like bigint, integer, double, varchar, etc.
   */
  struct PrestoThriftRangeValueSet {
    1: list<PrestoThriftRange> ranges;
  }

  struct PrestoThriftId {
    1: binary id;
  }

  struct PrestoThriftSplitBatch {
    1: list<PrestoThriftSplit> splits;
    2: optional PrestoThriftId nextToken;
  }

  struct PrestoThriftSplit {
    1: PrestoThriftId splitId;
    2: list<PrestoThriftHostAddress> hosts;
  }

  struct PrestoThriftHostAddress {
    1: string host;
    2: i32 port;
  }

  struct PrestoThriftPageResult {
    /**
     * Returns data in a columnar format.
     * Columns in this list must be in the order they were requested by the engine.
     */
    1: list<PrestoThriftBlock> columnBlocks;

    2: i32 rowCount;
    3: optional PrestoThriftId nextToken;
  }

  struct PrestoThriftNullableTableMetadata {
    1: optional PrestoThriftTableMetadata tableMetadata;
  }

  struct PrestoThriftValueSet {
    1: optional PrestoThriftAllOrNoneValueSet allOrNoneValueSet;
    2: optional PrestoThriftEquatableValueSet equatableValueSet;
    3: optional PrestoThriftRangeValueSet rangeValueSet;
  }

  struct PrestoThriftBlock {
    1: optional PrestoThriftInteger integerData;
    2: optional PrestoThriftBigint bigintData;
    3: optional PrestoThriftDouble doubleData;
    4: optional PrestoThriftVarchar varcharData;
    5: optional PrestoThriftBoolean booleanData;
    6: optional PrestoThriftDate dateData;
    7: optional PrestoThriftTimestamp timestampData;
    8: optional PrestoThriftJson jsonData;
    9: optional PrestoThriftHyperLogLog hyperLogLogData;
    10: optional PrestoThriftBigintArray bigintArrayData;
  }

  /**
   * LOWER UNBOUNDED is specified with an empty value and an ABOVE bound
   * UPPER UNBOUNDED is specified with an empty value and a BELOW bound
   */
  struct PrestoThriftMarker {
    1: optional PrestoThriftBlock value;
    2: PrestoThriftBound bound;
  }

  struct PrestoThriftNullableToken {
    1: optional PrestoThriftId token;
  }

  struct PrestoThriftDomain {
    1: PrestoThriftValueSet valueSet;
    2: bool nullAllowed;
  }

  struct PrestoThriftRange {
    1: PrestoThriftMarker low;
    2: PrestoThriftMarker high;
  }

  /**
   * Presto Thrift service definition.
   * This thrift service needs to be implemented in order to be used with Thrift Connector.
   */
  service PrestoThriftService {
    /**
     * Returns available schema names.
     */
    list<string> prestoListSchemaNames()
      throws (1: PrestoThriftServiceException ex1);

    /**
     * Returns tables for the given schema name.
     *
     * @param schemaNameOrNull a structure containing schema name or {@literal null}
     * @return a list of table names with corresponding schemas. If schema name is null then returns
     * a list of tables for all schemas. Returns an empty list if a schema does not exist
     */
    list<PrestoThriftSchemaTableName> prestoListTables(
        1: PrestoThriftNullableSchemaName schemaNameOrNull)
      throws (1: PrestoThriftServiceException ex1);

    /**
     * Returns metadata for a given table.
     *
     * @param schemaTableName schema and table name
     * @return metadata for a given table, or a {@literal null} value inside if it does not exist
     */
    PrestoThriftNullableTableMetadata prestoGetTableMetadata(
        1: PrestoThriftSchemaTableName schemaTableName)
      throws (1: PrestoThriftServiceException ex1);

    /**
     * Returns a batch of splits.
     *
     * @param schemaTableName schema and table name
     * @param desiredColumns a superset of columns to return; empty set means "no columns", {@literal null} set means "all columns"
     * @param outputConstraint constraint on the returned data
     * @param maxSplitCount maximum number of splits to return
     * @param nextToken token from a previous split batch or {@literal null} if it is the first call
     * @return a batch of splits
     */
    PrestoThriftSplitBatch prestoGetSplits(
        1: PrestoThriftSchemaTableName schemaTableName,
        2: PrestoThriftNullableColumnSet desiredColumns,
        3: PrestoThriftTupleDomain outputConstraint,
        4: i32 maxSplitCount,
        5: PrestoThriftNullableToken nextToken)
      throws (1: PrestoThriftServiceException ex1);

    /**
     * Returns a batch of index splits for the given batch of keys.
     * This method is called if index join strategy is chosen for a query.
     *
     * @param schemaTableName schema and table name
     * @param indexColumnNames specifies columns and their order for keys
     * @param outputColumnNames a list of column names to return
     * @param keys keys for which records need to be returned
     * @param outputConstraint constraint on the returned data
     * @param maxSplitCount maximum number of splits to return
     * @param nextToken token from a previous split batch or {@literal null} if it is the first call
     * @return a batch of splits
     */
    PrestoThriftSplitBatch prestoGetIndexSplits(
        1: PrestoThriftSchemaTableName schemaTableName,
        2: list<string> indexColumnNames,
        3: list<string> outputColumnNames,
        4: PrestoThriftPageResult keys,
        5: PrestoThriftTupleDomain outputConstraint,
        6: i32 maxSplitCount,
        7: PrestoThriftNullableToken nextToken)
      throws (1: PrestoThriftServiceException ex1);

    /**
     * Returns a batch of rows for the given split.
     *
     * @param splitId split id as returned in split batch
     * @param columns a list of column names to return
     * @param maxBytes maximum size of returned data in bytes
     * @param nextToken token from a previous batch or {@literal null} if it is the first call
     * @return a batch of table data
     */
    PrestoThriftPageResult prestoGetRows(
        1: PrestoThriftId splitId,
        2: list<string> columns,
        3: i64 maxBytes,
        4: PrestoThriftNullableToken nextToken)
      throws (1: PrestoThriftServiceException ex1);
  }
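
The columnar layout described in the PrestoThriftVarchar comment can be illustrated with a short Python sketch. The function names encode_varchar and decode_varchar are hypothetical, and plain Python lists and bytes stand in for the generated Thrift struct fields nulls, sizes and bytes. Writing a size of 0 for null rows is one possible choice; the IDL only says that the size of a null row is ignored and that the sizes must sum to the total number of bytes.

```python
def encode_varchar(values):
    """Encode a column of str-or-None values as (nulls, sizes, data)."""
    nulls, sizes, chunks = [], [], []
    for value in values:
        if value is None:
            nulls.append(True)
            sizes.append(0)  # ignored by the reader; 0 keeps the sum consistent
        else:
            raw = value.encode("utf-8")
            nulls.append(False)
            sizes.append(len(raw))  # length in bytes, not in characters
            chunks.append(raw)
    # Values for all rows are written one after another.
    return nulls, sizes, b"".join(chunks)


def decode_varchar(nulls, sizes, data):
    """Rebuild the column from the three fields."""
    values, offset = [], 0
    for is_null, size in zip(nulls, sizes):
        if is_null:
            values.append(None)
            continue
        values.append(data[offset:offset + size].decode("utf-8"))
        offset += size
    return values


nulls, sizes, data = encode_varchar(["presto", None, "é"])
print(nulls)  # → [False, True, False]
print(sizes)  # → [6, 0, 2]
print(decode_varchar(nulls, sizes, data))  # → ['presto', None, 'é']
```

The same nulls/sizes/bytes pattern applies to PrestoThriftJson and PrestoThriftHyperLogLog; only the interpretation of the bytes differs.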

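The nextToken fields in prestoGetSplits and prestoGetRows imply a paging loop on the caller's side. The Python sketch below shows that loop against an in-memory stand-in. FakeThriftServer, read_split, and the dictionary result are hypothetical simplifications: a real client would use generated Thrift classes and receive columnBlocks rather than a row list, and the sketch assumes that an absent nextToken marks the last batch. The token carries the row offset itself, which is one way to keep servers stateless, since any server instance can resume from it.

```python
class FakeThriftServer:
    """In-memory stand-in for a PrestoThriftService implementation."""

    def __init__(self, rows, page_size):
        self.rows = rows
        self.page_size = page_size

    def prestoGetRows(self, split_id, columns, max_bytes, next_token):
        # The token is opaque to the caller; here it encodes a row offset.
        # For brevity this stand-in ignores split_id, columns and max_bytes.
        start = 0 if next_token is None else int(next_token.decode("ascii"))
        end = min(start + self.page_size, len(self.rows))
        page = self.rows[start:end]
        token = str(end).encode("ascii") if end < len(self.rows) else None
        return {"rows": page, "rowCount": len(page), "nextToken": token}


def read_split(server, split_id, columns, max_bytes):
    """Fetch every batch of a split, following nextToken until it is absent."""
    rows, token = [], None
    while True:
        result = server.prestoGetRows(split_id, columns, max_bytes, token)
        rows.extend(result["rows"])
        token = result["nextToken"]
        if token is None:
            return rows


server = FakeThriftServer(rows=list(range(7)), page_size=3)
print(read_split(server, b"split-0", ["id"], 16 * 1024 * 1024))
# → [0, 1, 2, 3, 4, 5, 6]
```
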
Source: https://prestodb.io/docs/current/connector/thrift.html