Star Schema Benchmark

Star Schema Benchmark(SSB) is a lightweight performance test set in the data warehouse scenario. SSB provides a simplified star schema data based on TPC-H, which is mainly used to test the performance of multi-table JOIN query under star schema. In addition, the industry usually flattens SSB into a wide table model (Referred as: SSB flat) to test the performance of the query engine, refer to Clickhouse.

This document mainly introduces the performance of Doris on the SSB 100G test set.

Note 1: The standard test set including SSB usually has a large gap with the actual business scenario, and some tests will perform parameter tuning for the test set. Therefore, the test results of the standard test set can only reflect the performance of the database in a specific scenario. It is recommended that users use actual business data for further testing.

Note 2: The operations involved in this document are all performed in the Ubuntu Server 20.04 environment, and CentOS 7 as well.

Note 3: Doris starting from version 1.2.2, the page cache is turned off by default to reduce memory usage, which has a certain impact on performance. For performance testing, enable the page cache by adding disable_storage_page_cache=false to be.conf.

With 13 queries on the SSB standard test data set, we conducted a comparison test based on Apache Doris 1.2.0-rc01, Apache Doris 1.1.3 and Apache Doris 0.15.0 RC04 versions.

On the SSB flat wide table, the overall performance of Apache Doris 1.2.0-rc01 has been improved by nearly 4 times compared with Apache Doris 1.1.3, and nearly 10 times compared with Apache Doris 0.15.0 RC04.

On the SQL test with standard SSB, the overall performance of Apache Doris 1.2.0-rc01 has been improved by nearly 2 times compared with Apache Doris 1.1.3, and nearly 31 times compared with Apache Doris 0.15.0 RC04.

1. Hardware Environment

Number of machines4 Tencent Cloud Hosts (1 FE, 3 BEs)
CPUAMD EPYC™ Milan (2.55GHz/3.5GHz) 16 Cores
Memory64G
Network Bandwidth7Gbps
DiskHigh-performance Cloud Disk

2. Software Environment

  • Doris deployed 3BEs and 1FE;
  • Kernel version: Linux version 5.4.0-96-generic (buildd@lgw01-amd64-051)
  • OS version: Ubuntu Server 20.04 LTS 64-bit
  • Doris software versions: Apache Doris 1.2.0-rc01, Apache Doris 1.1.3 and Apache Doris 0.15.0 RC04
  • JDK: openjdk version “11.0.14” 2022-01-18

3. Test Data Volume

SSB Table NameRowsAnnotation
lineorder600,037,902Commodity Order Details
customer3,000,000Customer Information
part1,400,000Parts Information
supplier200,000Supplier Information
dates2,556Date
lineorder_flat600,037,902Wide Table after Data Flattening

4. Test Results

We use Apache Doris 1.2.0-rc01, Apache Doris 1.1.3 and Apache Doris 0.15.0 RC04 for comparative testing. The test results are as follows:

QueryApache Doris 1.2.0-rc01(ms)Apache Doris 1.1.3 (ms)Doris 0.15.0 RC04 (ms)
Q1.12090250
Q1.2101030
Q1.33070120
Q2.190360900
Q2.2903401,020
Q2.360260770
Q3.11605501,710
Q3.280290670
Q3.390240550
Q3.4202030
Q4.11404801,250
Q4.250240400
Q4.330200330
Total8803,1508,030

ssb_v11_v015_compare

Interpretation of Results

  • The data set corresponding to the test results is scale 100, about 600 million.
  • The test environment is configured as the user’s common configuration, with 4 cloud servers, 16-core 64G SSD, and 1 FE, 3 BEs deployment.
  • We select the user’s common configuration test to reduce the cost of user selection and evaluation, but the entire test process will not consume so many hardware resources.

5. Standard SSB Test Results

Here we use Apache Doris 1.2.0-rc01, Apache Doris 1.1.3 and Apache Doris 0.15.0 RC04 for comparative testing. In the test, we use Query Time(ms) as the main performance indicator. The test results are as follows:

QueryApache Doris 1.2.0-rc01 (ms)Apache Doris 1.1.3 (ms)Doris 0.15.0 RC04 (ms)
Q1.14018350
Q1.23010080
Q1.3207080
Q2.135094020,680
Q2.232075018,250
Q2.330072014,760
Q3.16502,15022,190
Q3.22605108,360
Q3.32204506,200
Q3.46070160
Q4.18401,48024,320
Q4.24605606,310
Q4.361066010,170
Total4,1608,478131,910

ssb_12_11_015

Interpretation of Results

  • The data set corresponding to the test results is scale 100, about 600 million.
  • The test environment is configured as the user’s common configuration, with 4 cloud servers, 16-core 64G SSD, and 1 FE 3 BEs deployment.
  • We select the user’s common configuration test to reduce the cost of user selection and evaluation, but the entire test process will not consume so many hardware resources.

6. Environment Preparation

Please first refer to the [official documentation](. /install/install-deploy.md) to install and deploy Apache Doris first to obtain a Doris cluster which is working well(including at least 1 FE 1 BE, 1 FE 3 BEs is recommended).

The scripts mentioned in the following documents are stored in the Apache Doris codebase: ssb-tools

7. Data Preparation

7.1 Download and Install the SSB Data Generation Tool.

Execute the following script to download and compile the ssb-dbgen tool.

  1. sh build-ssb-dbgen.sh

After successful installation, the dbgen binary will be generated under the ssb-dbgen/ directory.

7.2 Generate SSB Test Set

Execute the following script to generate the SSB dataset:

  1. sh gen-ssb-data.sh -s 100 -c 100

Note 1: Check the script help via sh gen-ssb-data.sh -h.

Note 2: The data will be generated under the ssb-data/ directory with the suffix .tbl. The total file size is about 60GB and may need a few minutes to an hour to generate.

Note 3: -s 100 indicates that the test set size factor is 100, -c 100 indicates that 100 concurrent threads generate the data of the lineorder table. The -c parameter also determines the number of files in the final lineorder table. The larger the parameter, the larger the number of files and the smaller each file.

With the -s 100 parameter, the resulting dataset size is:

TableRowsSizeFile Number
lineorder600,037,90260GB100
customer3,000,000277M1
part1,400,000116M1
supplier200,00017M1
dates2,556228K1

7.3 Create Table

7.3.1 Prepare the doris-cluster.conf File.

Before import the script, you need to write the FE’s ip port and other information in the doris-cluster.conf file.

The file is located under ${DORIS_HOME}/tools/ssb-tools/conf/.

The content of the file includes FE’s ip, HTTP port, user name, password and the DB name of the data to be imported:

  1. export FE_HOST="xxx"
  2. export FE_HTTP_PORT="8030"
  3. export FE_QUERY_PORT="9030"
  4. export USER="root"
  5. export PASSWORD='xxx'
  6. export DB="ssb"

7.3.2 Execute the Following Script to Generate and Create the SSB Table:

  1. sh create-ssb-tables.sh

Or copy the table creation statements in create-ssb-tables.sql and create-ssb-flat-table.sql and then execute them in the MySQL client.

The following is the lineorder_flat table build statement. Create the lineorder_flat table in the above create-ssb-flat-table.sh script, and perform the default number of buckets (48 buckets). You can delete this table and adjust the number of buckets according to your cluster scale node configuration, so as to obtain a better test result.

  1. CREATE TABLE `lineorder_flat` (
  2. `LO_ORDERDATE` date NOT NULL COMMENT "",
  3. `LO_ORDERKEY` int(11) NOT NULL COMMENT "",
  4. `LO_LINENUMBER` tinyint(4) NOT NULL COMMENT "",
  5. `LO_CUSTKEY` int(11) NOT NULL COMMENT "",
  6. `LO_PARTKEY` int(11) NOT NULL COMMENT "",
  7. `LO_SUPPKEY` int(11) NOT NULL COMMENT "",
  8. `LO_ORDERPRIORITY` varchar(100) NOT NULL COMMENT "",
  9. `LO_SHIPPRIORITY` tinyint(4) NOT NULL COMMENT "",
  10. `LO_QUANTITY` tinyint(4) NOT NULL COMMENT "",
  11. `LO_EXTENDEDPRICE` int(11) NOT NULL COMMENT "",
  12. `LO_ORDTOTALPRICE` int(11) NOT NULL COMMENT "",
  13. `LO_DISCOUNT` tinyint(4) NOT NULL COMMENT "",
  14. `LO_REVENUE` int(11) NOT NULL COMMENT "",
  15. `LO_SUPPLYCOST` int(11) NOT NULL COMMENT "",
  16. `LO_TAX` tinyint(4) NOT NULL COMMENT "",
  17. `LO_COMMITDATE` date NOT NULL COMMENT "",
  18. `LO_SHIPMODE` varchar(100) NOT NULL COMMENT "",
  19. `C_NAME` varchar(100) NOT NULL COMMENT "",
  20. `C_ADDRESS` varchar(100) NOT NULL COMMENT "",
  21. `C_CITY` varchar(100) NOT NULL COMMENT "",
  22. `C_NATION` varchar(100) NOT NULL COMMENT "",
  23. `C_REGION` varchar(100) NOT NULL COMMENT "",
  24. `C_PHONE` varchar(100) NOT NULL COMMENT "",
  25. `C_MKTSEGMENT` varchar(100) NOT NULL COMMENT "",
  26. `S_NAME` varchar(100) NOT NULL COMMENT "",
  27. `S_ADDRESS` varchar(100) NOT NULL COMMENT "",
  28. `S_CITY` varchar(100) NOT NULL COMMENT "",
  29. `S_NATION` varchar(100) NOT NULL COMMENT "",
  30. `S_REGION` varchar(100) NOT NULL COMMENT "",
  31. `S_PHONE` varchar(100) NOT NULL COMMENT "",
  32. `P_NAME` varchar(100) NOT NULL COMMENT "",
  33. `P_MFGR` varchar(100) NOT NULL COMMENT "",
  34. `P_CATEGORY` varchar(100) NOT NULL COMMENT "",
  35. `P_BRAND` varchar(100) NOT NULL COMMENT "",
  36. `P_COLOR` varchar(100) NOT NULL COMMENT "",
  37. `P_TYPE` varchar(100) NOT NULL COMMENT "",
  38. `P_SIZE` tinyint(4) NOT NULL COMMENT "",
  39. `P_CONTAINER` varchar(100) NOT NULL COMMENT ""
  40. ) ENGINE=OLAP
  41. DUPLICATE KEY(`LO_ORDERDATE`, `LO_ORDERKEY`)
  42. COMMENT "OLAP"
  43. PARTITION BY RANGE(`LO_ORDERDATE`)
  44. (PARTITION p1 VALUES [('0000-01-01'), ('1993-01-01')),
  45. PARTITION p2 VALUES [('1993-01-01'), ('1994-01-01')),
  46. PARTITION p3 VALUES [('1994-01-01'), ('1995-01-01')),
  47. PARTITION p4 VALUES [('1995-01-01'), ('1996-01-01')),
  48. PARTITION p5 VALUES [('1996-01-01'), ('1997-01-01')),
  49. PARTITION p6 VALUES [('1997-01-01'), ('1998-01-01')),
  50. PARTITION p7 VALUES [('1998-01-01'), ('1999-01-01')))
  51. DISTRIBUTED BY HASH(`LO_ORDERKEY`) BUCKETS 48
  52. PROPERTIES (
  53. "replication_num" = "1",
  54. "colocate_with" = "groupxx1",
  55. "in_memory" = "false",
  56. "storage_format" = "DEFAULT"
  57. );

7.4 Import data

We use the following command to complete all data import of SSB test set and SSB FLAT wide table data synthesis and then import into the table.

  1. sh bin/load-ssb-data.sh -c 10

-c 5 means start 10 concurrent threads to import (5 by default). In the case of a single BE node, the lineorder data generated by sh gen-ssb-data.sh -s 100 -c 100 will also generate the data of the ssb-flat table in the end. If more threads are enabled, the import speed can be accelerated. But it will cost extra memory.

Notes.

  1. To get faster import speed, you can add flush_thread_num_per_store=10 in be.conf and then restart BE. This configuration indicates the number of disk writing threads for each data directory, 6 by default. Larger data can improve write data throughput, but may increase IO Util. (Reference value: 1 mechanical disk, with 2 by default, the IO Util during the import process is about 12%. When it is set to 5, the IO Util is about 26%. If it is an SSD disk, it is almost 0%) .

  2. The flat table data is imported by ‘INSERT INTO … SELECT … ‘.

7.5 Checking Imported data

  1. select count(*) from part;
  2. select count(*) from customer;
  3. select count(*) from supplier;
  4. select count(*) from dates;
  5. select count(*) from lineorder;
  6. select count(*) from lineorder_flat;

The amount of data should be consistent with the number of rows of generated data.

TableRowsOrigin SizeCompacted Size(1 Replica)
lineorder_flat600,037,90259.709 GB
lineorder600,037,90260 GB14.514 GB
customer3,000,000277 MB138.247 MB
part1,400,000116 MB12.759 MB
supplier200,00017 MB9.143 MB
dates2,556228 KB34.276 KB

7.6 Query Test

7.6.1 SSB FLAT Test for SQL

  1. --Q1.1
  2. SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
  3. FROM lineorder_flat
  4. WHERE LO_ORDERDATE >= 19930101 AND LO_ORDERDATE <= 19931231 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
  5. --Q1.2
  6. SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
  7. FROM lineorder_flat
  8. WHERE LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19940131 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
  9. --Q1.3
  10. SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
  11. FROM lineorder_flat
  12. WHERE weekofyear(LO_ORDERDATE) = 6 AND LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19941231 AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35;
  13. --Q2.1
  14. SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
  15. FROM lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
  16. GROUP BY YEAR, P_BRAND
  17. ORDER BY YEAR, P_BRAND;
  18. --Q2.2
  19. SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
  20. FROM lineorder_flat
  21. WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228' AND S_REGION = 'ASIA'
  22. GROUP BY YEAR, P_BRAND
  23. ORDER BY YEAR, P_BRAND;
  24. --Q2.3
  25. SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
  26. FROM lineorder_flat
  27. WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE'
  28. GROUP BY YEAR, P_BRAND
  29. ORDER BY YEAR, P_BRAND;
  30. --Q3.1
  31. SELECT C_NATION, S_NATION, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
  32. FROM lineorder_flat
  33. WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
  34. GROUP BY C_NATION, S_NATION, YEAR
  35. ORDER BY YEAR ASC, revenue DESC;
  36. --Q3.2
  37. SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
  38. FROM lineorder_flat
  39. WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
  40. GROUP BY C_CITY, S_CITY, YEAR
  41. ORDER BY YEAR ASC, revenue DESC;
  42. --Q3.3
  43. SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
  44. FROM lineorder_flat
  45. WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
  46. GROUP BY C_CITY, S_CITY, YEAR
  47. ORDER BY YEAR ASC, revenue DESC;
  48. --Q3.4
  49. SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
  50. FROM lineorder_flat
  51. WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19971201 AND LO_ORDERDATE <= 19971231
  52. GROUP BY C_CITY, S_CITY, YEAR
  53. ORDER BY YEAR ASC, revenue DESC;
  54. --Q4.1
  55. SELECT (LO_ORDERDATE DIV 10000) AS YEAR, C_NATION, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
  56. FROM lineorder_flat
  57. WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND P_MFGR IN ('MFGR#1', 'MFGR#2')
  58. GROUP BY YEAR, C_NATION
  59. ORDER BY YEAR ASC, C_NATION ASC;
  60. --Q4.2
  61. SELECT (LO_ORDERDATE DIV 10000) AS YEAR,S_NATION, P_CATEGORY, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
  62. FROM lineorder_flat
  63. WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_MFGR IN ('MFGR#1', 'MFGR#2')
  64. GROUP BY YEAR, S_NATION, P_CATEGORY
  65. ORDER BY YEAR ASC, S_NATION ASC, P_CATEGORY ASC;
  66. --Q4.3
  67. SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_CITY, P_BRAND, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
  68. FROM lineorder_flat
  69. WHERE S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_CATEGORY = 'MFGR#14'
  70. GROUP BY YEAR, S_CITY, P_BRAND
  71. ORDER BY YEAR ASC, S_CITY ASC, P_BRAND ASC;

7.6.2 SSB Standard Test for SQL

  1. --Q1.1
  2. SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE
  3. FROM lineorder, dates
  4. WHERE
  5. lo_orderdate = d_datekey
  6. AND d_year = 1993
  7. AND lo_discount BETWEEN 1 AND 3
  8. AND lo_quantity < 25;
  9. --Q1.2
  10. SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE
  11. FROM lineorder, dates
  12. WHERE
  13. lo_orderdate = d_datekey
  14. AND d_yearmonth = 'Jan1994'
  15. AND lo_discount BETWEEN 4 AND 6
  16. AND lo_quantity BETWEEN 26 AND 35;
  17. --Q1.3
  18. SELECT
  19. SUM(lo_extendedprice * lo_discount) AS REVENUE
  20. FROM lineorder, dates
  21. WHERE
  22. lo_orderdate = d_datekey
  23. AND d_weeknuminyear = 6
  24. AND d_year = 1994
  25. AND lo_discount BETWEEN 5 AND 7
  26. AND lo_quantity BETWEEN 26 AND 35;
  27. --Q2.1
  28. SELECT SUM(lo_revenue), d_year, p_brand
  29. FROM lineorder, dates, part, supplier
  30. WHERE
  31. lo_orderdate = d_datekey
  32. AND lo_partkey = p_partkey
  33. AND lo_suppkey = s_suppkey
  34. AND p_category = 'MFGR#12'
  35. AND s_region = 'AMERICA'
  36. GROUP BY d_year, p_brand
  37. ORDER BY p_brand;
  38. --Q2.2
  39. SELECT SUM(lo_revenue), d_year, p_brand
  40. FROM lineorder, dates, part, supplier
  41. WHERE
  42. lo_orderdate = d_datekey
  43. AND lo_partkey = p_partkey
  44. AND lo_suppkey = s_suppkey
  45. AND p_brand BETWEEN 'MFGR#2221' AND 'MFGR#2228'
  46. AND s_region = 'ASIA'
  47. GROUP BY d_year, p_brand
  48. ORDER BY d_year, p_brand;
  49. --Q2.3
  50. SELECT SUM(lo_revenue), d_year, p_brand
  51. FROM lineorder, dates, part, supplier
  52. WHERE
  53. lo_orderdate = d_datekey
  54. AND lo_partkey = p_partkey
  55. AND lo_suppkey = s_suppkey
  56. AND p_brand = 'MFGR#2239'
  57. AND s_region = 'EUROPE'
  58. GROUP BY d_year, p_brand
  59. ORDER BY d_year, p_brand;
  60. --Q3.1
  61. SELECT
  62. c_nation,
  63. s_nation,
  64. d_year,
  65. SUM(lo_revenue) AS REVENUE
  66. FROM customer, lineorder, supplier, dates
  67. WHERE
  68. lo_custkey = c_custkey
  69. AND lo_suppkey = s_suppkey
  70. AND lo_orderdate = d_datekey
  71. AND c_region = 'ASIA'
  72. AND s_region = 'ASIA'
  73. AND d_year >= 1992
  74. AND d_year <= 1997
  75. GROUP BY c_nation, s_nation, d_year
  76. ORDER BY d_year ASC, REVENUE DESC;
  77. --Q3.2
  78. SELECT
  79. c_city,
  80. s_city,
  81. d_year,
  82. SUM(lo_revenue) AS REVENUE
  83. FROM customer, lineorder, supplier, dates
  84. WHERE
  85. lo_custkey = c_custkey
  86. AND lo_suppkey = s_suppkey
  87. AND lo_orderdate = d_datekey
  88. AND c_nation = 'UNITED STATES'
  89. AND s_nation = 'UNITED STATES'
  90. AND d_year >= 1992
  91. AND d_year <= 1997
  92. GROUP BY c_city, s_city, d_year
  93. ORDER BY d_year ASC, REVENUE DESC;
  94. --Q3.3
  95. SELECT
  96. c_city,
  97. s_city,
  98. d_year,
  99. SUM(lo_revenue) AS REVENUE
  100. FROM customer, lineorder, supplier, dates
  101. WHERE
  102. lo_custkey = c_custkey
  103. AND lo_suppkey = s_suppkey
  104. AND lo_orderdate = d_datekey
  105. AND (
  106. c_city = 'UNITED KI1'
  107. OR c_city = 'UNITED KI5'
  108. )
  109. AND (
  110. s_city = 'UNITED KI1'
  111. OR s_city = 'UNITED KI5'
  112. )
  113. AND d_year >= 1992
  114. AND d_year <= 1997
  115. GROUP BY c_city, s_city, d_year
  116. ORDER BY d_year ASC, REVENUE DESC;
  117. --Q3.4
  118. SELECT
  119. c_city,
  120. s_city,
  121. d_year,
  122. SUM(lo_revenue) AS REVENUE
  123. FROM customer, lineorder, supplier, dates
  124. WHERE
  125. lo_custkey = c_custkey
  126. AND lo_suppkey = s_suppkey
  127. AND lo_orderdate = d_datekey
  128. AND (
  129. c_city = 'UNITED KI1'
  130. OR c_city = 'UNITED KI5'
  131. )
  132. AND (
  133. s_city = 'UNITED KI1'
  134. OR s_city = 'UNITED KI5'
  135. )
  136. AND d_yearmonth = 'Dec1997'
  137. GROUP BY c_city, s_city, d_year
  138. ORDER BY d_year ASC, REVENUE DESC;
  139. --Q4.1
  140. SELECT /*+SET_VAR(parallel_fragment_exec_instance_num=4, enable_vectorized_engine=true, batch_size=4096, enable_cost_based_join_reorder=true, enable_projection=true) */
  141. d_year,
  142. c_nation,
  143. SUM(lo_revenue - lo_supplycost) AS PROFIT
  144. FROM dates, customer, supplier, part, lineorder
  145. WHERE
  146. lo_custkey = c_custkey
  147. AND lo_suppkey = s_suppkey
  148. AND lo_partkey = p_partkey
  149. AND lo_orderdate = d_datekey
  150. AND c_region = 'AMERICA'
  151. AND s_region = 'AMERICA'
  152. AND (
  153. p_mfgr = 'MFGR#1'
  154. OR p_mfgr = 'MFGR#2'
  155. )
  156. GROUP BY d_year, c_nation
  157. ORDER BY d_year, c_nation;
  158. --Q4.2
  159. SELECT /*+SET_VAR(parallel_fragment_exec_instance_num=2, enable_vectorized_engine=true, batch_size=4096, enable_cost_based_join_reorder=true, enable_projection=true) */
  160. d_year,
  161. s_nation,
  162. p_category,
  163. SUM(lo_revenue - lo_supplycost) AS PROFIT
  164. FROM dates, customer, supplier, part, lineorder
  165. WHERE
  166. lo_custkey = c_custkey
  167. AND lo_suppkey = s_suppkey
  168. AND lo_partkey = p_partkey
  169. AND lo_orderdate = d_datekey
  170. AND c_region = 'AMERICA'
  171. AND s_region = 'AMERICA'
  172. AND (
  173. d_year = 1997
  174. OR d_year = 1998
  175. )
  176. AND (
  177. p_mfgr = 'MFGR#1'
  178. OR p_mfgr = 'MFGR#2'
  179. )
  180. GROUP BY d_year, s_nation, p_category
  181. ORDER BY d_year, s_nation, p_category;
  182. --Q4.3
  183. SELECT /*+SET_VAR(parallel_fragment_exec_instance_num=2, enable_vectorized_engine=true, batch_size=4096, enable_cost_based_join_reorder=true, enable_projection=true) */
  184. d_year,
  185. s_city,
  186. p_brand,
  187. SUM(lo_revenue - lo_supplycost) AS PROFIT
  188. FROM dates, customer, supplier, part, lineorder
  189. WHERE
  190. lo_custkey = c_custkey
  191. AND lo_suppkey = s_suppkey
  192. AND lo_partkey = p_partkey
  193. AND lo_orderdate = d_datekey
  194. AND s_nation = 'UNITED STATES'
  195. AND (
  196. d_year = 1997
  197. OR d_year = 1998
  198. )
  199. AND p_category = 'MFGR#14'
  200. GROUP BY d_year, s_city, p_brand
  201. ORDER BY d_year, s_city, p_brand;