width_bucket

Description

构造等宽直方图,其中直方图范围被划分为相同大小的区间,并在计算后返回表达式的值所在的桶号。该函数返回一个整数值或空值(如果任何输入为空值则返回空值)。

Syntax

  1. width_bucket(expr, min_value, max_value, num_buckets)

Arguments

expr - 创建直方图的表达式。此表达式必须计算为数值或可隐式转换为数值的值。

此值的范围必须为 -(2^53 - 1)2^53 - 1 (含).

min_valuemax_value - 表达式可接受范围的最低值点和最高值点。这两个参数必须为数值并且不相等。

最低值点和最高值点的范围必须为 -(2^53 - 1) to 2^53 - 1 (含)). 此外,最高值点与最低值点的差必须小于 2^53 (例如: abs(max_value - min_value) < 2^53).

num_buckets - 分桶的数量,必须是正整数值。将表达式中的一个值分配给每个存储桶,然后该函数返回相应的存储桶编号。

Returned value

返回表达式值所在的桶号。

当表达式超出范围时,函数返回规则如下:

如果表达式的值小于min_value返回0.

如果表达式的值大于或等于max_value返回num_buckets + 1.

如果任意参数为null返回null.

example

  1. DROP TABLE IF EXISTS width_bucket_test;
  2. CREATE TABLE IF NOT EXISTS width_bucket_test (
  3. `k1` int NULL COMMENT "",
  4. `v1` date NULL COMMENT "",
  5. `v2` double NULL COMMENT "",
  6. `v3` bigint NULL COMMENT ""
  7. ) ENGINE=OLAP
  8. DUPLICATE KEY(`k1`)
  9. DISTRIBUTED BY HASH(`k1`) BUCKETS 1
  10. PROPERTIES (
  11. "replication_allocation" = "tag.location.default: 1",
  12. "storage_format" = "V2"
  13. );
  14. INSERT INTO width_bucket_test VALUES (1, "2022-11-18", 290000.00, 290000),
  15. (2, "2023-11-18", 320000.00, 320000),
  16. (3, "2024-11-18", 399999.99, 399999),
  17. (4, "2025-11-18", 400000.00, 400000),
  18. (5, "2026-11-18", 470000.00, 470000),
  19. (6, "2027-11-18", 510000.00, 510000),
  20. (7, "2028-11-18", 610000.00, 610000),
  21. (8, null, null, null);
  22. SELECT * FROM width_bucket_test ORDER BY k1;
  23. +------+------------+-----------+--------+
  24. | k1 | v1 | v2 | v3 |
  25. +------+------------+-----------+--------+
  26. | 1 | 2022-11-18 | 290000 | 290000 |
  27. | 2 | 2023-11-18 | 320000 | 320000 |
  28. | 3 | 2024-11-18 | 399999.99 | 399999 |
  29. | 4 | 2025-11-18 | 400000 | 400000 |
  30. | 5 | 2026-11-18 | 470000 | 470000 |
  31. | 6 | 2027-11-18 | 510000 | 510000 |
  32. | 7 | 2028-11-18 | 610000 | 610000 |
  33. | 8 | NULL | NULL | NULL |
  34. +------+------------+-----------+--------+
  35. SELECT k1, v1, v2, v3, width_bucket(v1, date('2023-11-18'), date('2027-11-18'), 4) AS w FROM width_bucket_test ORDER BY k1;
  36. +------+------------+-----------+--------+------+
  37. | k1 | v1 | v2 | v3 | w |
  38. +------+------------+-----------+--------+------+
  39. | 1 | 2022-11-18 | 290000 | 290000 | 0 |
  40. | 2 | 2023-11-18 | 320000 | 320000 | 1 |
  41. | 3 | 2024-11-18 | 399999.99 | 399999 | 2 |
  42. | 4 | 2025-11-18 | 400000 | 400000 | 3 |
  43. | 5 | 2026-11-18 | 470000 | 470000 | 4 |
  44. | 6 | 2027-11-18 | 510000 | 510000 | 5 |
  45. | 7 | 2028-11-18 | 610000 | 610000 | 5 |
  46. | 8 | NULL | NULL | NULL | NULL |
  47. +------+------------+-----------+--------+------+
  48. SELECT k1, v1, v2, v3, width_bucket(v2, 200000, 600000, 4) AS w FROM width_bucket_test ORDER BY k1;
  49. +------+------------+-----------+--------+------+
  50. | k1 | v1 | v2 | v3 | w |
  51. +------+------------+-----------+--------+------+
  52. | 1 | 2022-11-18 | 290000 | 290000 | 1 |
  53. | 2 | 2023-11-18 | 320000 | 320000 | 2 |
  54. | 3 | 2024-11-18 | 399999.99 | 399999 | 2 |
  55. | 4 | 2025-11-18 | 400000 | 400000 | 3 |
  56. | 5 | 2026-11-18 | 470000 | 470000 | 3 |
  57. | 6 | 2027-11-18 | 510000 | 510000 | 4 |
  58. | 7 | 2028-11-18 | 610000 | 610000 | 5 |
  59. | 8 | NULL | NULL | NULL | NULL |
  60. +------+------------+-----------+--------+------+
  61. SELECT k1, v1, v2, v3, width_bucket(v3, 200000, 600000, 4) AS w FROM width_bucket_test ORDER BY k1;
  62. +------+------------+-----------+--------+------+
  63. | k1 | v1 | v2 | v3 | w |
  64. +------+------------+-----------+--------+------+
  65. | 1 | 2022-11-18 | 290000 | 290000 | 1 |
  66. | 2 | 2023-11-18 | 320000 | 320000 | 2 |
  67. | 3 | 2024-11-18 | 399999.99 | 399999 | 2 |
  68. | 4 | 2025-11-18 | 400000 | 400000 | 3 |
  69. | 5 | 2026-11-18 | 470000 | 470000 | 3 |
  70. | 6 | 2027-11-18 | 510000 | 510000 | 4 |
  71. | 7 | 2028-11-18 | 610000 | 610000 | 5 |
  72. | 8 | NULL | NULL | NULL | NULL |
  73. +------+------------+-----------+--------+------+

keywords

WIDTH_BUCKET