Tutorial: First visualization in Vega-Lite

In this tutorial, you will learn how to edit Vega-Lite in Kibana to build a stacked area chart from an Elasticsearch search query, using one of the Kibana sample data sets. It covers only the basics and gives you a starting point for a more comprehensive introduction to Vega-Lite.


Before beginning this tutorial, install the eCommerce sample data set.

When you first open the Vega editor in Kibana, you will see a pre-populated line chart which shows the total number of documents across all your indices within the time range.


The text editor contains a Vega-Lite spec written in HJSON, which is similar to JSON but optimized for human editing. HJSON supports:

  • Comments using // or /* syntax
  • Object keys without quotes
  • String values without quotes
  • Optional commas
  • Double or single quotes
  • Multiline strings

Small steps

Always work on Vega in the smallest steps possible, and save your work frequently. Even small changes can cause unexpected results. Click the “Save” button now.

The first step is to change the index to one of the sample data sets. Change

  index: _all

to:

  index: kibana_sample_data_ecommerce

Click “Update”. The result is probably not what you expect. You should see a flat line with 0 results.

You’ve only changed the index, so the difference must be that the query is returning no results. You can try the Vega debugging process, but intuition may be faster for this particular problem.

In this case, the problem is that you are querying the field @timestamp, which does not exist in the kibana_sample_data_ecommerce data. Find and replace @timestamp with order_date. This fixes the problem, leaving you with this spec:


  {
    $schema: https://vega.github.io/schema/vega-lite/v4.json
    title: Event counts from ecommerce
    data: {
      url: {
        %context%: true
        %timefield%: order_date
        index: kibana_sample_data_ecommerce
        body: {
          aggs: {
            time_buckets: {
              date_histogram: {
                field: order_date
                interval: {%autointerval%: true}
                extended_bounds: {
                  min: {%timefilter%: "min"}
                  max: {%timefilter%: "max"}
                }
                min_doc_count: 0
              }
            }
          }
          size: 0
        }
      }
      format: {property: "aggregations.time_buckets.buckets" }
    }
    mark: line
    encoding: {
      x: {
        field: key
        type: temporal
        axis: { title: null }
      }
      y: {
        field: doc_count
        type: quantitative
        axis: { title: "Document count" }
      }
    }
  }
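
To see what format.property does with the search response, here is a minimal JavaScript sketch. The pickProperty helper and the trimmed-down response object are illustrative assumptions, not part of Kibana or Vega. The idea is only that a dotted path selects the array of buckets, and that each bucket then becomes one data row with the key and doc_count fields used in the encoding.

```javascript
// Hypothetical helper: follow a dotted path into a nested object.
function pickProperty(response, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), response);
}

// Trimmed-down search response with illustrative counts.
const response = {
  hits: { total: { value: 90 } },
  aggregations: {
    time_buckets: {
      buckets: [
        { key: 1593475200000, doc_count: 19 },
        { key: 1593561600000, doc_count: 71 },
      ],
    },
  },
};

// Each element of the selected array becomes one row of chart data.
const rows = pickProperty(response, "aggregations.time_buckets.buckets");
console.log(rows.length); // 2
```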

Now, let’s make the visualization more interesting by adding another aggregation to create a stacked area chart. To verify that you have constructed the right query, it is easiest to use the Kibana Dev Tools in a separate tab from the Vega editor. Open the Dev Tools from the Management section of the navigation.

This query is roughly equivalent to the one that is used in the default Vega-Lite spec. Copy it into the Dev Tools:

  POST kibana_sample_data_ecommerce/_search
  {
    "query": {
      "range": {
        "order_date": {
          "gte": "now-7d"
        }
      }
    },
    "aggs": {
      "time_buckets": {
        "date_histogram": {
          "field": "order_date",
          "fixed_interval": "1d",
          "extended_bounds": {
            "min": "now-7d"
          },
          "min_doc_count": 0
        }
      }
    },
    "size": 0
  }
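
As a rough mental model of what the date_histogram aggregation returns, the sketch below buckets timestamps into fixed one-day intervals in plain JavaScript. The dateHistogram function and the sample timestamps are assumptions made for illustration, and the sketch takes both a min and a max bound for simplicity, while the query above only sets min. It shows why min_doc_count: 0 together with extended_bounds produces empty buckets instead of gaps.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical stand-in for a date_histogram with a fixed interval,
// min_doc_count: 0, and extended_bounds { min, max }.
function dateHistogram(timestamps, intervalMs, bounds) {
  const floor = (t) => Math.floor(t / intervalMs) * intervalMs;
  const counts = new Map();
  for (const t of timestamps) {
    const key = floor(t);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Widen the bucket range to the requested bounds, then emit every
  // bucket in between, including the empty ones.
  const keys = [...counts.keys(), floor(bounds.min), floor(bounds.max)];
  const first = Math.min(...keys);
  const last = Math.max(...keys);
  const buckets = [];
  for (let key = first; key <= last; key += intervalMs) {
    buckets.push({ key, doc_count: counts.get(key) || 0 });
  }
  return buckets;
}

const start = Date.UTC(2020, 5, 30); // 2020-06-30T00:00:00Z
const orders = [start + 1000, start + 2000, start + 2 * DAY_MS + 5000];
const buckets = dateHistogram(orders, DAY_MS, {
  min: start,
  max: start + 3 * DAY_MS,
});
console.log(buckets.map((b) => b.doc_count)); // [ 2, 0, 1, 0 ]
```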

There’s not enough data to create a stacked area chart in the original query, so we will add a new terms aggregation:

  POST kibana_sample_data_ecommerce/_search
  {
    "query": {
      "range": {
        "order_date": {
          "gte": "now-7d"
        }
      }
    },
    "aggs": {
      "categories": {
        "terms": { "field": "category.keyword" },
        "aggs": {
          "time_buckets": {
            "date_histogram": {
              "field": "order_date",
              "fixed_interval": "1d",
              "extended_bounds": {
                "min": "now-7d"
              },
              "min_doc_count": 0
            }
          }
        }
      }
    },
    "size": 0
  }

You’ll see that the response format looks different from the previous query:

  {
    "aggregations" : {
      "categories" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [{
          "key" : "Men's Clothing",
          "doc_count" : 1661,
          "time_buckets" : {
            "buckets" : [{
              "key_as_string" : "2020-06-30T00:00:00.000Z",
              "key" : 1593475200000,
              "doc_count" : 19
            }, {
              "key_as_string" : "2020-07-01T00:00:00.000Z",
              "key" : 1593561600000,
              "doc_count" : 71
            }]
          }
        }]
      }
    }
  }

Now that we have data that we’re happy with, it’s time to convert the isolated Elasticsearch query into a query with Kibana integration. The reference for writing Elasticsearch queries in Vega has the full list of special tokens that are used in this query, such as %context%: true. This query has also replaced "fixed_interval": "1d" with interval: {%autointerval%: true}. Copy the final query into your spec:

  data: {
    url: {
      %context%: true
      %timefield%: order_date
      index: kibana_sample_data_ecommerce
      body: {
        aggs: {
          categories: {
            terms: { field: "category.keyword" }
            aggs: {
              time_buckets: {
                date_histogram: {
                  field: order_date
                  interval: {%autointerval%: true}
                  extended_bounds: {
                    min: {%timefilter%: "min"}
                    max: {%timefilter%: "max"}
                  }
                  min_doc_count: 0
                }
              }
            }
          }
        }
        size: 0
      }
    }
    format: {property: "aggregations.categories.buckets" }
  }

If you copy and paste that into your Vega-Lite spec, and click “Update”, you will see a warning saying Infinite extent for field "key": [Infinity, -Infinity]. Let’s use our Vega debugging skills to understand why.

Vega-Lite generates data using the names source_0 and data_0. source_0 contains the results from the Elasticsearch query, and data_0 contains the visually encoded results which are shown in the chart. To debug this problem, you need to compare both.

To look at the source, open the browser dev tools console and type VEGA_DEBUG.view.data('source_0'). You will see:

  [{
    doc_count: 454
    key: "Men's Clothing"
    time_buckets: {buckets: Array(57)}
    Symbol(vega_id): 12822
  }, ...]

To compare to the visually encoded data, open the browser dev tools console and type VEGA_DEBUG.view.data('data_0'). You will see:

  [{
    doc_count: 454
    key: NaN
    time_buckets: {buckets: Array(57)}
    Symbol(vega_id): 13879
  }]

The issue seems to be that the key property is not being converted the right way, which makes sense because the key is now Men's Clothing instead of a timestamp.
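
The link between key: NaN and the Infinite extent warning can be reproduced in a few lines of JavaScript. This is a simplified sketch, not Vega's actual code: toTimestamp and extent are hypothetical helpers that assume a temporal field is converted with date parsing and that invalid values are skipped when the axis range is computed.

```javascript
// Rough stand-in for the temporal conversion: numbers pass through,
// strings are parsed as dates, and unparseable strings become NaN.
function toTimestamp(value) {
  return typeof value === "number" ? value : Date.parse(value);
}

// Extent over valid values only, starting from an "empty" range.
function extent(values) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (Number.isNaN(v)) continue; // invalid values are skipped
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return [min, max];
}

// A category name is not a date, so nothing valid is left to measure.
const categoryExtent = extent(["Men's Clothing"].map(toTimestamp));
console.log(categoryExtent); // [ Infinity, -Infinity ]

// Real bucket timestamps give a usable range.
const timeExtent = extent([1593475200000, 1593561600000].map(toTimestamp));
console.log(timeExtent); // [ 1593475200000, 1593561600000 ]
```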

To fix this, try updating the encoding of your Vega-Lite spec to:

  encoding: {
    x: {
      field: time_buckets.buckets.key
      type: temporal
      axis: { title: null }
    }
    y: {
      field: time_buckets.buckets.doc_count
      type: quantitative
      axis: { title: "Document count" }
    }
  }

This will show more errors, and you can inspect VEGA_DEBUG.view.data('data_0') to understand why. This now shows:

  [{
    doc_count: 454
    key: "Men's Clothing"
    time_buckets: {buckets: Array(57)}
    time_buckets.buckets.doc_count: undefined
    time_buckets.buckets.key: null
    Symbol(vega_id): 14094
  }]

It looks like the problem is that the time_buckets inner array is not being extracted by Vega. The solution is to use a Vega-Lite flatten transformation, available in Kibana 7.9 and later. If you are using an older version of Kibana, the flatten transformation is available in Vega but not Vega-Lite.
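
The undefined values are what you would expect from a nested property lookup that runs into an array. The JavaScript sketch below uses a hypothetical getPath helper and a shortened row to show the effect. It is an approximation of how a dotted field name is resolved, not the Vega implementation.

```javascript
// Hypothetical dotted-path accessor, similar in spirit to a field lookup.
function getPath(row, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), row);
}

// Shortened version of one row from source_0, with illustrative counts.
const row = {
  key: "Men's Clothing",
  doc_count: 90,
  time_buckets: {
    buckets: [
      { key: 1593475200000, doc_count: 19 },
      { key: 1593561600000, doc_count: 71 },
    ],
  },
};

// The path stops at the array, which has no doc_count property of its own.
console.log(getPath(row, "time_buckets.buckets.doc_count")); // undefined

// Only an explicit index reaches a single element.
console.log(getPath(row, "time_buckets.buckets.0.doc_count")); // 19
```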

Add this section between the data and encoding sections:

  transform: [{
    flatten: ["time_buckets.buckets"]
  }]

This does not yet produce the results you expect. Inspect the transformed data by typing VEGA_DEBUG.view.data('data_0') into the console again:

  [{
    doc_count: 453
    key: "Men's Clothing"
    time_buckets.buckets.doc_count: undefined
    time_buckets: {buckets: Array(57)}
    time_buckets.buckets: {
      key_as_string: "2020-06-30T15:00:00.000Z",
      key: 1593529200000,
      doc_count: 2
    }
    time_buckets.buckets.key: null
    Symbol(vega_id): 21564
  }]

The debug view shows undefined values where you would expect to see numbers, and the cause is that there are duplicate names which are confusing Vega-Lite. This can be fixed by making this change to the transform and encoding blocks:

  transform: [{
    flatten: ["time_buckets.buckets"]
    as: ["buckets"]
  }]
  mark: area
  encoding: {
    x: {
      field: buckets.key
      type: temporal
      axis: { title: null }
    }
    y: {
      field: buckets.doc_count
      type: quantitative
      axis: { title: "Document count" }
    }
    color: {
      field: key
      type: nominal
    }
  }
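
The effect of flatten and of the as rename can be approximated in plain JavaScript. In this sketch, flattenTimeBuckets is a hypothetical stand-in for the transform and the source rows are shortened, so treat it as an illustration of the naming clash rather than as Vega-Lite's implementation. Without the rename, the flattened element is stored under a single property whose name contains dots, while a nested lookup still reaches the original array first.

```javascript
// Hypothetical stand-in for flatten: one output row per inner bucket,
// with the bucket stored under the property name given by `as`.
function flattenTimeBuckets(rows, as) {
  return rows.flatMap((row) =>
    row.time_buckets.buckets.map((bucket) => ({ ...row, [as]: bucket }))
  );
}

const source = [
  {
    key: "Men's Clothing",
    doc_count: 90,
    time_buckets: {
      buckets: [
        { key: 1593475200000, doc_count: 19 },
        { key: 1593561600000, doc_count: 71 },
      ],
    },
  },
];

// Default name: the element lives under the literal key "time_buckets.buckets",
// but walking time_buckets -> buckets -> key still hits the original array.
const clashing = flattenTimeBuckets(source, "time_buckets.buckets");
console.log(clashing[0]["time_buckets.buckets"].key); // 1593475200000
console.log(clashing[0].time_buckets.buckets.key); // undefined

// Renamed: buckets.key and buckets.doc_count resolve without ambiguity.
const renamed = flattenTimeBuckets(source, "buckets");
const chartRows = renamed.map((r) => [r.key, r.buckets.key, r.buckets.doc_count]);
console.log(chartRows);
```

Each entry in chartRows holds the category, the bucket timestamp, and the count, which correspond to the color, x, and y encodings.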

At this point, you have a stacked area chart that shows the top categories, but the chart is still missing some common features that we expect from a Kibana visualization. Let’s add hover states and tooltips next.

Hover states are handled differently in Vega-Lite and Vega. In Vega-Lite this is done using a concept called selection, which has many permutations that are not covered in this tutorial. We will be adding a simple tooltip and hover state.

Because Kibana has enabled the Vega tooltip plugin, tooltips can be defined in several ways:

  • Automatic tooltip based on the data, via { content: "data" }
  • Array of fields, like [{ field: "key", type: "nominal" }]
  • Defining a custom JavaScript object using the calculate transform

For the simple tooltip, add this to your encoding:

  encoding: {
    tooltip: [{
      field: buckets.key
      type: temporal
      title: "Date"
    }, {
      field: key
      type: nominal
      title: "Category"
    }, {
      field: buckets.doc_count
      type: quantitative
      title: "Count"
    }]
  }

As you hover over the area series in your chart, a multi-line tooltip will appear, but it won’t indicate the nearest point that it’s pointing to. To indicate the nearest point, we need to add a second layer.

The first step is to remove the mark: area from your visualization. Once you’ve removed the previous mark, add a layer block at the end of the Vega-Lite spec:

  layer: [{
    mark: area
  }, {
    mark: point
  }]

You’ll see that the points are not appearing to line up with the area chart, and the reason is that the points are not being stacked. Change your Y encoding to this:

  y: {
    field: buckets.doc_count
    type: quantitative
    axis: { title: "Document count" }
    stack: true
  }
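
To make the effect of stack: true concrete, the following JavaScript sketch computes stacked positions by hand. The stack function, the field names x and count, and the second category with its counts are assumptions for illustration only. An unstacked point would be drawn at its raw count, while a stacked point is drawn at y1, the top edge of its area segment.

```javascript
// Hypothetical stand-in for stacking: for each x value, pile the series on
// top of each other and record where each one starts (y0) and ends (y1).
function stack(rows) {
  const totals = new Map(); // running total per x value
  return rows.map((row) => {
    const y0 = totals.get(row.x) || 0;
    const y1 = y0 + row.count;
    totals.set(row.x, y1);
    return { ...row, y0, y1 };
  });
}

const rows = [
  { category: "Men's Clothing", x: 1593475200000, count: 19 },
  { category: "Men's Clothing", x: 1593561600000, count: 71 },
  // "Category B" and its counts are made up for this example.
  { category: "Category B", x: 1593475200000, count: 10 },
  { category: "Category B", x: 1593561600000, count: 5 },
];

const stacked = stack(rows);
console.log(stacked.map((r) => [r.category, r.y0, r.y1]));
// [ [ "Men's Clothing", 0, 19 ], [ "Men's Clothing", 0, 71 ],
//   [ 'Category B', 19, 29 ], [ 'Category B', 71, 76 ] ]
```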

Now, we will add a selection block inside the point mark:

  layer: [{
    mark: area
  }, {
    mark: point
    selection: {
      pointhover: {
        type: single
        on: mouseover
        clear: mouseout
        empty: none
        fields: ["buckets.key", "key"]
        nearest: true
      }
    }
    encoding: {
      size: {
        condition: {
          selection: pointhover
          value: 100
        }
        value: 5
      }
      fill: {
        condition: {
          selection: pointhover
          value: white
        }
      }
    }
  }]
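
The behavior of this selection can be sketched in JavaScript as a nearest-point search followed by a conditional size. This is a simplified model under stated assumptions: nearestPoint uses a brute-force distance check, the pixel positions px and py are invented, and Vega-Lite's real implementation differs internally. The size values 100 and 5 and the two matching fields mirror the spec above.

```javascript
// Brute-force stand-in for `nearest: true`: pick the point closest to the
// mouse position in pixel space.
function nearestPoint(points, mouse) {
  let best = null;
  let bestDistance = Infinity;
  for (const p of points) {
    const d = Math.hypot(p.px - mouse.px, p.py - mouse.py);
    if (d < bestDistance) {
      best = p;
      bestDistance = d;
    }
  }
  return best;
}

// Mirrors the size encoding: 100 when the point matches the selection on
// both fields (date and category), otherwise the default of 5.
function pointSize(point, hovered) {
  const selected =
    hovered !== null &&
    point.date === hovered.date &&
    point.category === hovered.category;
  return selected ? 100 : 5;
}

// Invented pixel positions for three points of one series.
const points = [
  { category: "Men's Clothing", date: 1593475200000, px: 0, py: 50 },
  { category: "Men's Clothing", date: 1593561600000, px: 100, py: 30 },
  { category: "Men's Clothing", date: 1593648000000, px: 200, py: 80 },
];

const hovered = nearestPoint(points, { px: 105, py: 40 });
const sizes = points.map((p) => pointSize(p, hovered));
console.log(sizes); // [ 5, 100, 5 ]

// With `empty: none`, nothing is enlarged while the mouse is outside the chart.
const idleSizes = points.map((p) => pointSize(p, null));
console.log(idleSizes); // [ 5, 5, 5 ]
```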

Now that you’ve enabled a selection, move the mouse around the visualization and watch the points respond to the nearest position.

The final result of this tutorial is this spec:


  {
    $schema: https://vega.github.io/schema/vega-lite/v4.json
    title: Event counts from ecommerce
    data: {
      url: {
        %context%: true
        %timefield%: order_date
        index: kibana_sample_data_ecommerce
        body: {
          aggs: {
            categories: {
              terms: { field: "category.keyword" }
              aggs: {
                time_buckets: {
                  date_histogram: {
                    field: order_date
                    interval: {%autointerval%: true}
                    extended_bounds: {
                      min: {%timefilter%: "min"}
                      max: {%timefilter%: "max"}
                    }
                    min_doc_count: 0
                  }
                }
              }
            }
          }
          size: 0
        }
      }
      format: {property: "aggregations.categories.buckets" }
    }
    transform: [{
      flatten: ["time_buckets.buckets"]
      as: ["buckets"]
    }]
    encoding: {
      x: {
        field: buckets.key
        type: temporal
        axis: { title: null }
      }
      y: {
        field: buckets.doc_count
        type: quantitative
        axis: { title: "Document count" }
        stack: true
      }
      color: {
        field: key
        type: nominal
        title: "Category"
      }
      tooltip: [{
        field: buckets.key
        type: temporal
        title: "Date"
      }, {
        field: key
        type: nominal
        title: "Category"
      }, {
        field: buckets.doc_count
        type: quantitative
        title: "Count"
      }]
    }
    layer: [{
      mark: area
    }, {
      mark: point
      selection: {
        pointhover: {
          type: single
          on: mouseover
          clear: mouseout
          empty: none
          fields: ["buckets.key", "key"]
          nearest: true
        }
      }
      encoding: {
        size: {
          condition: {
            selection: pointhover
            value: 100
          }
          value: 5
        }
        fill: {
          condition: {
            selection: pointhover
            value: white
          }
        }
      }
    }]
  }
