position_increment_gap

Analyzed text fields take term positions into account in order to support proximity or phrase queries. When indexing text fields with multiple values, a “fake” gap is added between the values to prevent most phrase queries from matching across them. The size of this gap is configured using position_increment_gap and defaults to 100.
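The way the gap shifts term positions can be sketched with a small model. This is a simplified illustration, not Elasticsearch's actual indexing code: the function name assign_positions is hypothetical, whitespace splitting plus lowercasing stands in for the real analyzer, and every token is assumed to advance the position by one.

```python
def assign_positions(values, gap=100):
    """Return (term, position) pairs for a multi-valued text field.

    Assumption: whitespace splitting and lowercasing stand in for the
    real analyzer, and each token advances the position by exactly 1.
    """
    positions = []
    position = -1  # the first token then lands on position 0
    for value in values:
        tokens = value.lower().split()
        for token in tokens:
            position += 1
            positions.append((token, position))
        if tokens:
            # Insert the "fake" gap before the next value's first token.
            position += gap
    return positions


print(assign_positions(["John Abraham", "Lincoln Smith"]))
# [('john', 0), ('abraham', 1), ('lincoln', 102), ('smith', 103)]
```

Under these assumptions, abraham and lincoln end up 101 positions apart instead of being adjacent.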

For example:

    PUT my-index-000001/_doc/1
    {
      "names": [ "John Abraham", "Lincoln Smith"]
    }

    GET my-index-000001/_search
    {
      "query": {
        "match_phrase": {
          "names": {
            "query": "Abraham Lincoln"
          }
        }
      }
    }

    GET my-index-000001/_search
    {
      "query": {
        "match_phrase": {
          "names": {
            "query": "Abraham Lincoln",
            "slop": 101
          }
        }
      }
    }

The first phrase query does not match the document, which is expected.

The second phrase query matches the document, even though Abraham and Lincoln are in separate strings, because slop > position_increment_gap.
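The effect of slop on these two queries can be checked with a rough model. This is a sketch under stated assumptions rather than the real scoring logic: phrase_matches is a hypothetical helper that handles only a two-term phrase whose terms appear in query order, and the positions assume the default gap of 100 with each token advancing the position by one.

```python
def phrase_matches(first_pos, second_pos, slop=0):
    """Simplified in-order check for a two-term phrase.

    An exact phrase expects the second term at first_pos + 1; slop is
    the number of extra positions the second term may be displaced.
    Assumption: out-of-order matches are not modeled here.
    """
    displacement = second_pos - (first_pos + 1)
    return 0 <= displacement <= slop


GAP = 100                    # default position_increment_gap
abraham = 1                  # second token of "John Abraham"
lincoln = abraham + GAP + 1  # first token of the next value

print(phrase_matches(abraham, lincoln))            # False
print(phrase_matches(abraham, lincoln, slop=101))  # True
```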

The position_increment_gap can be specified in the mapping. For instance:

    PUT my-index-000001
    {
      "mappings": {
        "properties": {
          "names": {
            "type": "text",
            "position_increment_gap": 0
          }
        }
      }
    }

    PUT my-index-000001/_doc/1
    {
      "names": [ "John Abraham", "Lincoln Smith"]
    }

    GET my-index-000001/_search
    {
      "query": {
        "match_phrase": {
          "names": "Abraham Lincoln"
        }
      }
    }

With position_increment_gap set to 0, the first term in the next array element will be 0 terms apart from the last term in the previous array element.

The phrase query matches the document, which may look odd, but it is what the mapping asked for.