Configuring built-in analyzers

The built-in analyzers can be used directly without any configuration. Some of them, however, support configuration options to alter their behaviour. For instance, the standard analyzer can be configured to support a list of stop words:

  PUT my-index-000001
  {
    "settings": {
      "analysis": {
        "analyzer": {
          "std_english": {
            "type": "standard",
            "stopwords": "_english_"
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "my_text": {
          "type": "text",
          "analyzer": "standard",
          "fields": {
            "english": {
              "type": "text",
              "analyzer": "std_english"
            }
          }
        }
      }
    }
  }

  POST my-index-000001/_analyze
  {
    "field": "my_text",
    "text": "The old brown cow"
  }

  POST my-index-000001/_analyze
  {
    "field": "my_text.english",
    "text": "The old brown cow"
  }

We define the std_english analyzer to be based on the standard analyzer, but configured to remove the predefined list of English stop words.

The my_text field uses the standard analyzer directly, without any configuration. No stop words will be removed from this field. The resulting terms are: [ the, old, brown, cow ]

The my_text.english field uses the std_english analyzer, so English stop words will be removed. The resulting terms are: [ old, brown, cow ]
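
To make the difference between the two fields concrete, here is a rough Python sketch of what each analyzer does to the sample text. It is a simplified model, not the real implementation: the analyze function is hypothetical, tokenization is approximated with a regular expression, and ENGLISH_STOP_WORDS is an assumed stand-in for the list behind _english_ (check the stop token filter documentation for the authoritative list).

```python
import re

# Assumed stand-in for the list that "_english_" refers to; verify against
# the stop token filter documentation before relying on it.
ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})


def analyze(text, stopwords=frozenset()):
    """Hypothetical, simplified model of the standard analyzer.

    Splits on non-word characters, lowercases each token, and drops any
    token found in `stopwords`. The real analyzer uses Unicode text
    segmentation, which this regular expression only approximates.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stopwords]


# Models my_text: standard analyzer with no stop words configured.
standard_terms = analyze("The old brown cow")

# Models my_text.english: std_english with the English stop word list.
std_english_terms = analyze("The old brown cow", ENGLISH_STOP_WORDS)

print(standard_terms)
print(std_english_terms)
```

The only difference between the two calls is the stop word set passed in, which mirrors how std_english differs from standard only by its stopwords setting.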