IP2Geo

Introduced 2.10

The ip2geo processor adds information about the geographical location of an IPv4 or IPv6 address. The ip2geo processor uses IP geolocation (GeoIP) data from an external endpoint and therefore requires an additional component, datasource, that defines from where to download GeoIP data and how frequently to update the data.

NOTE
The ip2geo processor maintains the GeoIP data mapping in system indexes. The GeoIP mapping is retrieved from these indexes during data ingestion to perform the IP-to-geolocation conversion on the incoming data. For optimal performance, use nodes that have both the ingest and data roles, because this configuration avoids internode calls and reduces latency. Also, because the ip2geo processor searches the GeoIP mapping data in these indexes, search performance is affected.

Getting started

To get started with the ip2geo processor, install the opensearch-geospatial plugin. See Installing plugins to learn more.

Cluster settings

The IP2Geo data source and ip2geo processor node settings are listed in the following table.

| Key | Description | Default |
| :--- | :--- | :--- |
| plugins.geospatial.ip2geo.datasource.endpoint | Default endpoint used when creating a data source through the data source API. | https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json |
| plugins.geospatial.ip2geo.datasource.update_interval_in_days | Default update interval, in days, used when creating a data source through the data source API. | 3 |
| plugins.geospatial.ip2geo.datasource.batch_size | Maximum number of documents to ingest in a bulk request during the IP2Geo data source creation process. | 10,000 |
| plugins.geospatial.ip2geo.processor.cache_size | Maximum number of results that can be cached. Each node uses a single cache for all of its IP2Geo processors. | 1,000 |

Creating the IP2Geo data source

Before creating the pipeline that uses the ip2geo processor, create the IP2Geo data source. The data source defines the endpoint from which GeoIP data is downloaded and specifies the update interval.

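OpenSearch provides endpoints for the GeoLite2 City, GeoLite2 Country, and GeoLite2 ASN databases from MaxMind, which are shared under the CC BY-SA 4.0 license. The default endpoint, https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json, serves the GeoLite2 City database.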

If an OpenSearch cluster cannot update a data source from the endpoints within 30 days, the cluster does not add GeoIP data to the documents and instead adds "error":"ip2geo_data_expired".

Data source options

The following table lists the data source options for the ip2geo processor.

| Name | Required | Default | Description |
| :--- | :--- | :--- | :--- |
| endpoint | Optional | https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json | The endpoint from which the GeoIP data is downloaded. |
| update_interval_in_days | Optional | 3 | How frequently, in days, the GeoIP data is updated. The minimum value is 1. |

To create an IP2Geo data source, run the following query:

    PUT /_plugins/geospatial/ip2geo/datasource/my-datasource
    {
      "endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json",
      "update_interval_in_days": 3
    }

A true response means that the request was successful and that the server was able to process the request. A false response indicates that you should check the request to make sure it is valid, check the URL to make sure it is correct, or try again.
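
The request above can also be prepared from a script. The following is a minimal Python sketch, not an official client. It assumes a cluster reachable at http://localhost:9200, which is an assumption and not something this page specifies, and the helper name build_datasource_request is hypothetical. It only builds the request and enforces the minimum update interval of 1 day from the options table; nothing is sent.

```python
import json
import urllib.request

# Assumed local cluster address; replace with your own.
BASE_URL = "http://localhost:9200"
DEFAULT_ENDPOINT = "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json"


def build_datasource_request(name, endpoint=DEFAULT_ENDPOINT, update_interval_in_days=3):
    """Build (but do not send) the PUT request that creates an IP2Geo data source."""
    if update_interval_in_days < 1:
        raise ValueError("update_interval_in_days must be at least 1")
    body = {"endpoint": endpoint, "update_interval_in_days": update_interval_in_days}
    return urllib.request.Request(
        url=f"{BASE_URL}/_plugins/geospatial/ip2geo/datasource/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_datasource_request("my-datasource")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running cluster: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it possible to inspect or log the payload before it reaches the cluster.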

Sending a GET request

To get information about one or more IP2Geo data sources, send a GET request:

    GET /_plugins/geospatial/ip2geo/datasource/my-datasource

You’ll receive the following response:

    {
      "datasources": [
        {
          "name": "my-datasource",
          "state": "AVAILABLE",
          "endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json",
          "update_interval_in_days": 3,
          "next_update_at_in_epoch_millis": 1685125612373,
          "database": {
            "provider": "maxmind",
            "sha256_hash": "0SmTZgtTRjWa5lXR+XFCqrZcT495jL5XUcJlpMj0uEA=",
            "updated_at_in_epoch_millis": 1684429230000,
            "valid_for_in_days": 30,
            "fields": [
              "country_iso_code",
              "country_name",
              "continent_name",
              "region_iso_code",
              "region_name",
              "city_name",
              "time_zone",
              "location"
            ]
          },
          "update_stats": {
            "last_succeeded_at_in_epoch_millis": 1684866730192,
            "last_processing_time_in_millis": 317640,
            "last_failed_at_in_epoch_millis": 1684866730492,
            "last_skipped_at_in_epoch_millis": 1684866730292
          }
        }
      ]
    }
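
The timestamps in this response are epoch milliseconds, which are hard to read at a glance. The following Python sketch converts them to UTC datetimes and estimates when the GeoIP data would expire. The expiry rule used here (last database update plus valid_for_in_days) is an assumption made for illustration, not a statement of how the plugin computes it, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the sample response; real responses will differ.
datasource = {
    "next_update_at_in_epoch_millis": 1685125612373,
    "database": {
        "updated_at_in_epoch_millis": 1684429230000,
        "valid_for_in_days": 30,
    },
}


def from_epoch_millis(millis):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def expires_at(database):
    """Assumed expiry: last database update plus valid_for_in_days."""
    updated = from_epoch_millis(database["updated_at_in_epoch_millis"])
    return updated + timedelta(days=database["valid_for_in_days"])


next_update = from_epoch_millis(datasource["next_update_at_in_epoch_millis"])
print("next update:", next_update.isoformat())
print("expires at: ", expires_at(datasource["database"]).isoformat())
```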

Updating an IP2Geo data source

See the Creating the IP2Geo data source section for a list of endpoints and request field descriptions.

To update the data source, run the following query:

    PUT /_plugins/geospatial/ip2geo/datasource/my-datasource/_settings
    {
      "endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json",
      "update_interval_in_days": 10
    }

Deleting the IP2Geo data source

To delete the IP2Geo data source, you must first delete all processors associated with the data source. Otherwise, the request fails.
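
Before deleting, it can help to find which pipelines still reference the data source. The following Python sketch scans pipeline definitions for ip2geo processors that use a given data source. It assumes the input has the shape returned by GET /_ingest/pipeline (a mapping of pipeline name to definition); the function name and the second sample pipeline are hypothetical.

```python
def pipelines_using_datasource(pipelines, datasource):
    """Return names of pipelines with an ip2geo processor that uses `datasource`.

    `pipelines` is assumed to map each pipeline name to its definition,
    as in a GET /_ingest/pipeline response.
    """
    matches = []
    for name, definition in pipelines.items():
        for processor in definition.get("processors", []):
            config = processor.get("ip2geo")
            if config and config.get("datasource") == datasource:
                matches.append(name)
                break  # one match is enough for this pipeline
    return matches


# Hypothetical pipeline definitions for illustration.
pipelines = {
    "my-pipeline": {
        "description": "convert ip to geo",
        "processors": [{"ip2geo": {"field": "ip", "datasource": "my-datasource"}}],
    },
    "other-pipeline": {
        "processors": [{"lowercase": {"field": "name"}}],
    },
}

print(pipelines_using_datasource(pipelines, "my-datasource"))  # ['my-pipeline']
```

This sketch checks only top-level processors. Processors nested inside other processors or failure handlers would need a recursive walk.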

To delete the data source, run the following query:

    DELETE /_plugins/geospatial/ip2geo/datasource/my-datasource

Creating the pipeline

Once the data source is created, you can create the pipeline. The following is the syntax for the ip2geo processor:

    {
      "ip2geo": {
        "field": "ip",
        "datasource": "my-datasource"
      }
    }

Configuration parameters

The following table lists the required and optional parameters for the ip2geo processor.

| Name | Required | Default | Description |
| :--- | :--- | :--- | :--- |
| datasource | Required | - | The name of the data source from which to retrieve geographical information. |
| field | Required | - | The field that contains the IP address for geographical lookup. |
| ignore_missing | Optional | false | If set to true, the processor does not modify the document if the field does not exist or is null. |
| properties | Optional | All fields in datasource | The field that controls which properties are added to target_field from datasource. |
| target_field | Optional | ip2geo | The field that contains the geographical information retrieved from the data source. |
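
To show how the optional parameters combine with the required ones, the following Python sketch builds a pipeline body that sets all of them. The helper ip2geo_processor and the target field name geo are hypothetical. Passing properties as a list of field names is an assumption; the names used are taken from the fields reported in the sample data source response.

```python
import json


def ip2geo_processor(field, datasource, target_field=None, properties=None, ignore_missing=None):
    """Return an ip2geo processor definition, including only the options that were set."""
    config = {"field": field, "datasource": datasource}
    if target_field is not None:
        config["target_field"] = target_field
    if properties is not None:
        config["properties"] = list(properties)
    if ignore_missing is not None:
        config["ignore_missing"] = bool(ignore_missing)
    return {"ip2geo": config}


pipeline = {
    "description": "convert ip to geo",
    "processors": [
        ip2geo_processor(
            "ip",
            "my-datasource",
            target_field="geo",  # hypothetical field name
            properties=["country_name", "city_name", "location"],  # assumed list form
            ignore_missing=True,
        )
    ],
}
print(json.dumps(pipeline, indent=2))
```

Omitting unset options keeps the generated body minimal, so the defaults listed in the table continue to apply.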

Using the processor

Follow these steps to use the processor in a pipeline.

Step 1: Create a pipeline.

The following query creates a pipeline, named my-pipeline, that converts the IP address to geographical information:

    PUT /_ingest/pipeline/my-pipeline
    {
      "description": "convert ip to geo",
      "processors": [
        {
          "ip2geo": {
            "field": "ip",
            "datasource": "my-datasource"
          }
        }
      ]
    }

Step 2 (Optional): Test the pipeline.

NOTE
It is recommended that you test your pipeline before you ingest documents.

To test the pipeline, run the following query:

    POST /_ingest/pipeline/my-pipeline/_simulate
    {
      "docs": [
        {
          "_index": "my-index",
          "_id": "my-id",
          "_source": {
            "ip": "172.0.0.1"
          }
        }
      ]
    }

In the response, the processor adds the geographical information to the ip2geo field of the document, for example:

    "ip2geo": {
      "continent_name": "North America",
      "region_iso_code": "AL",
      "city_name": "Calera",
      "country_iso_code": "US",
      "country_name": "United States",
      "region_name": "Alabama",
      "location": "33.1063,-86.7583",
      "time_zone": "America/Chicago"
    }
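
The location value in the example above is a single string. The following Python sketch splits it into numeric coordinates for downstream use. It assumes the order is latitude first and longitude second, which matches the Alabama coordinates shown but is an assumption here; the helper name parse_location is hypothetical.

```python
def parse_location(location):
    """Split a "lat,lon" string into a (latitude, longitude) tuple of floats."""
    lat_text, lon_text = location.split(",")
    lat, lon = float(lat_text), float(lon_text)
    # Reject values outside the valid coordinate ranges.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {location!r}")
    return lat, lon


print(parse_location("33.1063,-86.7583"))  # (33.1063, -86.7583)
```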

Step 3: Ingest a document.

The following query ingests a document into an index named my-index:

    PUT /my-index/_doc/my-id?pipeline=my-pipeline
    {
      "ip": "172.0.0.1"
    }

Step 4 (Optional): Retrieve the document.

To retrieve the document, run the following query:

    GET /my-index/_doc/my-id
