Text/image embedding processor

The text_image_embedding processor generates combined vector embeddings from text and image fields for multimodal neural search.

PREREQUISITE
Before using the text_image_embedding processor, you must set up a machine learning (ML) model. For more information, see Choosing a model.

The following is the syntax for the text_image_embedding processor:

  {
    "text_image_embedding": {
      "model_id": "<model_id>",
      "embedding": "<vector_field>",
      "field_map": {
        "text": "<input_text_field>",
        "image": "<input_image_field>"
      }
    }
  }


Parameters

The following table lists the required and optional parameters for the text_image_embedding processor.

| Parameter | Data type | Required/Optional | Description |
| --- | --- | --- | --- |
| model_id | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see Using custom models within OpenSearch and Multimodal search. |
| embedding | String | Required | The name of the vector field in which to store the generated embeddings. A single embedding is generated for both text and image fields. |
| field_map | Object | Required | Contains key-value pairs that specify the fields from which to generate embeddings. |
| field_map.text | String | Optional | The name of the field from which to obtain text for generating vector embeddings. You must specify at least one of text or image. |
| field_map.image | String | Optional | The name of the field from which to obtain the image for generating vector embeddings. You must specify at least one of text or image. |
| description | String | Optional | A brief description of the processor. |
| tag | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
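
The constraints in the table can be checked before a processor definition is sent to the cluster. The following is a minimal sketch, not part of OpenSearch: the helper name build_text_image_embedding and its keyword arguments are hypothetical, and only the resulting JSON keys come from the parameter table.

```python
import json


def build_text_image_embedding(model_id, embedding, text_field=None,
                               image_field=None, description=None, tag=None):
    """Return a text_image_embedding processor definition as a dict.

    Hypothetical helper: only the emitted keys come from the parameter table.
    """
    # The table requires at least one of field_map.text or field_map.image.
    if text_field is None and image_field is None:
        raise ValueError("field_map needs at least one of 'text' or 'image'")

    field_map = {}
    if text_field is not None:
        field_map["text"] = text_field
    if image_field is not None:
        field_map["image"] = image_field

    processor = {
        "model_id": model_id,
        "embedding": embedding,
        "field_map": field_map,
    }
    # Optional parameters are emitted only when provided.
    if description is not None:
        processor["description"] = description
    if tag is not None:
        processor["tag"] = tag
    return {"text_image_embedding": processor}


# Placeholder values mirror the syntax block; replace them with real names.
example = build_text_image_embedding(
    "<model_id>", "<vector_field>",
    text_field="<input_text_field>", image_field="<input_image_field>",
)
print(json.dumps(example, indent=2))
```

Calling the helper with neither text_field nor image_field raises a ValueError, which mirrors the rule that at least one of the two must be specified.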

Using the processor

Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see Using custom models within OpenSearch.

Step 1: Create a pipeline.

The following example request creates an ingest pipeline that converts the text from image_description and the image from image_binary into a vector embedding and stores it in vector_embedding:

  PUT /_ingest/pipeline/nlp-ingest-pipeline
  {
    "description": "A text/image embedding pipeline",
    "processors": [
      {
        "text_image_embedding": {
          "model_id": "bQ1J8ooBpBj3wT4HVUsb",
          "embedding": "vector_embedding",
          "field_map": {
            "text": "image_description",
            "image": "image_binary"
          }
        }
      }
    ]
  }


You can set up multiple processors in one pipeline to generate embeddings for multiple fields.
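
As an illustration of a pipeline with more than one processor, the following sketch builds a request body containing two text_image_embedding processors and prepares, but does not send, the PUT request. The second field set (thumbnail_caption, thumbnail_binary, thumbnail_embedding), the helper names, and the http://localhost:9200 address are assumptions made for the example.

```python
import json
import urllib.request

MODEL_ID = "bQ1J8ooBpBj3wT4HVUsb"  # model ID used in the example request


def embedding_processor(text_field, image_field, target_field):
    """Hypothetical helper: one text_image_embedding processor entry."""
    return {
        "text_image_embedding": {
            "model_id": MODEL_ID,
            "embedding": target_field,
            "field_map": {"text": text_field, "image": image_field},
        }
    }


pipeline_body = {
    "description": "A text/image embedding pipeline",
    "processors": [
        embedding_processor("image_description", "image_binary",
                            "vector_embedding"),
        # Hypothetical second field set, shown only to illustrate the idea.
        embedding_processor("thumbnail_caption", "thumbnail_binary",
                            "thumbnail_embedding"),
    ],
}


def build_put_request(base_url, pipeline_name, body):
    """Prepare the PUT /_ingest/pipeline/<name> request without sending it."""
    url = f"{base_url}/_ingest/pipeline/{pipeline_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# The cluster address is an assumption; adjust it for your environment.
request = build_put_request("http://localhost:9200", "nlp-ingest-pipeline",
                            pipeline_body)
print(request.get_method(), request.full_url)
# To send it to a running cluster: urllib.request.urlopen(request)
```

Each processor writes to its own vector field, so the two embeddings do not overwrite each other.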

Step 2 (Optional): Test the pipeline.

It is recommended that you test your pipeline before you ingest documents.

To test the pipeline, run the following request:

  POST /_ingest/pipeline/nlp-ingest-pipeline/_simulate
  {
    "docs": [
      {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "image_description": "Orange table",
          "image_binary": "bGlkaHQtd29rfx43..."
        }
      }
    ]
  }
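
The image_binary value in the request appears to be a Base64 string. Assuming the field is expected to hold Base64-encoded image bytes, a simulate request body could be assembled as in the following sketch. The helper name build_simulate_body is hypothetical, and the placeholder bytes stand in for the contents of a real image file.

```python
import base64
import json


def build_simulate_body(description, image_bytes,
                        index="testindex1", doc_id="1"):
    """Hypothetical helper: build the body of a _simulate request.

    Assumes image_binary should contain Base64-encoded image bytes.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "docs": [
            {
                "_index": index,
                "_id": doc_id,
                "_source": {
                    "image_description": description,
                    "image_binary": encoded,
                },
            }
        ]
    }


# Placeholder bytes; in practice, read them with open(path, "rb").read().
body = build_simulate_body("Orange table", b"\x89PNG placeholder bytes")
print(json.dumps(body, indent=2))
```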


Response

The response confirms that in addition to the image_description and image_binary fields, the processor has generated vector embeddings in the vector_embedding field:

  {
    "docs": [
      {
        "doc": {
          "_index": "testindex1",
          "_id": "1",
          "_source": {
            "vector_embedding": [
              -0.048237972,
              -0.07612712,
              0.3262124,
              ...
              -0.16352308
            ],
            "image_description": "Orange table",
            "image_binary": "bGlkaHQtd29rfx43..."
          },
          "_ingest": {
            "timestamp": "2023-10-05T15:15:19.691345393Z"
          }
        }
      }
    ]
  }
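
A simulate response can also be checked programmatically. The following sketch, with a hypothetical helper named extract_embeddings, confirms that every returned document contains a non-empty numeric vector in the target field. The sample uses only the values visible in the response above, so the vector is abbreviated and its length is not meaningful.

```python
def extract_embeddings(response, field="vector_embedding"):
    """Hypothetical helper: return the embedding of each simulated doc.

    Raises ValueError if a document lacks a non-empty numeric vector.
    """
    vectors = []
    for entry in response.get("docs", []):
        source = entry.get("doc", {}).get("_source", {})
        vector = source.get(field)
        if not isinstance(vector, list) or not vector:
            raise ValueError(f"document has no embedding in '{field}'")
        if not all(isinstance(v, (int, float)) for v in vector):
            raise ValueError(f"'{field}' contains non-numeric values")
        vectors.append(vector)
    return vectors


# Abbreviated sample: elided entries of the real vector are omitted.
sample_response = {
    "docs": [
        {
            "doc": {
                "_index": "testindex1",
                "_id": "1",
                "_source": {
                    "vector_embedding": [-0.048237972, -0.07612712,
                                         0.3262124, -0.16352308],
                    "image_description": "Orange table",
                },
            }
        }
    ]
}

embeddings = extract_embeddings(sample_response)
print(len(embeddings), "document(s) with embeddings")
```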

Once you have created an ingest pipeline, you need to create an index for ingestion and ingest documents into the index. To learn more, see Step 2: Create an index for ingestion and Step 3: Ingest documents into the index of Multimodal search.

Next steps