Caption

The caption pipeline reads a list of images and returns a list of captions for those images.

Example

The following shows a simple example using this pipeline.

    from txtai.pipeline import Caption

    # Create and run pipeline
    caption = Caption()
    caption("path to image file")

See the link below for a more detailed example.

Notebook                                    Description
Generate image captions and detect objects  Captions and object detection for images

Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.

config.yml

    # Create pipeline using lower case class name
    caption:

    # Run pipeline with workflow
    workflow:
      caption:
        tasks:
          - action: caption

Run with Workflows

    from txtai.app import Application

    # Create and run pipeline with workflow
    app = Application("config.yml")
    list(app.workflow("caption", ["path to image file"]))
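The same configuration can also be sketched as an in-memory dictionary instead of a config.yml file. This is a minimal sketch, assuming `Application` accepts a dict as well as a file path and that an empty dict is equivalent to the empty `caption:` entry. The function name `caption_from_dict_config` is hypothetical and only used for illustration.

```python
# Dict form of config.yml; an empty dict stands in for the empty "caption:" entry
config = {
    "caption": {},
    "workflow": {"caption": {"tasks": [{"action": "caption"}]}},
}


def caption_from_dict_config(paths):
    """Run the caption workflow from the dict above (requires txtai to be installed)."""
    # Imported lazily so the config can be inspected without txtai installed
    from txtai.app import Application

    # Assumption: Application accepts a dict in place of a path to a YAML file
    app = Application(config)
    return list(app.workflow("caption", paths))
```

Calling `caption_from_dict_config(["path to image file"])` would then mirror the file-based example above.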

Run with API

    CONFIG=config.yml uvicorn "txtai.api:app" &

    curl \
      -X POST "http://localhost:8000/workflow" \
      -H "Content-Type: application/json" \
      -d '{"name":"caption", "elements":["path to image file"]}'
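The curl call above can be reproduced from Python with only the standard library. This is a minimal sketch: the helper names `build_caption_request` and `send_caption_request` are hypothetical, and it assumes the server started above is listening on localhost:8000 and answers with a JSON body.

```python
import json
import urllib.request


def build_caption_request(paths, base_url="http://localhost:8000"):
    """Build the same POST /workflow request as the curl example."""
    payload = {"name": "caption", "elements": list(paths)}
    return urllib.request.Request(
        f"{base_url}/workflow",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_caption_request(paths, base_url="http://localhost:8000"):
    """Send the request; requires the API server to be running."""
    with urllib.request.urlopen(build_caption_request(paths, base_url)) as response:
        # Assumption: the response body is JSON
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server.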

Methods

Python documentation for the pipeline.

Source code in txtai/pipeline/image/caption.py

    def __init__(self, path=None, quantize=False, gpu=True, model=None, **kwargs):
        if not PIL:
            raise ImportError("Captions pipeline is not available - install pipeline extra to enable")

        # Call parent constructor
        super().__init__("image-to-text", path, quantize, gpu, model, **kwargs)

Builds captions for images.

This method supports a single image or a list of images. If the input is a single image, a string is returned. If the input is a list, a list of strings is returned.

Parameters:

Name    Type        Description                                Default
images  image|list  single image or list of images to caption  required

Returns:

Type      Description
str|list  caption string for a single image, list of captions for a list of images
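The single-versus-list behavior described above can be illustrated without loading a model. This is a minimal sketch, not the library's implementation: `fake_pipeline` is a hypothetical stand-in for the image-to-text model and simply echoes its input, and `caption` is a standalone function rather than the pipeline class.

```python
def fake_pipeline(values):
    """Hypothetical stand-in for the model: returns one result list per input."""
    return [[{"generated_text": f" caption for {value} "}] for value in values]


def caption(images):
    # Wrap a single input in a list so both cases share one code path
    values = [images] if not isinstance(images, list) else images

    captions = []
    for result in fake_pipeline(values):
        # Join the generated fragments and trim surrounding whitespace
        captions.append(" ".join(r["generated_text"] for r in result).strip())

    # Unwrap again if the caller passed a single input
    return captions[0] if not isinstance(images, list) else captions


print(caption("a.jpg"))             # caption for a.jpg
print(caption(["a.jpg", "b.jpg"]))  # ['caption for a.jpg', 'caption for b.jpg']
```

Because the check is `isinstance(images, list)`, any non-list input such as a tuple is treated as a single element in this sketch.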

Source code in txtai/pipeline/image/caption.py

    def __call__(self, images):
        """
        Builds captions for images.

        This method supports a single image or a list of images. If the input is an image, the return
        type is a string. If text is a list, a list of strings is returned

        Args:
            images: image|list

        Returns:
            list of captions
        """

        # Convert single element to list
        values = [images] if not isinstance(images, list) else images

        # Open images if file strings
        values = [Image.open(image) if isinstance(image, str) else image for image in values]

        # Get and clean captions
        captions = []
        for result in self.pipeline(values):
            text = " ".join([r["generated_text"] for r in result]).strip()
            captions.append(text)

        # Return single element if single element passed in
        return captions[0] if not isinstance(images, list) else captions