Image Embedding
As of version 0.3.0, FastEmbed supports the computation of image embeddings.
The process is as straightforward as with text embeddings. Let's see how it works.
```python
from fastembed import ImageEmbedding

model = ImageEmbedding("Qdrant/resnet50-onnx")
embeddings_generator = model.embed(
    ["../../tests/misc/image.jpeg", "../../tests/misc/small_image.jpeg"]
)
embeddings_list = list(embeddings_generator)
embeddings_list
```
```
Fetching 3 files: 100%|██████████| 3/3 [00:00<00:00, 47482.69it/s]

[array([0.        , 0.        , 0.        , ..., 0.        , 0.01139933,
        0.        ], dtype=float32),
 array([0.02169187, 0.        , 0.        , ..., 0.        , 0.00848291,
        0.        ], dtype=float32)]
```
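Once the embeddings are computed, a typical next step is comparing them, for example with cosine similarity. Here is a minimal sketch using NumPy; the random vectors are stand-ins for the two 2048-dimensional ResNet-50 embeddings above:

```python
import numpy as np

# Stand-ins for the two 2048-dim ResNet-50 embeddings computed above.
rng = np.random.default_rng(0)
emb_a = rng.random(2048, dtype=np.float32)
emb_b = rng.random(2048, dtype=np.float32)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product divided by the product of the L2 norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(round(cosine_similarity(emb_a, emb_a), 3))  # a vector compared to itself -> 1.0
print(-1.0 <= cosine_similarity(emb_a, emb_b) <= 1.0)
```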
Preprocessing
Preprocessing is encapsulated in the ImageEmbedding class, and the operations applied are identical to the ones provided by Hugging Face Transformers. You don't need to think about batching, opening and closing files, resizing images, etc.; FastEmbed takes care of it.
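For intuition, the classic ResNet-style pipeline roughly consists of resizing, center-cropping, rescaling to [0, 1], and per-channel normalization. The sketch below illustrates these steps with Pillow and NumPy; it is an approximation for illustration only (the exact parameters come from each model's Hugging Face preprocessor config), not FastEmbed's internal code:

```python
import numpy as np
from PIL import Image

# Standard ImageNet channel statistics (an assumption for this sketch).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: Image.Image, size: int = 224) -> np.ndarray:
    """Approximate ResNet-style preprocessing: resize, center-crop, normalize."""
    image = image.convert("RGB").resize((256, 256), Image.BILINEAR)
    left = (256 - size) // 2
    image = image.crop((left, left, left + size, left + size))  # center crop
    array = np.asarray(image, dtype=np.float32) / 255.0         # rescale to [0, 1]
    array = (array - MEAN) / STD                                # per-channel normalize
    return array.transpose(2, 0, 1)                            # HWC -> CHW

# Demo on a synthetic image, so no file on disk is needed.
tensor = preprocess(Image.new("RGB", (640, 480), "gray"))
print(tensor.shape)  # (3, 224, 224)
```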
Supported models
The list of supported image embedding models can be found here or retrieved by calling the ImageEmbedding.list_supported_models() method.
```python
ImageEmbedding.list_supported_models()
```
```
[{'model': 'Qdrant/clip-ViT-B-32-vision',
  'dim': 512,
  'description': 'CLIP vision encoder based on ViT-B/32',
  'size_in_GB': 0.34,
  'sources': {'hf': 'Qdrant/clip-ViT-B-32-vision'},
  'model_file': 'model.onnx'},
 {'model': 'Qdrant/resnet50-onnx',
  'dim': 2048,
  'description': 'ResNet-50 from `Deep Residual Learning for Image Recognition <https://arxiv.org/abs/1512.03385>`__.',
  'size_in_GB': 0.1,
  'sources': {'hf': 'Qdrant/resnet50-onnx'},
  'model_file': 'model.onnx'}]
```
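The returned metadata is plain Python, so you can filter it programmatically when picking a model. For example, using the two entries shown above (copied as literals so the snippet runs standalone), you could choose the smallest model whose embedding dimension meets a requirement:

```python
# Model metadata as returned by ImageEmbedding.list_supported_models()
# (copied from the output above; only the fields used here are kept).
supported = [
    {"model": "Qdrant/clip-ViT-B-32-vision", "dim": 512, "size_in_GB": 0.34},
    {"model": "Qdrant/resnet50-onnx", "dim": 2048, "size_in_GB": 0.1},
]

# Pick the smallest model that still produces at least 1024-dim vectors.
candidates = [m for m in supported if m["dim"] >= 1024]
best = min(candidates, key=lambda m: m["size_in_GB"])
print(best["model"])  # Qdrant/resnet50-onnx
```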