The LLM pipeline runs prompts through a large language model (LLM). This pipeline autodetects the LLM framework based on the model path.


The following shows a simple example using this pipeline.

    from txtai.pipeline import LLM

    # Create and run LLM pipeline
    llm = LLM()
    llm(
        """
        Answer the following question using the provided context.
        Question:
        What are the applications of txtai?
        Context:
        txtai is an open-source platform for semantic search and
        workflows powered by language models.
        """
    )
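
When several questions share the question-plus-context layout used above, a small helper can assemble the prompts. The following is a minimal sketch; `build_prompt` and `TEMPLATE` are hypothetical names used for illustration and are not part of txtai.

```python
from textwrap import dedent

# Hypothetical template (not part of txtai) mirroring the layout of the
# example prompt: an instruction, then the question, then the context.
TEMPLATE = dedent(
    """\
    Answer the following question using the provided context.

    Question:
    {question}

    Context:
    {context}
    """
)


def build_prompt(question, context):
    """Returns a prompt string that asks `question` against `context`."""
    return TEMPLATE.format(question=question.strip(), context=context.strip())


prompt = build_prompt(
    "What are the applications of txtai?",
    "txtai is an open-source platform for semantic search and "
    "workflows powered by language models.",
)
print(prompt)
```

The resulting string can be passed to the pipeline in place of the literal prompt, for example `llm(prompt)`.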

The LLM pipeline automatically detects the underlying LLM framework from the model path. The framework can also be set manually with the method parameter.

    from txtai.pipeline import LLM

    # Set method as litellm
    llm = LLM("vllm/Open-Orca/Mistral-7B-OpenOrca", method="litellm")

    # Set method as llama.cpp
    llm = LLM("TheBloke/Mistral-7B-OpenOrca-GGUF/mistral-7b-openorca.Q4_K_M.gguf",
              method="llama.cpp")
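
To make the idea of path-based detection concrete, here is a minimal sketch of how a framework could be inferred from a model path. It is an illustration under stated assumptions, not txtai's actual detection logic: `detect_method` and `API_PREFIXES` are hypothetical names, and the prefix list is invented for the example.

```python
# Assumed provider prefixes, for illustration only. The real set of
# prefixes recognized by any given library may differ.
API_PREFIXES = {"vllm", "openai", "ollama"}


def detect_method(path, method=None):
    """Hypothetical sketch: infer an LLM framework name from a model path."""
    # An explicit method always wins over detection
    if method:
        return method

    if isinstance(path, str):
        # GGUF files are the llama.cpp model format
        if path.lower().endswith(".gguf"):
            return "llama.cpp"

        # Assumption: a known "provider/" prefix routes to LiteLLM
        if path.split("/", 1)[0].lower() in API_PREFIXES:
            return "litellm"

    # Everything else, including preloaded model objects, falls back
    # to Transformers in this sketch
    return "transformers"


print(detect_method("vllm/Open-Orca/Mistral-7B-OpenOrca"))
print(detect_method("TheBloke/Mistral-7B-OpenOrca-GGUF/mistral-7b-openorca.Q4_K_M.gguf"))
print(detect_method("Open-Orca/Mistral-7B-OpenOrca"))
```

Under these assumptions the two paths from the example above resolve to the same frameworks that were set manually, which is why the method argument is usually optional.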

Models can be externally loaded and passed to pipelines. This is useful for models that are not yet supported by Transformers and/or need special initialization.

    import torch

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from txtai.pipeline import LLM

    # Load Mistral-7B-OpenOrca
    path = "Open-Orca/Mistral-7B-OpenOrca"
    model = AutoModelForCausalLM.from_pretrained(
        path,
        torch_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(path)

    # Pass the preloaded model and tokenizer as a tuple
    llm = LLM((model, tokenizer))

See the links below for more detailed examples.

Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.


    # Create pipeline using lower case class name
    # Use `generator` or `sequences` to force model type
    llm:

    # Run pipeline with workflow
    workflow:
      llm:
        tasks:
          - action: llm

Similar to the Python example above, the underlying Hugging Face pipeline parameters and model parameters can be set in pipeline configuration.

    llm:
      path: Open-Orca/Mistral-7B-OpenOrca
      torch_dtype: torch.bfloat16
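
The workflow and API examples in this section load their configuration from a file named config.yml. As a minimal sketch, the pipeline settings and the workflow definition can be combined and written to such a file from Python. The temporary directory is an assumption made so the example does not overwrite anything; in practice the file would live wherever the application is started.

```python
from pathlib import Path
from tempfile import mkdtemp

# Pipeline settings plus a workflow named "llm" that runs the pipeline
CONFIG = """\
llm:
  path: Open-Orca/Mistral-7B-OpenOrca
  torch_dtype: torch.bfloat16

workflow:
  llm:
    tasks:
      - action: llm
"""

# Write to a scratch directory (assumption for this sketch)
config_path = Path(mkdtemp()) / "config.yml"
config_path.write_text(CONFIG, encoding="utf-8")

print(config_path)
```

The workflow key (`llm` here) is the name used later to invoke the workflow.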

Run with Workflows

    from txtai.app import Application

    # Create and run pipeline with workflow
    app = Application("config.yml")
    list(app.workflow("llm", [
        """
        Answer the following question using the provided context.
        Question:
        What are the applications of txtai?
        Context:
        txtai is an open-source platform for semantic search and
        workflows powered by language models.
        """
    ]))

Run with API

    CONFIG=config.yml uvicorn "txtai.api:app" &

    curl \
      -X POST "http://localhost:8000/workflow" \
      -H "Content-Type: application/json" \
      -d '{"name":"llm", "elements": ["Answer the following question..."]}'
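
The same request can be issued from Python. The sketch below uses only the standard library and builds the POST request without sending it; `workflow_request` and `send` are hypothetical helper names, and the workflow name `llm` is assumed to match the workflow defined in the configuration above.

```python
import json
from urllib.request import Request, urlopen


def workflow_request(name, elements, url="http://localhost:8000/workflow"):
    """Builds a POST request that runs workflow `name` over `elements`."""
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Sends the request and decodes the JSON response.

    Requires the API server to be running at the request URL.
    """
    with urlopen(request) as response:
        return json.load(response)


request = workflow_request("llm", ["Answer the following question..."])
print(request.get_method(), request.full_url)
```

Calling `send(request)` performs the actual HTTP call once the server is up.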


Python documentation for the pipeline.

__init__(path=None, method=None, **kwargs)

Creates a new LLM.

Parameters:

  path: model path
  method: llm model framework, infers from path if not provided
  kwargs: model keyword arguments

    def __init__(self, path=None, method=None, **kwargs):
        """
        Creates a new LLM.

        Args:
            path: model path
            method: llm model framework, infers from path if not provided
            kwargs: model keyword arguments
        """

        # Default LLM if not provided
        path = path if path else "google/flan-t5-base"

        # Generation instance
        self.generator = GenerationFactory.create(path, method, **kwargs)

__call__(text, maxlength=512, **kwargs)

Generates text using input text.

Parameters:

  text: text|list
  maxlength: maximum sequence length
  kwargs: additional generation keyword arguments

Returns:

  generated text

    def __call__(self, text, maxlength=512, **kwargs):
        """
        Generates text using input text

        Args:
            text: text|list
            maxlength: maximum sequence length
            kwargs: additional generation keyword arguments

        Returns:
            generated text
        """

        # Run LLM generation
        return self.generator(text, maxlength, **kwargs)
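
The call method only forwards its arguments to the underlying generator. The following self-contained sketch illustrates that delegation pattern with a stand-in backend. `StubGenerator` and `SketchLLM` are hypothetical classes written for this example. The stub's behavior is an assumption for illustration: it upper-cases the input, caps the result at `maxlength` characters, and returns a string for string input and a list for list input. A real backend generates text and measures length in its own units.

```python
class StubGenerator:
    """Stand-in for a real generation backend (hypothetical)."""

    def __call__(self, text, maxlength, **kwargs):
        # Normalize to a list so single and batch inputs share one code path
        texts = [text] if isinstance(text, str) else text

        # Placeholder for generation: upper-case, capped at maxlength characters
        results = [t.upper()[:maxlength] for t in texts]

        # Mirror the input shape: string in, string out; list in, list out
        return results[0] if isinstance(text, str) else results


class SketchLLM:
    """Minimal wrapper that delegates calls to a generator."""

    def __init__(self, generator):
        self.generator = generator

    def __call__(self, text, maxlength=512, **kwargs):
        # Forward all arguments unchanged
        return self.generator(text, maxlength, **kwargs)


llm = SketchLLM(StubGenerator())
single = llm("hello")
batch = llm(["hello", "world"], maxlength=3)

print(single)
print(batch)
```

Keeping the wrapper this thin is what lets one pipeline class front several frameworks: only the generator object changes.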