LLM


The LLM pipeline runs prompts through a large language model (LLM). This pipeline automatically detects whether the model path is a text generation or sequence to sequence model.

Example

The following shows a simple example using this pipeline.

```python
from txtai.pipeline import LLM

# Create and run LLM pipeline
llm = LLM()
llm(
    """
    Answer the following question using the provided context.

    Question:
    What are the applications of txtai?

    Context:
    txtai is an open-source platform for semantic search and
    workflows powered by language models.
    """
)
```

The LLM pipeline automatically detects the underlying model type (text-generation or sequence-sequence). This can also be manually set.

```python
from txtai.pipeline import LLM, Generator, Sequences

# Set model type via task parameter
llm = LLM("google/flan-t5-xl", task="sequence-sequence")

# Create sequences pipeline (same as previous statement)
sequences = Sequences("google/flan-t5-xl")

# Set model type via task parameter
llm = LLM("openlm-research/open_llama_3b", task="language-generation")

# Create generator pipeline (same as previous statement)
generator = Generator("openlm-research/open_llama_3b")
```

Models can be externally loaded and passed to pipelines. This is useful for models that are not yet supported by Transformers and/or need special initialization.

```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

from txtai.pipeline import LLM

# Load Falcon-7B-Instruct
path = "tiiuae/falcon-7b-instruct"
model = AutoModelForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(path)

llm = LLM((model, tokenizer))
```

See the links below for more detailed examples.

| Notebook | Description |  |
|:---------|:------------|:-|
| Prompt-driven search with LLMs | Embeddings-guided and Prompt-driven search with Large Language Models (LLMs) | Open In Colab |
| Prompt templates and task chains | Build model prompts and connect tasks together with workflows | Open In Colab |

Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.

config.yml

```yaml
# Create pipeline using lower case class name
# Use `generator` or `sequences` to force model type
llm:

# Run pipeline with workflow
workflow:
  llm:
    tasks:
      - action: llm
```
Similar to the Python example above, the underlying Hugging Face pipeline parameters and model parameters can be set in pipeline configuration.

```yaml
llm:
  path: tiiuae/falcon-7b-instruct
  torch_dtype: torch.bfloat16
  trust_remote_code: True
```
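
The same settings can also be passed directly as keyword arguments in Python. The following is a minimal sketch mirroring the configuration above; it assumes the keyword arguments are forwarded to the underlying Hugging Face model loading, as in the configuration case.

```python
import torch

from txtai.pipeline import LLM

# Sketch: same settings as the configuration above, passed as keyword
# arguments that are forwarded to the underlying Hugging Face pipeline
llm = LLM("tiiuae/falcon-7b-instruct", torch_dtype=torch.bfloat16, trust_remote_code=True)
```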

Run with Workflows

```python
from txtai.app import Application

# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("llm", [
    """
    Answer the following question using the provided context.

    Question:
    What are the applications of txtai?

    Context:
    txtai is an open-source platform for semantic search and
    workflows powered by language models.
    """
]))
```

Run with API

```bash
CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name":"llm", "elements": ["Answer the following question..."]}'
```

Methods

Python documentation for the pipeline.

__init__(self, path=None, quantize=False, gpu=True, model=None, task=None, **kwargs) special

Source code in txtai/pipeline/text/llm.py

```python
def __init__(self, path=None, quantize=False, gpu=True, model=None, task=None, **kwargs):
    super().__init__(self.task(path, task, **kwargs), path if path else "google/flan-t5-base", quantize, gpu, model, **kwargs)

    # Load tokenizer, if necessary
    self.pipeline.tokenizer = self.pipeline.tokenizer if self.pipeline.tokenizer else Models.tokenizer(path, **kwargs)
```
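
As a quick illustration of these constructor parameters, the sketch below sets an explicit task type and disables GPU inference; the model path is only an example and any supported model works.

```python
from txtai.pipeline import LLM

# Sketch: force the task type and run on CPU
# google/flan-t5-small is an example path
llm = LLM("google/flan-t5-small", task="sequence-sequence", gpu=False)
```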

__call__(self, text, prefix=None, maxlength=512, workers=0, **kwargs) special

Generates text using input text

Parameters:

| Name | Type | Description | Default |
|:-----|:-----|:------------|:--------|
| text | | text\|list | required |
| prefix | | optional prefix to prepend to text elements | None |
| maxlength | | maximum sequence length | 512 |
| workers | | number of concurrent workers to use for processing data, defaults to None | 0 |
| kwargs | | additional generation keyword arguments | {} |

Returns:

| Type | Description |
|:-----|:------------|
| | generated text |

Source code in txtai/pipeline/text/llm.py

```python
def __call__(self, text, prefix=None, maxlength=512, workers=0, **kwargs):
    """
    Generates text using input text

    Args:
        text: text|list
        prefix: optional prefix to prepend to text elements
        maxlength: maximum sequence length
        workers: number of concurrent workers to use for processing data, defaults to None
        kwargs: additional generation keyword arguments

    Returns:
        generated text
    """

    # List of texts
    texts = text if isinstance(text, list) else [text]

    # Add prefix, if necessary
    if prefix:
        texts = [f"{prefix}{x}" for x in texts]

    # Run pipeline
    results = self.pipeline(texts, max_length=maxlength, num_workers=workers, **kwargs)

    # Get generated text
    results = [self.clean(texts[x], result) for x, result in enumerate(results)]

    return results[0] if isinstance(text, str) else results
```
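
A minimal sketch of calling the pipeline with these parameters follows; the prompts and prefix below are illustrative only. Passing a list returns a list of results, while passing a single string returns a single result.

```python
from txtai.pipeline import LLM

# Sketch: run a batch of prompts with a shared prefix and a lower length cap
# Prompts and prefix are illustrative only
llm = LLM()

prompts = [
    "Summarize: txtai executes machine-learning workflows.",
    "Summarize: pipelines wrap machine-learning models."
]

results = llm(prompts, prefix="Instruction: ", maxlength=256)
```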