Home

Build AI-powered semantic search applications


txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.

Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords.

Backed by state-of-the-art machine learning models, data is transformed into vector representations (also known as embeddings) for search. Innovation is happening at a rapid pace, and models can understand concepts in documents, audio, images and video.
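
To make the idea of searching over embeddings concrete, here is a minimal sketch in plain Python. It does not use txtai or a real model: the three-dimensional vectors, the document texts and the `search` helper are hypothetical values written by hand for illustration. A real system would obtain the vectors from a machine learning model and use an index backend rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-written embeddings (assumption: a real model would
# produce vectors with hundreds of dimensions).
documents = {
    "Local team wins the championship": [0.9, 0.1, 0.0],
    "Arctic ice shelf collapses after record heat": [0.0, 0.2, 0.9],
    "City prepares for river flooding": [0.1, 0.8, 0.4],
}

def search(query_vector, limit=1):
    """Rank documents by similarity to the query vector (linear scan)."""
    scored = [(text, cosine(query_vector, vector))
              for text, vector in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]

# Hypothetical embedding for a query such as "climate change". The query
# shares no keywords with the best match, only a nearby vector.
query = [0.1, 0.2, 0.9]
results = search(query)
print(results[0][0])
```

The top result is chosen because its vector points in nearly the same direction as the query vector, which is how a match can be found without any shared keywords.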

Summary of txtai features:

  • 🔎 Large-scale similarity search with multiple index backends (Faiss, Annoy, Hnswlib) and support for external vector databases
  • 📄 Create embeddings for text snippets, documents, audio, images and video
  • 💡 Machine-learning pipelines that run question-answering, labeling, transcription, translation, summarization, LLM prompts and more
  • ↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be microservices or full-fledged indexing workflows.
  • ⚙️ Build with Python or YAML. API bindings available for JavaScript, Java, Rust and Go.
  • ☁️ Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes)
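
The workflow idea in the list above, joining pipelines so that the output of one feeds the next, can be sketched in plain Python. This is a conceptual illustration only and not the txtai API: `run_workflow`, `normalize` and `uppercase` are hypothetical names, and the two toy functions stand in for model-backed pipelines such as translation or summarization.

```python
def run_workflow(tasks, elements, batch=2):
    """Pass elements through each task in order, one batch at a time."""
    elements = list(elements)
    results = []
    for start in range(0, len(elements), batch):
        chunk = elements[start:start + batch]
        for task in tasks:
            # Each task takes a list and returns a list of the same length
            chunk = task(chunk)
        results.extend(chunk)
    return results

def normalize(texts):
    """Hypothetical task: collapse runs of whitespace."""
    return [" ".join(text.split()) for text in texts]

def uppercase(texts):
    """Hypothetical task standing in for a model-backed pipeline."""
    return [text.upper() for text in texts]

output = run_workflow(
    [normalize, uppercase],
    ["  semantic   search ", "machine learning  workflows", "txtai"],
)
print(output)
```

Processing in batches keeps memory use bounded and lets each step operate on several elements at once, which matters when the steps are model calls.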

Applications range from similarity search to NLP-driven data extractions that generate structured data. Semantic workflows transform and find data driven by user intent.

The following applications are powered by txtai.

Application    Description
paperai        Semantic search and workflows for medical/scientific papers
codequestion   Semantic search for developers
tldrstory      Semantic search for headlines and story text
neuspo         Fact-driven, real-time sports event and news site

txtai is built with Python 3.7+, Hugging Face Transformers, Sentence Transformers and FastAPI.