
Highly scalable and standards-based Model Inference Platform on Kubernetes for Trusted AI


Why KServe

  • KServe is a standard Model Inference Platform on Kubernetes, built for highly scalable use cases.
  • Provides a performant, standardized inference protocol across ML frameworks.
  • Supports modern serverless inference workloads with autoscaling, including scale-to-zero on GPU.
  • Provides high scalability, density packing and intelligent routing using ModelMesh.
  • Provides simple and pluggable production serving, including prediction, pre/post-processing, monitoring and explainability.
  • Supports advanced deployments with canary rollout, experiments, ensembles and transformers.

KServe Components

Model Serving

Single Model Serving

Provides serverless deployment of single-model inference on CPU/GPU for common ML frameworks such as Scikit-Learn, XGBoost, TensorFlow and PyTorch, as well as pluggable custom model runtimes.
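
As a minimal sketch of what this looks like, a single model can be declared with an InferenceService manifest (the service name and storageUri below are placeholders for your own model artifact):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn         # pick the format matching your framework
      # Placeholder: point at wherever the trained model artifact is stored
      storageUri: gs://your-bucket/models/sklearn/iris
```

Applying a manifest like this lets KServe select a matching serving runtime and expose a standardized prediction endpoint for the model.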

ModelMesh Serving


ModelMesh is designed for high-scale, high-density and frequently changing model use cases. It intelligently loads and unloads models to and from memory to strike a trade-off between responsiveness to users and computational footprint.
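
As a sketch, assuming ModelMesh Serving is installed in the cluster, an InferenceService can be routed to ModelMesh with a deployment-mode annotation (the name and bucket path are placeholders):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: mnist-svm             # hypothetical service name
  annotations:
    # Opt this service into ModelMesh rather than the default deployment mode
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://your-bucket/sklearn/mnist-svm.joblib  # placeholder path
```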


Model Explainability

Provides ML model inspection and interpretation. KServe integrates Alibi, AI Explainability 360 and Captum to help explain predictions and gauge the confidence of those predictions.
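
For example, here is a sketch of attaching an Alibi explainer alongside a predictor (the name and storage paths are placeholders); requests sent to the service's :explain endpoint then return explanations rather than predictions:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: income-classifier     # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://your-bucket/models/income/model      # placeholder path
  explainer:
    alibi:
      # Alibi anchor explanations for tabular input
      type: AnchorTabular
      storageUri: gs://your-bucket/models/income/explainer  # placeholder path
```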


Model Monitoring

Enables payload logging and outlier, adversarial and drift detection. KServe integrates Alibi Detect, AI Fairness 360 and the Adversarial Robustness Toolbox (ART) to help monitor ML models in production.
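
A sketch of enabling payload logging so a detector can consume request/response payloads (the sink URL is hypothetical, e.g. an Alibi Detect service or a message broker):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # hypothetical service name
spec:
  predictor:
    logger:
      mode: all                            # log both requests and responses
      url: http://payload-sink.default/    # placeholder event sink
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://your-bucket/models/sklearn/iris  # placeholder path
```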

Advanced Deployments


Supports canary rollouts, model experiments/ensembles and feature transformers, including Feast integration as well as custom pre/post-processing.
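
As a sketch, a canary rollout only needs a traffic percentage on the updated predictor; KServe keeps the previous revision serving the remainder (the name and v2 path are placeholders):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # hypothetical service name
spec:
  predictor:
    # Send 10% of traffic to this updated revision;
    # the remaining 90% stays on the last rolled-out revision
    canaryTrafficPercent: 10
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://your-bucket/models/sklearn/iris-v2  # placeholder path
```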

