Control Plane API

Packages:

serving.kserve.io/v1beta1

Package v1beta1 contains API Schema definitions for the serving v1beta1 API group

Resource Types:

AIXExplainerSpec

(Appears on:ExplainerSpec)

AIXExplainerSpec defines the arguments for configuring an AIX Explanation Server

FieldDescription
type
AIXExplainerType

The type of AIX explainer

ExplainerExtensionSpec
ExplainerExtensionSpec

(Members of ExplainerExtensionSpec are embedded into this type.)

Contains fields shared across all explainers

AIXExplainerType (string alias)

(Appears on:AIXExplainerSpec)

ValueDescription

“LimeImages”

ARTExplainerSpec

(Appears on:ExplainerSpec)

ARTExplainerSpec defines the arguments for configuring an ART Explanation Server

FieldDescription
type
ARTExplainerType

The type of ART explainer

ExplainerExtensionSpec
ExplainerExtensionSpec

(Members of ExplainerExtensionSpec are embedded into this type.)

Contains fields shared across all explainers

ARTExplainerType (string alias)

(Appears on:ARTExplainerSpec)

ValueDescription

“SquareAttack”

AlibiExplainerSpec

(Appears on:ExplainerSpec)

AlibiExplainerSpec defines the arguments for configuring an Alibi Explanation Server

FieldDescription
type
AlibiExplainerType

The type of Alibi explainer
Valid values are:
- “AnchorTabular”;
- “AnchorImages”;
- “AnchorText”;
- “Counterfactuals”;
- “Contrastive”;

ExplainerExtensionSpec
ExplainerExtensionSpec

(Members of ExplainerExtensionSpec are embedded into this type.)

Contains fields shared across all explainers

AlibiExplainerType (string alias)

(Appears on:AlibiExplainerSpec)

AlibiExplainerType is the explanation method

ValueDescription

“AnchorImages”

“AnchorTabular”

“AnchorText”

“Contrastive”

“Counterfactuals”

Batcher

(Appears on:ComponentExtensionSpec)

Batcher specifies optional payload batching available for all components

FieldDescription
maxBatchSize
int
(Optional)

Specifies the max number of requests to trigger a batch

maxLatency
int
(Optional)

Specifies the max latency to trigger a batch

timeout
int
(Optional)

Specifies the timeout of a batch
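
As a rough illustration, a batcher block could be written as a plain Python dict before being serialized into a manifest. The values are hypothetical, and `is_valid_batcher` is an illustrative helper rather than part of KServe; the reference above states neither units nor defaults, so none are assumed here.

```python
import json

# Hypothetical values; the reference above states neither units nor defaults.
batcher = {
    "maxBatchSize": 32,  # max number of requests that triggers a batch
    "maxLatency": 500,   # max latency that triggers a batch
    "timeout": 60,       # timeout of a batch
}

BATCHER_FIELDS = {"maxBatchSize", "maxLatency", "timeout"}


def is_valid_batcher(spec):
    """Check that only known Batcher fields are set and that each is an int."""
    return set(spec) <= BATCHER_FIELDS and all(
        isinstance(value, int) and not isinstance(value, bool)
        for value in spec.values()
    )


# The batcher is attached to a component through ComponentExtensionSpec.
predictor_fragment = {"batcher": batcher}
print(json.dumps(predictor_fragment, indent=2))
```

All three fields are optional, so an empty dict also passes the check.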

Component

Component interface is implemented by all specs that contain component implementations, e.g. PredictorSpec, ExplainerSpec, TransformerSpec.

ComponentExtensionSpec

(Appears on:ExplainerSpec, PredictorSpec, TransformerSpec)

ComponentExtensionSpec defines the deployment configuration for a given InferenceService component

FieldDescription
minReplicas
int
(Optional)

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicas
int
(Optional)

Maximum number of replicas for autoscaling.

containerConcurrency
int64
(Optional)

ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).

timeout
int64
(Optional)

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

canaryTrafficPercent
int64
(Optional)

CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision

logger
LoggerSpec
(Optional)

Activate request/response logging and logger configurations

batcher
Batcher
(Optional)

Activate request batching and batching configurations
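
The deployment fields above can be sketched as a dict, together with a small helper showing how a canary percentage divides traffic. This is only an illustration: `traffic_split` is a hypothetical helper, and it assumes the percentage is routed to the candidate revision with the remainder going to the last ready revision.

```python
def traffic_split(canary_traffic_percent):
    """Split 100% of traffic between the candidate and the last ready revision.

    Assumption: the percentage goes to the candidate revision and the
    remainder goes to the last ready revision.
    """
    if not 0 <= canary_traffic_percent <= 100:
        raise ValueError("canaryTrafficPercent must be between 0 and 100")
    return {
        "candidate": canary_traffic_percent,
        "lastReady": 100 - canary_traffic_percent,
    }


# Hypothetical values for the fields of ComponentExtensionSpec.
component_extension = {
    "minReplicas": 0,            # 0 enables scale-to-zero
    "maxReplicas": 3,            # upper bound for autoscaling
    "containerConcurrency": 10,  # hard limit on concurrent requests
    "timeout": 60,               # seconds before a request times out
    "canaryTrafficPercent": 10,
}

split = traffic_split(component_extension["canaryTrafficPercent"])
print(split)
```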

ComponentImplementation

ComponentImplementation interface is implemented by predictor, transformer, and explainer implementations

ComponentStatusSpec

(Appears on:InferenceServiceStatus)

ComponentStatusSpec describes the state of the component

FieldDescription
latestReadyRevision
string
(Optional)

Latest revision name that is in ready state

latestCreatedRevision
string
(Optional)

Latest revision name that is created

previousRolledoutRevision
string
(Optional)

Previous revision name that is rolled out with 100 percent traffic

latestRolledoutRevision
string
(Optional)

Latest revision name that is rolled out with 100 percent traffic

traffic
[]knative.dev/serving/pkg/apis/serving/v1.TrafficTarget
(Optional)

Traffic holds the configured traffic distribution for latest ready revision and previous rolled out revision.

url
knative.dev/pkg/apis.URL
(Optional)

URL holds the url that will distribute traffic over the provided traffic targets. It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}

address
knative.dev/pkg/apis/duck/v1.Addressable
(Optional)

Addressable endpoint for the InferenceService
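
The URL form described above can be sketched with a small formatter. `component_url` and the route name, namespace, and suffix used here are hypothetical and exist only to show how the three parts are joined.

```python
def component_url(route_name, route_namespace, cluster_suffix, https=False):
    """Build a URL of the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}."""
    scheme = "https" if https else "http"
    return f"{scheme}://{route_name}.{route_namespace}.{cluster_suffix}"


# Hypothetical route name, namespace, and cluster-level suffix.
url = component_url("sklearn-iris-predictor", "default", "example.com")
print(url)
```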

ComponentType (string alias)

ComponentType contains the different types of components of the service

ValueDescription

“explainer”

“predictor”

“transformer”

CustomExplainer

CustomExplainer defines arguments for configuring a custom explainer.

FieldDescription
PodSpec
Kubernetes core/v1.PodSpec

(Members of PodSpec are embedded into this type.)

CustomPredictor

CustomPredictor defines arguments for configuring a custom server.

FieldDescription
PodSpec
Kubernetes core/v1.PodSpec

(Members of PodSpec are embedded into this type.)

CustomTransformer

CustomTransformer defines arguments for configuring a custom transformer.

FieldDescription
PodSpec
Kubernetes core/v1.PodSpec

(Members of PodSpec are embedded into this type.)

ExplainerConfig

(Appears on:ExplainersConfig)

FieldDescription
image
string

explainer docker image name

defaultImageVersion
string

default explainer docker image version

ExplainerExtensionSpec

(Appears on:AIXExplainerSpec, ARTExplainerSpec, AlibiExplainerSpec)

ExplainerExtensionSpec defines configuration shared across all explainer frameworks

FieldDescription
storageUri
string

The location of a trained explanation model

runtimeVersion
string

Defaults to latest Explainer Version

config
map[string]string

Inline custom parameter settings for explainer

Container
Kubernetes core/v1.Container

(Members of Container are embedded into this type.)

(Optional)

Container enables overrides for the explainer. Each framework will have different defaults that are populated in the underlying container spec.

ExplainerSpec

(Appears on:InferenceServiceSpec)

ExplainerSpec defines the container spec for a model explanation server. The following fields follow a “1-of” semantic. Users must specify exactly one spec.

FieldDescription
alibi
AlibiExplainerSpec

Spec for alibi explainer

aix
AIXExplainerSpec

Spec for AIX explainer

art
ARTExplainerSpec

Spec for ART explainer

PodSpec
PodSpec

(Members of PodSpec are embedded into this type.)

This spec is dual purpose.
1) Users may provide a full PodSpec for their custom explainer. The field PodSpec.Containers is mutually exclusive with other explainers (e.g. Alibi).
2) Users may provide an explainer (e.g. Alibi) and specify PodSpec overrides in the PodSpec. They must not provide PodSpec.Containers in this case.

ComponentExtensionSpec
ComponentExtensionSpec

(Members of ComponentExtensionSpec are embedded into this type.)

Component extension defines the deployment configurations for explainer
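
The “1-of” rule for explainers can be sketched as a small check over a spec held as a dict. `validate_explainer` is a hypothetical helper, not KServe's own validation, and the storage URI and service account name are placeholders.

```python
EXPLAINER_FRAMEWORKS = ("alibi", "aix", "art")


def chosen_explainers(explainer_spec):
    """List the explainer implementations selected in a spec.

    A custom explainer is counted when the embedded PodSpec has containers.
    """
    chosen = [name for name in EXPLAINER_FRAMEWORKS if name in explainer_spec]
    if explainer_spec.get("containers"):
        chosen.append("custom")
    return chosen


def validate_explainer(explainer_spec):
    """Return the single chosen explainer, or raise if the 1-of rule is broken."""
    chosen = chosen_explainers(explainer_spec)
    if len(chosen) != 1:
        raise ValueError(f"expected exactly one explainer, got {chosen}")
    return chosen[0]


alibi_explainer = {
    "alibi": {
        "type": "AnchorTabular",
        "storageUri": "gs://example-bucket/explainers/income",  # placeholder
    },
    "serviceAccountName": "explainer-sa",  # PodSpec override, placeholder name
    "minReplicas": 1,                      # ComponentExtensionSpec field
}

print(validate_explainer(alibi_explainer))
```

PodSpec overrides such as `serviceAccountName` sit beside the framework key because the members of PodSpec are embedded into ExplainerSpec.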

ExplainersConfig

(Appears on:InferenceServicesConfig)

FieldDescription
alibi
ExplainerConfig
aix
ExplainerConfig
art
ExplainerConfig

InferenceService

InferenceService is the Schema for the InferenceServices API

FieldDescription
metadata
Kubernetes meta/v1.ObjectMeta
Refer to the Kubernetes API documentation for the fields of the metadata field.
spec
InferenceServiceSpec


predictor
PredictorSpec

Predictor defines the model serving spec

explainer
ExplainerSpec
(Optional)

Explainer defines the model explanation service spec. The explainer service calls the predictor, or the transformer if one is specified.

transformer
TransformerSpec
(Optional)

Transformer defines the pre/post processing before and after the predictor call. The transformer service calls the predictor service.

status
InferenceServiceStatus

InferenceServiceSpec

(Appears on:InferenceService)

InferenceServiceSpec is the top level type for this resource

FieldDescription
predictor
PredictorSpec

Predictor defines the model serving spec

explainer
ExplainerSpec
(Optional)

Explainer defines the model explanation service spec. The explainer service calls the predictor, or the transformer if one is specified.

transformer
TransformerSpec
(Optional)

Transformer defines the pre/post processing before and after the predictor call. The transformer service calls the predictor service.
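
A minimal sketch of a complete InferenceService manifest, built as a Python dict, may help tie the three components together. The resource name, namespace, storage URI, container name, and image are placeholders, and `missing_required_fields` is a hypothetical helper that relies only on the fact that predictor is the one field not marked optional.

```python
import json

inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "sklearn-iris", "namespace": "default"},  # placeholders
    "spec": {
        "predictor": {
            "sklearn": {"storageUri": "gs://example-bucket/models/iris"},
        },
        # Optional: a custom transformer given as a full PodSpec.
        "transformer": {
            "containers": [
                {
                    "name": "transformer-container",
                    "image": "example.com/iris-transformer:latest",
                }
            ],
        },
    },
}


def missing_required_fields(manifest):
    """Return the required spec fields that are absent from the manifest."""
    spec = manifest.get("spec", {})
    return [field for field in ("predictor",) if field not in spec]


print(json.dumps(inference_service, indent=2))
```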

InferenceServiceStatus

(Appears on:InferenceService)

InferenceServiceStatus defines the observed state of InferenceService

FieldDescription
Status
knative.dev/pkg/apis/duck/v1.Status

(Members of Status are embedded into this type.)

Conditions for the InferenceService
- PredictorReady: predictor readiness condition;
- TransformerReady: transformer readiness condition;
- ExplainerReady: explainer readiness condition;
- RoutesReady: aggregated routing condition;
- Ready: aggregated condition;

address
knative.dev/pkg/apis/duck/v1.Addressable
(Optional)

Addressable endpoint for the InferenceService

url
knative.dev/pkg/apis.URL
(Optional)

URL holds the url that will distribute traffic over the provided traffic targets. It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}

components
map[kserve.io/v1beta1/pkg/apis/serving/v1beta1.ComponentType]kserve.io/v1beta1/pkg/apis/serving/v1beta1.ComponentStatusSpec

Statuses for the components of the InferenceService
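
Reading the conditions listed above might look like the following sketch. It assumes the embedded Knative Status serializes its conditions as a `conditions` list whose entries carry `type` and `status` strings; the sample status and the helper names are hypothetical.

```python
def condition_status(status, condition_type):
    """Return the status string of the named condition, or None if absent."""
    for condition in status.get("conditions", []):
        if condition.get("type") == condition_type:
            return condition.get("status")
    return None


def is_ready(status):
    """True when the aggregated Ready condition reports "True"."""
    return condition_status(status, "Ready") == "True"


def not_ready_conditions(status):
    """List the condition types whose status is anything other than "True"."""
    return [
        condition["type"]
        for condition in status.get("conditions", [])
        if condition.get("status") != "True"
    ]


# Hypothetical status, as it might appear on an InferenceService.
sample_status = {
    "conditions": [
        {"type": "PredictorReady", "status": "True"},
        {"type": "RoutesReady", "status": "False"},
        {"type": "Ready", "status": "False"},
    ],
}

print(is_ready(sample_status), not_ready_conditions(sample_status))
```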

InferenceServicesConfig

FieldDescription
transformers
TransformersConfig

Transformer configurations

predictors
PredictorsConfig

Predictor configurations

explainers
ExplainersConfig

Explainer configurations

IngressConfig

FieldDescription
ingressGateway
string
ingressService
string
localGateway
string
localGatewayService
string
ingressDomain
string

LightGBMSpec

(Appears on:PredictorSpec)

LightGBMSpec defines arguments for configuring LightGBM model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

LoggerSpec

(Appears on:ComponentExtensionSpec)

LoggerSpec specifies optional payload logging available for all components

FieldDescription
url
string
(Optional)

URL to send logging events

mode
LoggerType
(Optional)

Specifies the scope of the loggers.
Valid values are:
- “all” (default): log both request and response;
- “request”: log only request;
- “response”: log only response

LoggerType (string alias)

(Appears on:LoggerSpec)

LoggerType controls the scope of log publishing

ValueDescription

“all”

Logger mode to log both request and response

“request”

Logger mode to log only request

“response”

Logger mode to log only response
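
The three logger modes can be sketched as a simple predicate. `normalize_logger` and `should_log` are hypothetical helpers, and the sink URL is a placeholder; the only facts taken from the reference are the three mode values and that “all” is the default.

```python
LOGGER_MODES = ("all", "request", "response")


def normalize_logger(logger_spec):
    """Fill in the default mode and reject values outside LoggerType."""
    mode = logger_spec.get("mode", "all")
    if mode not in LOGGER_MODES:
        raise ValueError(f"unknown logger mode: {mode!r}")
    return {**logger_spec, "mode": mode}


def should_log(mode, direction):
    """Decide whether a payload in the given direction is logged.

    direction is either "request" or "response".
    """
    return mode == "all" or mode == direction


logger = normalize_logger({"url": "http://message-dumper.default"})  # placeholder URL
print(logger["mode"], should_log(logger["mode"], "response"))
```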

ONNXRuntimeSpec

(Appears on:PredictorSpec)

ONNXRuntimeSpec defines arguments for configuring ONNX model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

PMMLSpec

(Appears on:PredictorSpec)

PMMLSpec defines arguments for configuring PMML model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

PaddleServerSpec

(Appears on:PredictorSpec)

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

PodSpec

(Appears on:ExplainerSpec, PredictorSpec, TransformerSpec)

PodSpec is a description of a pod.

FieldDescription
volumes
[]Kubernetes core/v1.Volume
(Optional)

List of volumes that can be mounted by containers belonging to the pod. More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainers
[]Kubernetes core/v1.Container

List of initialization containers belonging to the pod. Init containers are executed in order prior to containers being started. If any init container fails, the pod is considered to have failed and is handled according to its restartPolicy. The name for an init container or normal container must be unique among all containers. Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes. The resourceRequirements of an init container are taken into account during scheduling by finding the highest request/limit for each resource type, and then using the max of that value or the sum of the normal containers. Limits are applied to init containers in a similar fashion. Init containers cannot currently be added or removed. Cannot be updated. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containers
[]Kubernetes core/v1.Container

List of containers belonging to the pod. Containers cannot currently be added or removed. There must be at least one container in a Pod. Cannot be updated.

ephemeralContainers
[]Kubernetes core/v1.EphemeralContainer
(Optional)

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing pod to perform user-initiated actions such as debugging. This list cannot be specified when creating a pod, and it cannot be modified by updating the pod spec. In order to add an ephemeral container to an existing pod, use the pod’s ephemeralcontainers subresource. This field is alpha-level and is only honored by servers that enable the EphemeralContainers feature.

restartPolicy
Kubernetes core/v1.RestartPolicy
(Optional)

Restart policy for all containers within the pod. One of Always, OnFailure, Never. Default to Always. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSeconds
int64
(Optional)

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates delete immediately. If this value is nil, the default grace period will be used instead. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. Defaults to 30 seconds.

activeDeadlineSeconds
int64
(Optional)

Optional duration in seconds the pod may be active on the node relative to StartTime before the system will actively try to mark it failed and kill associated containers. Value must be a positive integer.

dnsPolicy
Kubernetes core/v1.DNSPolicy
(Optional)

Set DNS policy for the pod. Defaults to “ClusterFirst”. Valid values are ‘ClusterFirstWithHostNet’, ‘ClusterFirst’, ‘Default’ or ‘None’. DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy. To have DNS options set along with hostNetwork, you have to specify DNS policy explicitly to ‘ClusterFirstWithHostNet’.

nodeSelector
map[string]string
(Optional)

NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node’s labels for the pod to be scheduled on that node. More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

serviceAccountName
string
(Optional)

ServiceAccountName is the name of the ServiceAccount to use to run this pod. More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccount
string
(Optional)

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName. Deprecated: Use serviceAccountName instead.

automountServiceAccountToken
bool
(Optional)

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeName
string
(Optional)

NodeName is a request to schedule this pod onto a specific node. If it is non-empty, the scheduler simply schedules this pod onto that node, assuming that it fits resource requirements.

hostNetwork
bool
(Optional)

Host networking requested for this pod. Use the host’s network namespace. If this option is set, the ports that will be used must be specified. Default to false.

hostPID
bool
(Optional)

Use the host’s pid namespace. Optional: Default to false.

hostIPC
bool
(Optional)

Use the host’s ipc namespace. Optional: Default to false.

shareProcessNamespace
bool
(Optional)

Share a single process namespace between all of the containers in a pod. When this is set containers will be able to view and signal processes from other containers in the same pod, and the first process in each container will not be assigned PID 1. HostPID and ShareProcessNamespace cannot both be set. Optional: Default to false.

securityContext
Kubernetes core/v1.PodSecurityContext
(Optional)

SecurityContext holds pod-level security attributes and common container settings. Optional: Defaults to empty. See type description for default values of each field.

imagePullSecrets
[]Kubernetes core/v1.LocalObjectReference
(Optional)

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored. More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostname
string
(Optional)

Specifies the hostname of the Pod. If not specified, the pod’s hostname will be set to a system-defined value.

subdomain
string
(Optional)

If specified, the fully qualified Pod hostname will be “...svc.”. If not specified, the pod will not have a domainname at all.

affinity
Kubernetes core/v1.Affinity
(Optional)

If specified, the pod’s scheduling constraints

schedulerName
string
(Optional)

If specified, the pod will be dispatched by specified scheduler. If not specified, the pod will be dispatched by default scheduler.

tolerations
[]Kubernetes core/v1.Toleration
(Optional)

If specified, the pod’s tolerations.

hostAliases
[]Kubernetes core/v1.HostAlias
(Optional)

HostAliases is an optional list of hosts and IPs that will be injected into the pod’s hosts file if specified. This is only valid for non-hostNetwork pods.

priorityClassName
string
(Optional)

If specified, indicates the pod’s priority. “system-node-critical” and “system-cluster-critical” are two special keywords which indicate the highest priorities with the former being the highest priority. Any other name must be defined by creating a PriorityClass object with that name. If not specified, the pod priority will be default or zero if there is no default.

priority
int32
(Optional)

The priority value. Various system components use this field to find the priority of the pod. When Priority Admission Controller is enabled, it prevents users from setting this field. The admission controller populates this field from PriorityClassName. The higher the value, the higher the priority.

dnsConfig
Kubernetes core/v1.PodDNSConfig
(Optional)

Specifies the DNS parameters of a pod. Parameters specified here will be merged to the generated DNS configuration based on DNSPolicy.

readinessGates
[]Kubernetes core/v1.PodReadinessGate
(Optional)

If specified, all readiness gates will be evaluated for pod readiness. A pod is ready when all its containers are ready AND all conditions specified in the readiness gates have status equal to “True”. More info: https://git.k8s.io/enhancements/keps/sig-network/0007-pod-ready%2B%2B.md

runtimeClassName
string
(Optional)

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run. If unset or empty, the “legacy” RuntimeClass will be used, which is an implicit class with an empty definition that uses the default runtime handler. More info: https://git.k8s.io/enhancements/keps/sig-node/runtime-class.md This is a beta feature as of Kubernetes v1.14.

enableServiceLinks
bool
(Optional)

EnableServiceLinks indicates whether information about services should be injected into pod’s environment variables, matching the syntax of Docker links. Optional: Defaults to true.

preemptionPolicy
Kubernetes core/v1.PreemptionPolicy
(Optional)

PreemptionPolicy is the Policy for preempting pods with lower priority. One of Never, PreemptLowerPriority. Defaults to PreemptLowerPriority if unset. This field is beta-level, gated by the NonPreemptingPriority feature-gate.

overhead
Kubernetes core/v1.ResourceList
(Optional)

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass. This field will be autopopulated at admission time by the RuntimeClass admission controller. If the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests. The RuntimeClass admission controller will reject Pod create requests which have the overhead already set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero. More info: https://git.k8s.io/enhancements/keps/sig-node/20190226-pod-overhead.md This field is alpha-level as of Kubernetes v1.16, and is only honored by servers that enable the PodOverhead feature.

topologySpreadConstraints
[]Kubernetes core/v1.TopologySpreadConstraint
(Optional)

TopologySpreadConstraints describes how a group of pods ought to spread across topology domains. Scheduler will schedule pods in a way which abides by the constraints. All topologySpreadConstraints are ANDed.

setHostnameAsFQDN
bool
(Optional)

If true the pod’s hostname will be configured as the pod’s FQDN, rather than the leaf name (the default). In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname). In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN. If a pod does not have FQDN, this has no effect. Default to false.

PredictorConfig

(Appears on:PredictorProtocols, PredictorsConfig)

FieldDescription
image
string

predictor docker image name

defaultImageVersion
string

default predictor docker image version on cpu

defaultGpuImageVersion
string

default predictor docker image version on gpu

defaultTimeout
int64

Default timeout of predictor for serving a request, in seconds. The field's JSON tag carries the string option, so the value is serialized as a quoted string.

multiModelServer
bool

Flag to determine if multi-model serving is supported

supportedFrameworks
[]string

frameworks the model agent is able to run
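
Assuming `defaultTimeout` is serialized as a JSON string (the effect of the `string` option in a Go JSON struct tag), a consumer of this configuration would need to convert it back to an integer. The sketch below shows one way to do that; the image name, version, and `load_predictor_config` helper are hypothetical.

```python
import json

# Hypothetical PredictorConfig entry; image name and version are placeholders.
raw_config = """
{
  "image": "example.com/sklearnserver",
  "defaultImageVersion": "v0.0.0",
  "defaultTimeout": "60",
  "multiModelServer": false,
  "supportedFrameworks": ["sklearn"]
}
"""


def load_predictor_config(text):
    """Parse a PredictorConfig and convert the string-encoded timeout to int."""
    config = json.loads(text)
    if "defaultTimeout" in config:
        config["defaultTimeout"] = int(config["defaultTimeout"])
    return config


predictor_config = load_predictor_config(raw_config)
print(predictor_config["defaultTimeout"])
```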

PredictorExtensionSpec

(Appears on:LightGBMSpec, ONNXRuntimeSpec, PMMLSpec, PaddleServerSpec, SKLearnSpec, TFServingSpec, TorchServeSpec, TritonSpec, XGBoostSpec)

PredictorExtensionSpec defines configuration shared across all predictor frameworks

FieldDescription
storageUri
string
(Optional)

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersion
string
(Optional)

Runtime version of the predictor docker image

protocolVersion
github.com/kserve/kserve/pkg/constants.InferenceServiceProtocol
(Optional)

Protocol version the predictor uses (i.e. v1 or v2)

Container
Kubernetes core/v1.Container

(Members of Container are embedded into this type.)

(Optional)

Container enables overrides for the predictor. Each framework will have different defaults that are populated in the underlying container spec.
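
Because the members of Container are embedded, container overrides sit next to `storageUri` and the other extension fields in a framework spec. The sketch below separates the two groups and layers the user's overrides on top of framework defaults. Both helpers, the default values, and the shallow-merge behaviour are assumptions made for illustration; they do not describe how KServe itself applies defaults.

```python
EXTENSION_FIELDS = {"storageUri", "runtimeVersion", "protocolVersion"}


def split_predictor_extension(spec):
    """Separate the extension fields from the embedded Container fields."""
    extension = {k: v for k, v in spec.items() if k in EXTENSION_FIELDS}
    container = {k: v for k, v in spec.items() if k not in EXTENSION_FIELDS}
    return extension, container


def apply_container_defaults(user_container, framework_defaults):
    """Shallow merge: fields set by the user win, the rest fall back to defaults."""
    merged = dict(framework_defaults)
    merged.update(user_container)
    return merged


# Hypothetical framework spec with a resources override.
sklearn_spec = {
    "storageUri": "gs://example-bucket/models/iris",
    "protocolVersion": "v2",
    "resources": {"requests": {"cpu": "100m", "memory": "256Mi"}},
}

# Hypothetical defaults a framework might populate.
framework_defaults = {
    "name": "predictor-container",
    "resources": {"requests": {"cpu": "1", "memory": "2Gi"}},
}

extension, overrides = split_predictor_extension(sklearn_spec)
container = apply_container_defaults(overrides, framework_defaults)
print(extension, container)
```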

PredictorImplementation

PredictorImplementation defines common functions for all predictors, e.g. TensorFlow, Triton, etc.

PredictorProtocols

(Appears on:PredictorsConfig)

FieldDescription
v1
PredictorConfig
v2
PredictorConfig

PredictorSpec

(Appears on:InferenceServiceSpec)

PredictorSpec defines the configuration for a predictor. The following fields follow a “1-of” semantic. Users must specify exactly one spec.

FieldDescription
sklearn
SKLearnSpec

Spec for SKLearn model server

xgboost
XGBoostSpec

Spec for XGBoost model server

tensorflow
TFServingSpec

Spec for TFServing (https://github.com/tensorflow/serving)

pytorch
TorchServeSpec

Spec for TorchServe (https://pytorch.org/serve)

triton
TritonSpec

Spec for Triton Inference Server (https://github.com/triton-inference-server/server)

onnx
ONNXRuntimeSpec

Spec for ONNX runtime (https://github.com/microsoft/onnxruntime)

pmml
PMMLSpec

Spec for PMML (http://dmg.org/pmml/v4-1/GeneralStructure.html)

lightgbm
LightGBMSpec

Spec for LightGBM model server

paddle
PaddleServerSpec

Spec for Paddle model server (https://github.com/PaddlePaddle/Serving)

PodSpec
PodSpec

(Members of PodSpec are embedded into this type.)

This spec is dual purpose.
1) Provide a full PodSpec for a custom predictor. The field PodSpec.Containers is mutually exclusive with other predictors (e.g. TFServing).
2) Provide a predictor (e.g. TFServing) and specify PodSpec overrides. You must not provide PodSpec.Containers in this case.

ComponentExtensionSpec
ComponentExtensionSpec

(Members of ComponentExtensionSpec are embedded into this type.)

Component extension defines the deployment configurations for a predictor
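
The two uses of the embedded PodSpec can be sketched side by side. `predictor_mode` is a hypothetical helper that reports which form a spec takes and rejects specs that mix both; the storage URI, node label, container name, and image are placeholders.

```python
PREDICTOR_FRAMEWORKS = (
    "sklearn", "xgboost", "tensorflow", "pytorch", "triton",
    "onnx", "pmml", "lightgbm", "paddle",
)


def predictor_mode(predictor_spec):
    """Return the framework name or "custom", enforcing the 1-of rule."""
    frameworks = [name for name in PREDICTOR_FRAMEWORKS if name in predictor_spec]
    has_containers = bool(predictor_spec.get("containers"))
    if len(frameworks) + int(has_containers) != 1:
        raise ValueError("specify exactly one predictor implementation")
    return "custom" if has_containers else frameworks[0]


# Form 2: a framework predictor plus PodSpec overrides (no containers).
framework_predictor = {
    "tensorflow": {"storageUri": "gs://example-bucket/models/flowers"},
    "nodeSelector": {"example.com/gpu": "true"},
    "maxReplicas": 2,
}

# Form 1: a full PodSpec for a custom predictor.
custom_predictor = {
    "containers": [
        {"name": "custom-predictor", "image": "example.com/my-model:latest"}
    ],
}

print(predictor_mode(framework_predictor), predictor_mode(custom_predictor))
```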

PredictorsConfig

(Appears on:InferenceServicesConfig)

FieldDescription
tensorflow
PredictorConfig
triton
PredictorConfig
xgboost
PredictorProtocols
sklearn
PredictorProtocols
pytorch
PredictorProtocols
onnx
PredictorConfig
pmml
PredictorConfig
lightgbm
PredictorConfig
paddle
PredictorConfig
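
Some frameworks above map directly to a PredictorConfig, while xgboost, sklearn, and pytorch map to PredictorProtocols with separate v1 and v2 entries. A lookup that handles both shapes might look like this sketch; `resolve_predictor_config`, the default protocol, and the image names are hypothetical.

```python
def resolve_predictor_config(predictors_config, framework, protocol="v1"):
    """Return the PredictorConfig for a framework.

    Entries shaped like PredictorProtocols are indexed by protocol;
    entries shaped like PredictorConfig are returned as they are.
    """
    entry = predictors_config[framework]
    if "v1" in entry or "v2" in entry:
        return entry[protocol]
    return entry


# Hypothetical configuration; image names are placeholders.
predictors_config = {
    "tensorflow": {"image": "example.com/tfserving", "defaultImageVersion": "1.0"},
    "sklearn": {
        "v1": {"image": "example.com/sklearnserver", "defaultImageVersion": "1.0"},
        "v2": {"image": "example.com/sklearnserver-v2", "defaultImageVersion": "1.0"},
    },
}

print(resolve_predictor_config(predictors_config, "sklearn", protocol="v2")["image"])
```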

SKLearnSpec

(Appears on:PredictorSpec)

SKLearnSpec defines arguments for configuring SKLearn model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

TFServingSpec

(Appears on:PredictorSpec)

TFServingSpec defines arguments for configuring TensorFlow model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

TorchServeSpec

(Appears on:PredictorSpec)

TorchServeSpec defines arguments for configuring PyTorch model serving.

FieldDescription
modelClassName
string
(Optional)

When this field is specified KFS chooses the KFServer implementation, otherwise KFS uses the TorchServe implementation

PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

TransformerConfig

(Appears on:TransformersConfig)

FieldDescription
image
string

transformer docker image name

defaultImageVersion
string

default transformer docker image version

TransformerSpec

(Appears on:InferenceServiceSpec)

TransformerSpec defines transformer service for pre/post processing

FieldDescription
PodSpec
PodSpec

(Members of PodSpec are embedded into this type.)

This spec is dual purpose.
1) Provide a full PodSpec for a custom transformer. The field PodSpec.Containers is mutually exclusive with other transformers.
2) Provide a transformer and specify PodSpec overrides. You must not provide PodSpec.Containers in this case.

ComponentExtensionSpec
ComponentExtensionSpec

(Members of ComponentExtensionSpec are embedded into this type.)

Component extension defines the deployment configurations for a transformer
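
The pre/post processing role of a transformer can be illustrated with plain functions. This is only a sketch of the call order (preprocess, call the predictor, postprocess): every function name and payload shape here is hypothetical, and `predict` is a local stand-in for the network call to the predictor service.

```python
def preprocess(payload):
    """Convert raw string rows into numeric instances for the predictor."""
    return {"instances": [[float(x) for x in row] for row in payload["rows"]]}


def predict(request):
    """Stand-in for the call to the predictor service."""
    return {"predictions": [sum(row) for row in request["instances"]]}


def postprocess(response):
    """Turn raw predictions into labels for the caller."""
    return {"labels": ["high" if p > 10 else "low" for p in response["predictions"]]}


def transform_and_predict(payload, call_predictor=predict):
    """Run preprocessing, the predictor call, then postprocessing, in that order."""
    return postprocess(call_predictor(preprocess(payload)))


result = transform_and_predict({"rows": [["1", "2"], ["8", "9"]]})
print(result)
```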

TransformersConfig

(Appears on:InferenceServicesConfig)

FieldDescription
feast
TransformerConfig

TritonSpec

(Appears on:PredictorSpec)

TritonSpec defines arguments for configuring Triton model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors

XGBoostSpec

(Appears on:PredictorSpec)

XGBoostSpec defines arguments for configuring XGBoost model serving.

FieldDescription
PredictorExtensionSpec
PredictorExtensionSpec

(Members of PredictorExtensionSpec are embedded into this type.)

Contains fields shared across all predictors


Generated with gen-crd-api-reference-docs on git commit d3910e0f.