TFJob Common

Reference documentation for TFJob

Packages:

kubeflow.org

Package v1 is the v1 version of the API.

Resource Types:

CleanPodPolicy (string alias)

CleanPodPolicy describes how to deal with pods when the job is finished. Can be one of: All, Running, or None.
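
As an illustration of how the three values might be interpreted, the following minimal Go sketch decides whether a pod is cleaned up once the job has finished. It assumes that All removes every pod, Running removes only pods that are still running, and None leaves all pods in place. The type is redeclared locally so the example compiles on its own, and shouldDeletePod is a hypothetical helper, not part of the operator's API.

```go
package main

import "fmt"

// CleanPodPolicy is a local stand-in for the string alias described above.
type CleanPodPolicy string

const (
	CleanPodPolicyAll     CleanPodPolicy = "All"
	CleanPodPolicyRunning CleanPodPolicy = "Running"
	CleanPodPolicyNone    CleanPodPolicy = "None"
)

// shouldDeletePod is a hypothetical helper. It reports whether a pod would be
// removed after the job finishes, under the assumed meaning of each policy.
func shouldDeletePod(policy CleanPodPolicy, podRunning bool) bool {
	switch policy {
	case CleanPodPolicyAll:
		return true // assumed: every pod is removed
	case CleanPodPolicyRunning:
		return podRunning // assumed: only pods that are still running are removed
	default:
		return false // None (or an unknown value): keep all pods
	}
}

func main() {
	policies := []CleanPodPolicy{CleanPodPolicyAll, CleanPodPolicyRunning, CleanPodPolicyNone}
	for _, p := range policies {
		fmt.Printf("%-8s delete running pod: %t, delete finished pod: %t\n",
			p, shouldDeletePod(p, true), shouldDeletePod(p, false))
	}
}
```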

JobCondition

(Appears on: JobStatus)

JobCondition describes the state of the job at a certain point.

| Field | Type | Description |
| --- | --- | --- |
| type | JobConditionType | Type of job condition. |
| status | Kubernetes core/v1.ConditionStatus | Status of the condition, one of True, False, or Unknown. |
| reason | string | The reason for the condition’s last transition. |
| message | string | A human-readable message indicating details about the transition. |
| lastUpdateTime | Kubernetes meta/v1.Time | The last time this condition was updated. |
| lastTransitionTime | Kubernetes meta/v1.Time | The last time the condition transitioned from one status to another. |
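
The difference between lastUpdateTime and lastTransitionTime can be sketched in Go as follows. This is a minimal illustration, not the controller's code: the types are local stand-ins, updateCondition and runDemo are hypothetical helpers, and the reason strings are made up. It assumes the common Kubernetes convention that lastTransitionTime moves only when the status value changes, while lastUpdateTime moves on every update.

```go
package main

import (
	"fmt"
	"time"
)

// Local stand-ins for the API types, declared so the sketch compiles on its own.
type JobConditionType string
type ConditionStatus string

const (
	JobRunning     JobConditionType = "Running"
	ConditionTrue  ConditionStatus  = "True"
	ConditionFalse ConditionStatus  = "False"
)

type JobCondition struct {
	Type               JobConditionType
	Status             ConditionStatus
	Reason             string
	Message            string
	LastUpdateTime     time.Time
	LastTransitionTime time.Time
}

// updateCondition is a hypothetical helper. It always refreshes LastUpdateTime,
// but moves LastTransitionTime only when the status value actually changes.
func updateCondition(c JobCondition, status ConditionStatus, reason, message string, now time.Time) JobCondition {
	if c.Status != status {
		c.LastTransitionTime = now
	}
	c.Status = status
	c.Reason = reason
	c.Message = message
	c.LastUpdateTime = now
	return c
}

// runDemo refreshes a Running condition without changing its status,
// then flips the status to False. The reason strings are illustrative.
func runDemo() (refreshed, flipped JobCondition) {
	t0 := time.Date(2019, 6, 17, 12, 0, 0, 0, time.UTC)
	created := JobCondition{
		Type: JobRunning, Status: ConditionTrue,
		Reason: "JobRunning", Message: "job is running",
		LastUpdateTime: t0, LastTransitionTime: t0,
	}
	refreshed = updateCondition(created, ConditionTrue, "JobRunning", "job is still running", t0.Add(time.Minute))
	flipped = updateCondition(refreshed, ConditionFalse, "JobSucceeded", "job finished", t0.Add(2*time.Minute))
	return refreshed, flipped
}

func main() {
	refreshed, flipped := runDemo()
	for _, c := range []JobCondition{refreshed, flipped} {
		fmt.Printf("%s=%s updated=%s transitioned=%s\n", c.Type, c.Status,
			c.LastUpdateTime.Format(time.RFC3339), c.LastTransitionTime.Format(time.RFC3339))
	}
}
```

After the refresh, the two timestamps differ because the status stayed True. After the flip to False, they coincide again.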

JobConditionType (string alias)

(Appears on: JobCondition)

JobConditionType defines the possible types of a JobCondition. Can be one of: Created, Running, Restarting, Succeeded, or Failed.

JobStatus

JobStatus represents the current observed state of the training job.

| Field | Type | Description |
| --- | --- | --- |
| conditions | []github.com/kubeflow/tf-operator/pkg/apis/common/v1.JobCondition | An array of current observed job conditions. |
| replicaStatuses | map[github.com/kubeflow/tf-operator/pkg/apis/common/v1.ReplicaType]*github.com/kubeflow/tf-operator/pkg/apis/common/v1.ReplicaStatus | A map from ReplicaType (key) to ReplicaStatus (value), specifying the status of each replica. |
| startTime | Kubernetes meta/v1.Time | Represents the time when the job was acknowledged by the job controller. It is not guaranteed to be set in happens-before order across separate operations. It is represented in RFC3339 form and is in UTC. |
| completionTime | Kubernetes meta/v1.Time | Represents the time when the job was completed. It is not guaranteed to be set in happens-before order across separate operations. It is represented in RFC3339 form and is in UTC. |
| lastReconcileTime | Kubernetes meta/v1.Time | Represents the last time when the job was reconciled. It is not guaranteed to be set in happens-before order across separate operations. It is represented in RFC3339 form and is in UTC. |
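
Because the timestamps are serialized in RFC3339 form in UTC, a client can compute how long a job ran from startTime and completionTime. The Go sketch below shows one way to do that with the standard library only. jobDuration is a hypothetical helper and the timestamp values are made up for the example.

```go
package main

import (
	"fmt"
	"time"
)

// jobDuration is a hypothetical helper. It parses two RFC3339 timestamps, such
// as serialized startTime and completionTime values, and returns the elapsed time.
func jobDuration(startTime, completionTime string) (time.Duration, error) {
	start, err := time.Parse(time.RFC3339, startTime)
	if err != nil {
		return 0, fmt.Errorf("parsing startTime: %w", err)
	}
	end, err := time.Parse(time.RFC3339, completionTime)
	if err != nil {
		return 0, fmt.Errorf("parsing completionTime: %w", err)
	}
	return end.Sub(start), nil
}

func main() {
	// Example values only; real values come from the job's status.
	d, err := jobDuration("2019-06-17T12:00:00Z", "2019-06-17T12:05:30Z")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("job ran for", d)
}
```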

ReplicaSpec

ReplicaSpec is a description of the job replica.

| Field | Type | Description |
| --- | --- | --- |
| replicas | int32 | The desired number of replicas of the given template. If unspecified, defaults to 1. |
| template | Kubernetes core/v1.PodTemplateSpec | Describes the pod that will be created for this replica. Note that RestartPolicy in PodTemplateSpec will be overridden by RestartPolicy in ReplicaSpec. |
| restartPolicy | RestartPolicy | Restart policy for all replicas within the job. One of Always, OnFailure, Never, or ExitCode. Defaults to Never. |
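
The defaults in the ReplicaSpec field descriptions (1 replica, restart policy Never) can be sketched in Go as follows. This is an illustration under stated assumptions, not the operator's defaulting code: the types are trimmed local stand-ins with the pod template omitted, Replicas is modeled as a pointer so that "unspecified" can be distinguished from 0, and applyDefaults is a hypothetical helper.

```go
package main

import "fmt"

// RestartPolicy is a local stand-in for the string alias used by ReplicaSpec.
type RestartPolicy string

const (
	RestartPolicyAlways    RestartPolicy = "Always"
	RestartPolicyOnFailure RestartPolicy = "OnFailure"
	RestartPolicyNever     RestartPolicy = "Never"
	RestartPolicyExitCode  RestartPolicy = "ExitCode"
)

// ReplicaSpec is a trimmed stand-in: the pod template is omitted, and Replicas
// is a pointer (an assumption) so that an unset value differs from 0.
type ReplicaSpec struct {
	Replicas      *int32
	RestartPolicy RestartPolicy
}

// applyDefaults is a hypothetical helper that fills in the defaults stated in
// the ReplicaSpec field descriptions: 1 replica and restart policy Never.
func applyDefaults(spec *ReplicaSpec) {
	if spec.Replicas == nil {
		one := int32(1)
		spec.Replicas = &one
	}
	if spec.RestartPolicy == "" {
		spec.RestartPolicy = RestartPolicyNever
	}
}

func main() {
	spec := ReplicaSpec{}
	applyDefaults(&spec)
	fmt.Printf("replicas=%d restartPolicy=%s\n", *spec.Replicas, spec.RestartPolicy)
}
```

Explicitly set values are left untouched; only missing fields are filled in.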

ReplicaStatus

(Appears on: JobStatus)

ReplicaStatus represents the current observed state of the replica.

| Field | Type | Description |
| --- | --- | --- |
| active | int32 | The number of actively running pods. |
| succeeded | int32 | The number of pods which reached phase Succeeded. |
| failed | int32 | The number of pods which reached phase Failed. |
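
Since JobStatus keeps one ReplicaStatus per ReplicaType, a job-wide summary is a sum over that map. The Go sketch below shows the idea with local stand-in types. totalPods is a hypothetical helper, and the replica type names "Worker" and "PS" are used only as example keys.

```go
package main

import "fmt"

// Local stand-ins for the API types, declared so the sketch compiles on its own.
type ReplicaType string

type ReplicaStatus struct {
	Active    int32
	Succeeded int32
	Failed    int32
}

// totalPods is a hypothetical helper that sums the counters across every
// replica type, skipping nil entries in the map.
func totalPods(statuses map[ReplicaType]*ReplicaStatus) ReplicaStatus {
	var total ReplicaStatus
	for _, s := range statuses {
		if s == nil {
			continue
		}
		total.Active += s.Active
		total.Succeeded += s.Succeeded
		total.Failed += s.Failed
	}
	return total
}

func main() {
	// Example keys and counts only.
	statuses := map[ReplicaType]*ReplicaStatus{
		"Worker": {Active: 2, Succeeded: 1},
		"PS":     {Active: 1, Failed: 1},
	}
	total := totalPods(statuses)
	fmt.Printf("active=%d succeeded=%d failed=%d\n", total.Active, total.Succeeded, total.Failed)
}
```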

ReplicaType (string alias)

ReplicaType represents the type of the job replica. Each operator (e.g. TensorFlow, PyTorch) needs to define its own set of ReplicaTypes.

RestartPolicy (string alias)

(Appears on: ReplicaSpec)

RestartPolicy describes how the replicas should be restarted. Can be one of: Always, OnFailure, Never, or ExitCode. If none of these policies is specified, the default one is RestartPolicyAlways.


Generated with gen-crd-api-reference-docs on git commit fd76deec.

Last modified 17.06.2019: update tfjob, pytorchjob ref scripts, style tables (#805) (affc79c5)

Copyright © 2018-2020 The Kubeflow Authors. Documentation distributed under CC BY 4.0.