Predict on an InferenceService with a saved model on Azure

Using Public Azure Blobs

By default, KServe uses an anonymous client to download artifacts. To point to an Azure Blob, set the `storageUri` to an Azure Blob Storage URL with the format `https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}`,

e.g. `https://modelstoreaccount.blob.core.windows.net/model-store/model.joblib`
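Before wiring a public blob into an `InferenceService`, you can sanity-check that it really is anonymously readable. A quick check using the example URI above (a `200` response with a non-zero `Content-Length` suggests the blob is publicly accessible):

```bash
# HEAD request against the blob; no Azure credentials involved.
curl -I https://modelstoreaccount.blob.core.windows.net/model-store/model.joblib
```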

Using Private Blobs

KServe supports authenticating using an Azure Service Principal.

Create an authorized Azure Service Principal

  • To create an Azure Service Principal, follow the steps here.
  • Assign the SP the Storage Blob Data Owner role on your blob (KServe needs this permission because it lists the contents at the blob path to filter which items to download).
  • Details on assigning storage roles here.
```bash
az ad sp create-for-rbac --name model-store-sp --role "Storage Blob Data Owner" \
  --scopes /subscriptions/2662a931-80ae-46f4-adc7-869c1f2bcabf/resourceGroups/cognitive/providers/Microsoft.Storage/storageAccounts/modelstoreaccount
```
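On success, `az ad sp create-for-rbac` prints a JSON object whose `appId`, `password`, and `tenant` fields map to the `AZ_CLIENT_ID`, `AZ_CLIENT_SECRET`, and `AZ_TENANT_ID` keys used in the secret below. A sketch that captures them into environment variables (assumes `jq` is installed; the `SP_JSON` variable name is illustrative):

```bash
# Create the service principal and keep its JSON output.
SP_JSON=$(az ad sp create-for-rbac --name model-store-sp --role "Storage Blob Data Owner" \
  --scopes /subscriptions/2662a931-80ae-46f4-adc7-869c1f2bcabf/resourceGroups/cognitive/providers/Microsoft.Storage/storageAccounts/modelstoreaccount)

# Extract the fields KServe expects for its Azure credentials.
export AZ_CLIENT_ID=$(echo "$SP_JSON" | jq -r .appId)
export AZ_CLIENT_SECRET=$(echo "$SP_JSON" | jq -r .password)
export AZ_TENANT_ID=$(echo "$SP_JSON" | jq -r .tenant)
export AZ_SUBSCRIPTION_ID=$(az account show --query id -o tsv)
```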

Create Azure Secret and attach to Service Account

Create Azure secret

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: azcreds
type: Opaque
stringData: # use `stringData` for raw credential string or `data` for base64 encoded string
  AZ_CLIENT_ID: xxxxx
  AZ_CLIENT_SECRET: xxxxx
  AZ_SUBSCRIPTION_ID: xxxxx
  AZ_TENANT_ID: xxxxx
```
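If you captured the credentials as environment variables in the sketch above, an equivalent imperative form (instead of applying a manifest) would be:

```bash
# Creates the same azcreds secret directly from the exported variables.
kubectl create secret generic azcreds \
  --from-literal=AZ_CLIENT_ID="$AZ_CLIENT_ID" \
  --from-literal=AZ_CLIENT_SECRET="$AZ_CLIENT_SECRET" \
  --from-literal=AZ_SUBSCRIPTION_ID="$AZ_SUBSCRIPTION_ID" \
  --from-literal=AZ_TENANT_ID="$AZ_TENANT_ID"
```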

Attach secret to a service account

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
secrets:
  - name: azcreds
```

```bash
kubectl apply -f create-azure-secret.yaml
```
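To confirm that both objects exist and the secret is attached to the service account, a quick check (output shapes may vary with your cluster version):

```bash
kubectl get secret azcreds
kubectl get serviceaccount sa -o jsonpath='{.secrets[0].name}'
```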

Deploy the model on Azure with InferenceService

Create the `InferenceService` with the Azure `storageUri` and the service account with the Azure credentials attached.

```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-azure"
spec:
  predictor:
    serviceAccountName: sa
    sklearn:
      storageUri: "https://modelstoreaccount.blob.core.windows.net/model-store/model.joblib"
```

```bash
kubectl apply -f sklearn-azure.yaml
```
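Before sending traffic, it can help to wait until the `InferenceService` reports ready; a sketch (the timeout value is arbitrary):

```bash
# Blocks until the Ready condition is true, or fails after the timeout.
kubectl wait --for=condition=Ready inferenceservice/sklearn-azure --timeout=300s
kubectl get inferenceservice sklearn-azure
```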

Run a prediction

Now the ingress can be accessed at `${INGRESS_HOST}:${INGRESS_PORT}`, or follow these instructions to find out the ingress IP and port.
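If you are running the default Istio ingress gateway, one common way to discover the host and port is below (an assumption; adjust the namespace and service name to match your ingress setup):

```bash
# Resolve the external IP and HTTP port of the Istio ingress gateway.
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
```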

```bash
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-azure -o jsonpath='{.status.url}' | cut -d "/" -f 3)
MODEL_NAME=sklearn-azure
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```
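The contents of `input.json` are not shown in this guide; a plausible payload, assuming the standard KServe scikit-learn iris example (two instances, four features each, which would match the two predictions in the output below):

```json
{
  "instances": [
    [6.8, 2.8, 4.8, 1.4],
    [6.0, 3.4, 4.5, 1.6]
  ]
}
```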

Expected Output

```
*   Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /v1/models/sklearn-azure:predict HTTP/1.1
> Host: sklearn-azure.default.example.com
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Length: 84
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 84 out of 84 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 23
< content-type: application/json; charset=UTF-8
< date: Mon, 20 Sep 2021 04:55:50 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 6
<
* Connection #0 to host localhost left intact
{"predictions": [1, 1]}
```