Jina on Kubernetes

Jina natively supports deploying your Flow and Executors into Kubernetes.

Preliminaries

Please first set up a Kubernetes cluster and configure the cluster access locally.

Tip

For local testing, minikube is recommended.

See also

Here are some managed Kubernetes cluster solutions you could use:

Deploy your Flow

To deploy a Flow on Kubernetes, first generate Kubernetes YAML configuration files from the Jina Flow. Then use the kubectl apply command to create or update the Flow's resources in your cluster.

Caution

Every Executor in the Flow should be referenced with a containerized URI scheme: jinahub+docker://... or docker://....

To generate YAML configurations for Kubernetes from a Jina Flow, call to_k8s_yaml on it:

    flow.to_k8s_yaml('flow_k8s_configuration')

This creates a folder 'flow_k8s_configuration' containing a set of Kubernetes YAML configurations for all the Deployments that compose the Flow.

Examples

Indexing and searching images using CLIP image encoder and PQLiteIndexer

This example shows how to build and deploy a Flow in Kubernetes with CLIPImageEncoder as encoder and PQLiteIndexer as indexer.

    from jina import Flow

    f = Flow(port_expose=8080, protocol='http').add(
        name='encoder', uses='jinahub+docker://CLIPImageEncoder', replicas=2
    ).add(
        name='indexer',
        uses='jinahub+docker://PQLiteIndexer',
        uses_with={'dim': 512},
        shards=2,
    )

Now, we can generate Kubernetes YAML configs from the Flow:

    f.to_k8s_yaml('./k8s_flow', k8s_namespace='custom-namespace')

You should expect the following file structure generated:

    .
    └── k8s_flow
        ├── gateway
        │   └── gateway.yml
        ├── encoder
        │   ├── encoder.yml
        │   └── encoder-head.yml
        └── indexer
            ├── indexer-0.yml
            ├── indexer-1.yml
            └── indexer-head.yml

As you can see, the generated folder contains configurations for the gateway and for each Executor in the Flow.
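Before applying anything, it can help to confirm that the files on disk match the structure shown above. The following is a minimal sketch using only the Python standard library; the helper name list_k8s_configs is hypothetical and not part of Jina, and it assumes the generated files use the .yml extension as in the listing above.

```python
from pathlib import Path


def list_k8s_configs(root):
    """Return the generated .yml files under `root`, relative to it, sorted."""
    root = Path(root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob('*.yml')
    )


if __name__ == '__main__':
    # './k8s_flow' is the output folder used in the example above.
    for config in list_k8s_configs('./k8s_flow'):
        print(config)
```

For the Flow above, one would expect six entries: one for the gateway, two for the encoder and three for the indexer.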

Let's create a Kubernetes namespace for our Flow:

    kubectl create namespace custom-namespace

Now, you can deploy this Flow to your cluster in the following way:

    kubectl apply -R -f ./k8s_flow
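If you prefer to drive these two steps from Python, for instance right after generating the YAML files, the same kubectl calls can be wrapped with subprocess. This is only a sketch: the helper names deploy_commands and deploy are hypothetical, and it assumes kubectl is on your PATH and already configured for the target cluster.

```python
import subprocess


def deploy_commands(config_dir, namespace):
    """Return the two kubectl calls from the steps above, as argument lists."""
    return [
        ['kubectl', 'create', 'namespace', namespace],
        # -R recurses into the per-Deployment subfolders of config_dir.
        ['kubectl', 'apply', '-R', '-f', config_dir],
    ]


def deploy(config_dir, namespace):
    """Run the commands in order; raises CalledProcessError on failure."""
    for command in deploy_commands(config_dir, namespace):
        subprocess.run(command, check=True)
```

Calling deploy('./k8s_flow', 'custom-namespace') would then mirror the shell commands. Note that creating a namespace that already exists makes kubectl exit with an error, so this sketch stops at the first step in that case.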

We can check that the pods were created:

    kubectl get pods -n custom-namespace

    NAME                            READY   STATUS    RESTARTS   AGE
    encoder-8b5575cb9-bh2x8         1/1     Running   0          60m
    encoder-8b5575cb9-gx78g         1/1     Running   0          60m
    encoder-head-55bbb477ff-p2bmk   1/1     Running   0          60m
    gateway-7df8765bd9-xf5tf        1/1     Running   0          60m
    indexer-0-8f676fc9d-4fh52       1/1     Running   0          60m
    indexer-1-55b6cc9dd8-gtpf6      1/1     Running   0          60m
    indexer-head-6fcc679d95-8mrm6   1/1     Running   0          60m

Note that the Jina gateway was deployed with the name gateway-7df8765bd9-xf5tf. The suffix is generated by Kubernetes, so the pod name in your cluster will differ.
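Reading the pod listing by eye works for a handful of pods, but the check can also be scripted. The sketch below parses the plain-text table printed by kubectl get pods, reports whether every pod is ready, and looks up the gateway pod name. The helper names parse_pods, all_ready and find_pod are hypothetical, and only the NAME, READY and STATUS columns are relied upon.

```python
def parse_pods(listing):
    """Parse the text printed by `kubectl get pods` into a list of dicts."""
    lines = [line.split() for line in listing.strip().splitlines()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    # Only the leading columns (NAME, READY, STATUS) are used below.
    return [dict(zip(header, row)) for row in rows]


def all_ready(pods):
    """True if there is at least one pod and every pod is fully ready."""
    def ready(pod):
        up, total = pod['READY'].split('/')
        return pod['STATUS'] == 'Running' and up == total
    return bool(pods) and all(ready(pod) for pod in pods)


def find_pod(pods, prefix):
    """Return the name of the first pod whose name starts with `prefix`."""
    return next(pod['NAME'] for pod in pods if pod['NAME'].startswith(prefix))
```

With the listing above as input, find_pod(pods, 'gateway-') yields the gateway pod name needed for port-forwarding, so it does not have to be copied by hand.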

Once we see that all the Deployments in the Flow are ready, we can start indexing documents.

    import portforward
    from jina import DocumentArray
    from jina.clients import Client

    # Forward local port 8080 to port 8080 of the gateway pod.
    with portforward.forward(
        'custom-namespace', 'gateway-7df8765bd9-xf5tf', 8080, 8080
    ):
        client = Client(host='localhost', port=8080)
        client.show_progress = True
        # Load each image file into the Document's blob before sending it.
        docs = client.post(
            '/index',
            inputs=DocumentArray.from_files('./imgs/*.jpg').apply(
                lambda d: d.load_uri_to_image_blob()
            ),
            return_results=True,
        )
        print(f'Indexed documents: {len(docs)}')

Caution

We strongly recommend deploying each Flow into a separate namespace. In particular, do not deploy a Flow into a namespace where other essential non-Jina services are running. If custom-namespace is already used by another Flow, choose a different k8s_namespace.

Caution

The deployment files dumped by the Flow do not include any Persistent Volume objects by default. You may want to edit the deployment files to add them if needed.

Exposing your Flow

The previous example uses port-forwarding to send documents to the Flow. In real-world applications, you will likely want to expose your service so that users can reach it and you can serve search requests.

Caution

Exposing your Flow only works if the environment of your Kubernetes cluster supports external load balancers.

Once the Flow is deployed, you can expose a service.

    kubectl expose deployment gateway --name=gateway-exposed --type LoadBalancer --port 80 --target-port 8080 -n custom-namespace
    sleep 60 # wait until the external IP is configured

Export the external IP. The client in the next section needs it to send documents to the Flow.

    export EXTERNAL_IP=$(kubectl get service gateway-exposed -n custom-namespace -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')
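A fixed sleep 60 can be too short or needlessly long, depending on how quickly the load balancer is provisioned. One alternative is to poll until an address shows up. The sketch below reuses the jsonpath query from the command above; the helper names get_external_ip and wait_for_external_ip are hypothetical, and it assumes kubectl prints an empty string while the IP is still pending.

```python
import subprocess
import time


def get_external_ip(service='gateway-exposed', namespace='custom-namespace'):
    """Query the load balancer IP with the same jsonpath as the shell command."""
    result = subprocess.run(
        ['kubectl', 'get', 'service', service, '-n', namespace,
         '-o', 'jsonpath={.status.loadBalancer.ingress[0].ip}'],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_external_ip(fetch=get_external_ip, timeout=120.0, interval=5.0):
    """Call `fetch` until it returns a non-empty string or `timeout` passes."""
    deadline = time.monotonic() + timeout
    while True:
        ip = fetch()
        if ip:
            return ip
        if time.monotonic() >= deadline:
            raise TimeoutError('no external IP assigned within the timeout')
        time.sleep(interval)
```

The fetch parameter exists so that the polling loop can be exercised without a cluster, for example by passing a function that returns canned values.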

Client

The client sends an image to the exposed Flow on $EXTERNAL_IP and retrieves the matches returned by the Flow. Finally, it prints the number of matched documents.

    import os

    import requests
    from jina import DocumentArray

    host = os.environ['EXTERNAL_IP']
    port = 80
    url = f'http://{host}:{port}'

    # Use the first image as the query Document.
    docs = DocumentArray.from_files('./imgs/*.jpg').apply(
        lambda d: d.load_uri_to_image_blob()
    )
    doc = docs[0].to_dict()

    resp = requests.post(f'{url}/search', json={'data': [doc]})
    matches = resp.json()['data']['docs'][0]['matches']
    print(f'Matched documents: {len(matches)}')
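To inspect which images were matched rather than only how many, the URIs can be pulled out of the response body. This is a sketch under assumptions: it follows the data, docs, matches path used in the client code, and it assumes that each match carries a uri field and that matches arrive ordered from closest to farthest. The helper name match_uris is hypothetical.

```python
def match_uris(response_json, limit=3):
    """Return the `uri` of up to `limit` matches of the first returned document."""
    matches = response_json['data']['docs'][0]['matches']
    # Assumption: each match is a dict with a 'uri' key; entries among the
    # first `limit` matches that lack it are skipped instead of raising.
    return [match['uri'] for match in matches[:limit] if 'uri' in match]
```

In the client above, this would be called as match_uris(resp.json()) after the search request returns.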

Scaling Executors on Kubernetes

In Jina we support two ways of scaling:

  • Replicas can be used with any Executor type and are typically used for performance and availability.

  • Shards are used for partitioning data and should only be used with Indexers, since they store state.

Check here for more information.

Jina creates a separate Deployment in Kubernetes per Shard and uses Kubernetes native replica scaling to create multiple Replicas per Shard.
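The effect on the generated resources can be illustrated with the example Flow from this page, where the encoder (replicas=2) produced encoder and encoder-head, and the indexer (shards=2) produced indexer-0, indexer-1 and indexer-head. The sketch below reproduces that naming pattern. It is inferred solely from that one example, so treat it as an assumption and not a documented contract; the helper name expected_deployments is hypothetical.

```python
def expected_deployments(name, shards=1):
    """List the Deployment names expected for one Executor.

    Replicas do not appear here: they are additional pods inside a
    Deployment, not additional Deployments.
    """
    if shards > 1:
        workers = [f'{name}-{index}' for index in range(shards)]
    else:
        workers = [name]
    # Assumption: a head Deployment is always present, as in the example.
    return workers + [f'{name}-head']
```

Comparing this list with the output of kubectl get deployments in your namespace is a quick way to spot a shard that was not created.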