Deploy Your ML Model in Microservice Architecture Through Jina

What is Jina?

Jina GitHub

An MLOps framework handles infrastructure complexity such as deployment and service monitoring in a microservice architecture. Jina treats each component of a processing pipeline as a microservice that can be scalable, flexible, and resilient.

An Introduction to the Jina Components

Flow

  • Gateway Service

    The gateway service is launched by the Flow, making it the first and last station between the client and the services.

    • Supported network protocols: gRPC, HTTP, WebSocket, GraphQL

    • Health checks

    • Performance monitoring

  • Orchestrates the Executors into a processing pipeline.
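As a sketch, a minimal Flow configuration wiring a gateway and one Executor together might look like the following (the executor name and path are placeholders for illustration, not part of any generated template):

```yaml
jtype: Flow                      # top-level object: a Flow
gateway:
  protocol: [grpc]               # the gateway speaks gRPC...
  port: [54321]                  # ...on this port
executors:
  - name: MyExecutor             # placeholder Executor name
    uses: executor1/config.yml   # placeholder path to the Executor config
```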

Executor

A self-contained component of the pipeline. Executors transmit data through the Flow as DocumentArrays, which means DocumentArray is the only data structure you need to handle when communicating between the client and each Executor.

Note: Each Executor can be hosted in its own container.

DocumentArray

A DocumentArray is formed from Documents; each Document carries attributes such as id, text, and embedding.

A DocumentArray can be viewed as the interface between the client and the Executors, and even between different kinds of databases.
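Conceptually, a Document is just a record of such attributes, and a DocumentArray is a list-like container over them. A plain-Python sketch of the idea (not the real docarray API, just an illustration):

```python
from dataclasses import dataclass, field
from typing import List, Optional
import uuid


@dataclass
class Document:
    # Mirrors a few of the attributes a real Document carries
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    text: str = ""
    embedding: Optional[List[float]] = None


# A DocumentArray behaves like a list of Documents with
# column-style access such as .texts and .embeddings
docs = [Document(text="hello"), Document(text="world!!")]
texts = [d.text for d in docs]
print(texts)  # ['hello', 'world!!']
```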

Best Practice: Serve the Sentence-Transformer Model with Jina Flow

Step 1. Install Jina

pip install -U jina

Step 2. Create a new project template

jina new try_jina
  • The following project template is created:
.
├── client.py
├── executor1
│   ├── config.yml
│   ├── executor.py
│   └── requirements.txt
└── flow.yml

Step 3. Modify the Executor

  • Implement the Executor that will host the sentence-transformer model in the Jina Flow.
from jina import DocumentArray, Executor, requests
from sentence_transformers import SentenceTransformer


SENTENCE_TRANSFORMER = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"

# Load the model once at startup and switch it to inference mode
model = SentenceTransformer(SENTENCE_TRANSFORMER).eval()


class MyExecutor(Executor):
    @requests
    def encode(self, docs: DocumentArray, **kwargs):
        # Encode the texts of all incoming Documents in one batch
        docs.embeddings = model.encode(docs.texts)

Step 4. Serve the Executor with the Flow

  • Start the flow service
jina flow --uses flow.yml
  • Modify the code in client.py to retrieve sentence embeddings from the served model.
from jina import Client, DocumentArray

if __name__ == '__main__':
    # Connect to the gRPC endpoint exposed by the Flow's gateway
    c = Client(host='grpc://0.0.0.0:54321')

    # Build two empty Documents and fill in their texts
    da = DocumentArray.empty(2)
    da.texts = ["hello", "world!!"]

    # Send the Documents to the default endpoint and print the embeddings
    res_da = c.post('/', da)
    print(res_da.embeddings)
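Once the embeddings come back, a typical next step is comparing them. A minimal cosine-similarity sketch in plain Python (assuming each embedding is a flat list of floats, e.g. the rows of res_da.embeddings):

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# e.g. score = cosine_similarity(res_da.embeddings[0], res_da.embeddings[1])
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```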

Serve the Sentence-Transformer Executor with Jina Flow in a Docker Container

Step 1. Create the Dockerfile

  • Create a Dockerfile under the executor1 directory.
FROM jinaai/jina:3-py37-perf

# make sure the files are copied into the image
COPY . /executor_root/

WORKDIR /executor_root

RUN pip install -r requirements.txt

ENTRYPOINT ["jina", "executor", "--uses", "config.yml"]
  • Modify requirements.txt under the executor1 directory
--extra-index-url https://download.pytorch.org/whl/cpu
torch
sentence-transformers

Step 2. Modify the Flow setting

  • Modify the Executor’s settings in flow.yml
jtype: Flow
version: '1'
gateway:
  protocol: [grpc, http, websocket]
  port: [54321, 54322, 54323]
executors:
  - uses: docker://my_containerized_executor
    name: MyExecutor
    replicas: 1

Step 3. Create Docker Compose File with Jina Flow

  • Create the docker-compose file for the Flow and the Executor
jina export docker-compose flow.yml docker-compose.yml
  • In docker-compose.yml, set a volume that mounts the sentence-transformer model cache from outside the container (to save disk space in the image), and point the build section at the Dockerfile’s location.
version: '3.3'
networks:
  jina-network:
    driver: bridge
services:
  myexecutor:
    image: my_containerized_executor
    build: # Modify here
      context: ./executor1/.
      dockerfile: Dockerfile
    entrypoint:
    - jina
    command:
    - executor
    - --name
    - MyExecutor
    - --extra-search-paths
    - ''
    - --uses
    - config.yml
    - --host
    - 0.0.0.0
    - --port
    - '8081'
    - --port-monitoring
    - '56544'
    - --uses-metas
    - '{}'
    - --native
    - --workspace
    - /app/.cache/jina
    healthcheck:
      test: jina ping executor 127.0.0.1:8081
      interval: 2s
    environment:
    - JINA_LOG_LEVEL=INFO
    volumes: # Modify here
    - ${HOME}/.cache/jina:/app
    - ${HOME}/.cache:/root/.cache
    networks:
    - jina-network
  gateway:
    image: jinaai/jina:3.13.2-py38-standard
    entrypoint:
    - jina
    command:
    - gateway
    - --extra-search-paths
    - ''
    - --expose-endpoints
    - '{}'
    - --protocol
    - GRPC
    - HTTP
    - WEBSOCKET
    - --uses
    - CompositeGateway
    - --graph-description
    - '{"MyExecutor": ["end-gateway"], "start-gateway": ["MyExecutor"]}'
    - --deployments-addresses
    - '{"MyExecutor": ["myexecutor:8081"]}'
    - --port
    - '54321'
    - '54322'
    - '54323'
    - --port-monitoring
    - '61263'
    expose:
    - 54321
    - 54322
    - 54323
    ports:
    - 54321:54321
    - 54322:54322
    - 54323:54323
    healthcheck:
      test: jina ping gateway grpc://127.0.0.1:54321
      interval: 2s
    environment:
    - JINA_LOG_LEVEL=INFO
    networks:
    - jina-network

Step 4. Build and Run the Containers in the Background

docker-compose up -d

Reference

DocumentArray

Containerize