Deploy Models with TensorFlow and Docker in Production

A walkthrough demonstrating the deployment of ML models using TensorFlow Serving with Docker

Problem Faced

After models are trained and ready to deploy in a production environment, a lack of consistency in model deployment and serving workflows can make it difficult to scale your deployments to meet the growing number of ML use cases across your business.

Tensorflow Serving Introduction

TensorFlow Serving provides a flexible, high-performance serving system for machine learning models in production environments. It lets you deploy new algorithms and experiments while keeping the same server architecture and APIs. TensorFlow Serving offers out-of-the-box integration with TensorFlow models, but it can easily be extended to serve other types of models and data.

Prerequisites

Install Docker

General installation instructions are on the Docker site. A quick way to install and verify Docker on a Linux machine is sketched below.
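As a minimal sketch (assuming a Debian/Ubuntu host where Docker's official convenience script from get.docker.com is acceptable), you can install and verify Docker like this:

# Install Docker using the official convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Verify that the Docker daemon is reachable
docker --version
sudo docker run --rm hello-world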

Save Model

import tensorflow as tf

# Load the pre-trained VGG16 model with ImageNet weights
model = tf.keras.applications.vgg16.VGG16(weights="imagenet", include_top=True)

# Freezing the weights
for layer in model.layers:
    layer.trainable = False

# Saving the model in the SavedModel format that TensorFlow Serving expects;
# the trailing "1" is the model version directory
model.save("vgg16/1/")

Here, the base VGG16 model is used as an example for serving. You can confirm what was exported with the quick check below.
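As a sanity check (a minimal sketch using the vgg16/1/ path from above), you can load the SavedModel back and confirm that it exposes the serving_default signature referenced later in the REST request:

import tensorflow as tf

# Load the exported SavedModel and list its serving signatures
loaded = tf.saved_model.load("vgg16/1/")
print(list(loaded.signatures.keys()))  # expected: ['serving_default']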

Serving with Docker

Pulling a serving image

Once you have Docker installed, you can pull the latest TensorFlow Serving docker image by running:

docker pull tensorflow/serving
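To confirm the image is now available locally, you can list it:

docker images tensorflow/serving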

Running a serving image

When you run the serving image, it starts the ModelServer as follows:

tensorflow_model_server --port=8500 --rest_api_port=8501 \
  --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME}

The serving images (both CPU and GPU) have the following properties:

  • Port 8500 exposed for gRPC
  • Port 8501 exposed for the REST API
  • Optional environment variable MODEL_NAME (defaults to model)
  • Optional environment variable MODEL_BASE_PATH (defaults to /models)

Now, run the following command to serve the VGG16 model in detached mode:

sudo docker run -d -p 8501:8501 -p 8500:8500 \
  --mount type=bind,source=/home/Khushiyant/Desktop/Deployment/vgg16,target=/models/vgg16 \
  -e MODEL_NAME=vgg16 -t tensorflow/serving:latest
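Once the container is up, you can check that the model loaded correctly with TensorFlow Serving's model status endpoint (the port and model name match the command above):

curl http://localhost:8501/v1/models/vgg16

A healthy response lists the loaded version with state AVAILABLE.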

NOTE: Trying to run the GPU image on a machine without a GPU, or without a working GPU build of the TensorFlow Model Server, will result in an error.
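If you do have an NVIDIA GPU, a GPU-enabled variant can be run like this (a sketch that assumes the NVIDIA drivers and the NVIDIA Container Toolkit are already installed; everything else mirrors the CPU command above):

docker pull tensorflow/serving:latest-gpu

sudo docker run -d --gpus all -p 8501:8501 -p 8500:8500 \
  --mount type=bind,source=/home/Khushiyant/Desktop/Deployment/vgg16,target=/models/vgg16 \
  -e MODEL_NAME=vgg16 -t tensorflow/serving:latest-gpu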

How to Test the REST API

import json
import time

import cv2
import numpy as np
import requests
import tensorflow as tf

# Load and preprocess the input image
image = cv2.imread("image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (224, 224))
image = np.expand_dims(image, axis=0)

# Create the inference request
start_time = time.time()
url = "http://localhost:8501/v1/models/vgg16:predict"
data = json.dumps({"signature_name": "serving_default", "instances": image.tolist()})
headers = {"content-type": "application/json"}
response = requests.post(url, data=data, headers=headers)
prediction = json.loads(response.text)["predictions"]

# Decode the raw class probabilities into human-readable ImageNet labels
print(tf.keras.applications.vgg16.decode_predictions(np.array(prediction), top=5))
print("Inference time:", time.time() - start_time)

This should print the decoded top-5 predictions:

[[('n02100735', 'English_setter', 0.945853472),
  ('n02101556', 'clumber', 0.018291885),
  ('n02090622', 'borzoi', 0.00679293927),
  ('n02088094', 'Afghan_hound', 0.005323024),
  ('n02006656', 'spoonbill', 0.00221997637)]]
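How to Test the gRPC API

Since port 8500 is also exposed, the same model can be queried over gRPC. The sketch below assumes the tensorflow-serving-api and grpcio packages are installed, and that the SavedModel's input and output tensor keys are input_1 and predictions (the usual names for a Keras VGG16 export; confirm them with saved_model_cli or the signature check shown earlier):

import cv2
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Preprocess the image exactly as in the REST example
image = cv2.imread("image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (224, 224))
image = np.expand_dims(image, axis=0).astype(np.float32)

# Open a gRPC channel to the serving container
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build the Predict request ("input_1" is an assumed input key for Keras VGG16)
request = predict_pb2.PredictRequest()
request.model_spec.name = "vgg16"
request.model_spec.signature_name = "serving_default"
request.inputs["input_1"].CopyFrom(tf.make_tensor_proto(image))

# Send the request and decode the response ("predictions" is the assumed output key)
result = stub.Predict(request, timeout=10.0)
scores = np.array(result.outputs["predictions"].float_val).reshape(1, -1)
print(tf.keras.applications.vgg16.decode_predictions(scores, top=5))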
