Optimizing TensorFlow Models for the DeepLens

In this blog post on ML Cave we will walk through, in detail, the process of optimizing our TensorFlow models with the Intel OpenVINO Toolkit and deploying them onto the AWS DeepLens.

Why we have to optimize the models

To be honest, I don't know the actual reason that we have to optimize the models for the DeepLens. My hypothesis, though, is that it makes model inference faster on lower-end hardware. Outside of faster performance, another goal of the model optimization process would be to execute the model on Intel GPUs, in this case the Intel Gen9. Intel describes its library for this as follows:

Compute Library for Deep Neural Networks (clDNN) is an open source performance library for Deep Learning (DL) applications intended for acceleration of DL Inference on Intel® Processor Graphics – including HD Graphics and Iris® Graphics.

I haven't actually used this library by itself. Instead, I take advantage of the model optimizer through the Intel OpenVINO toolkit, which we will install next.

Setting up the OpenVINO toolkit

I have only installed this toolkit on Linux; hopefully the steps are similar for other platforms.

  1. Go to Download Intel OpenVINO Toolkit and download the software for your operating system. Yes, you have to register to download the software.
  2. Once on the download page, be sure to choose 2018 R3. This version isn't the latest, but I wasn't able to get 2018 R4 to work correctly on the DeepLens: the model kept giving incorrect bounding boxes, and I haven't looked further into the cause. If you do choose to attempt 2018 R4, take note that the DeepLens does not support the new model structure, so you will have to use the command line flag --generate_deprecated_IR_V2.
  3. Install the software package on your local machine.
  4. After installation, navigate to the directory computer_vision_sdk/deployment_tools/model_optimizer/.
  5. Within that directory, execute the command pipenv install --three (more information about pipenv). This will install the requirements listed in requirements.txt.
  6. After that, execute pipenv uninstall protobuf and then pipenv install protobuf. This will get us to the latest version of the protobuf library, 3.6.1, that works with the TensorFlow Object Detection API. If you have trouble, you can download my Pipfile and my Pipfile.lock. After downloading them, put them in the computer_vision_sdk/deployment_tools/model_optimizer/ directory and execute pipenv install.
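To confirm that the reinstall in step 6 picked up a suitable protobuf, a small check like the one below can be run inside the pipenv environment. This is only a sketch: treating 3.6.1 as a minimum is an assumption (the post only says that 3.6.1 is the version that worked), and the helper names parse_version and protobuf_is_new_enough are made up for illustration.

```python
def parse_version(version):
    """Turn a string like '3.6.1' into a tuple of ints, e.g. (3, 6, 1)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def protobuf_is_new_enough(installed, minimum="3.6.1"):
    """Return True when the installed version is at least `minimum`.

    The 3.6.1 floor is an assumption taken from the version named above.
    """
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import google.protobuf
        version = google.protobuf.__version__
        status = "OK" if protobuf_is_new_enough(version) else "too old"
        print("protobuf", version, status)
    except ImportError:
        print("protobuf is not installed in this environment")
```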

Optimizing a Sample model

Next we will export a pre-trained TensorFlow model.

Exporting the Model

  1. Clone the tensorflow/models git repository
  2. Perform the Protobuf Compilation
  3. Navigate to the Intel OpenVINO directory computer_vision_sdk/deployment_tools/model_optimizer/ and execute pipenv shell to enter the Python virtual environment
  4. Navigate back to the tensorflow/models/research/ directory and add the Object Detection API libraries to the PYTHONPATH
  5. Export the trained model by running this script:

    python object_detection/export_inference_graph.py \
    --input_type=${INPUT_TYPE} \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --trained_checkpoint_prefix=${TRAINED_CKPT_PREFIX} \
    --output_directory=${EXPORT_DIR}

    This will give us a frozen_inference_graph.pb file that we will use in the next step.
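If you prefer to drive the export from Python rather than from shell variables, the command can be assembled as an argument list. The sketch below is illustrative only: build_export_command is a hypothetical helper, every path is a placeholder, and image_tensor is just one of the input types the export script accepts. The --output_directory flag tells the script where to write frozen_inference_graph.pb.

```python
import shlex


def build_export_command(input_type, pipeline_config_path,
                         trained_ckpt_prefix, output_directory):
    """Return the export_inference_graph.py invocation as an argument list."""
    return [
        "python",
        "object_detection/export_inference_graph.py",
        "--input_type={}".format(input_type),
        "--pipeline_config_path={}".format(pipeline_config_path),
        "--trained_checkpoint_prefix={}".format(trained_ckpt_prefix),
        "--output_directory={}".format(output_directory),
    ]


if __name__ == "__main__":
    # All paths below are hypothetical placeholders.
    command = build_export_command(
        input_type="image_tensor",
        pipeline_config_path="/path/to/my/pipeline.config",
        trained_ckpt_prefix="/path/to/my/model.ckpt",
        output_directory="/path/to/my/exported_model",
    )
    # Print a copy-pasteable shell command for review.
    print(" ".join(shlex.quote(arg) for arg in command))
    # To actually run it from tensorflow/models/research/:
    #   import subprocess; subprocess.run(command, check=True)
```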

Performing Optimization

Now we will use the Intel OpenVINO toolkit to optimize our exported model.

  1. Make sure you are still in the Python virtual environment that we set up above. If you aren't, navigate to the Intel OpenVINO directory computer_vision_sdk/deployment_tools/model_optimizer/ and execute pipenv shell to enter the Python virtual environment.
  2. Also make sure your shell is in the directory computer_vision_sdk/deployment_tools/model_optimizer/; there should be a mo_tf.py file in there.
  3. Execute the command

    python mo_tf.py \
    --log_level=DEBUG \
    --input_model /path/to/my/frozen_inference_graph.pb \
    --tensorflow_use_custom_operations_config extensions/front/tf/ssd_v2_support.json \
    --tensorflow_object_detection_api_pipeline_config /path/to/my/pipeline.config \
    --reverse_input_channels

    Make sure to update the paths for input_model and tensorflow_object_detection_api_pipeline_config. You can find more details about the options for the script in Intel's OpenVINO Documentation. For the Object Detection API the important ones are tensorflow_use_custom_operations_config and tensorflow_object_detection_api_pipeline_config.
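The same invocation can be assembled in Python, which makes it easy to switch on the --generate_deprecated_IR_V2 flag mentioned earlier for 2018 R4. This is a sketch under assumptions: build_mo_command is a hypothetical helper, the paths are placeholders, and it only uses the flags already shown in this post.

```python
def build_mo_command(frozen_graph, pipeline_config, deprecated_ir_v2=False):
    """Return the mo_tf.py invocation as an argument list.

    Intended to be run from the model_optimizer/ directory, where
    mo_tf.py and extensions/front/tf/ssd_v2_support.json live.
    """
    command = [
        "python", "mo_tf.py",
        "--log_level=DEBUG",
        "--input_model", frozen_graph,
        "--tensorflow_use_custom_operations_config",
        "extensions/front/tf/ssd_v2_support.json",
        "--tensorflow_object_detection_api_pipeline_config", pipeline_config,
        "--reverse_input_channels",
    ]
    if deprecated_ir_v2:
        # Only needed when attempting 2018 R4, per the note in the setup steps.
        command.append("--generate_deprecated_IR_V2")
    return command


if __name__ == "__main__":
    # Placeholder paths; replace with your own.
    for arg in build_mo_command("/path/to/my/frozen_inference_graph.pb",
                                "/path/to/my/pipeline.config"):
        print(arg)
```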

If all went to plan, you should be presented with three files (a bin, an xml and a mapping file) that represent the optimized model.
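A quick way to confirm that the three files were produced is to look for them by name. The sketch below assumes the outputs share the base name frozen_inference_graph and use the extensions .bin, .xml and .mapping; the helper missing_ir_files is hypothetical, and the demo runs against a throwaway directory rather than a real optimizer output.

```python
import os
import tempfile

# Assumed extensions of the optimizer's three output files.
EXPECTED_EXTENSIONS = (".bin", ".xml", ".mapping")


def missing_ir_files(output_dir, model_name="frozen_inference_graph"):
    """Return the expected output file names that are absent from output_dir."""
    missing = []
    for ext in EXPECTED_EXTENSIONS:
        file_name = model_name + ext
        if not os.path.isfile(os.path.join(output_dir, file_name)):
            missing.append(file_name)
    return missing


if __name__ == "__main__":
    # Demo: a temporary directory that contains only the .xml file.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "frozen_inference_graph.xml"), "w").close()
        for name in missing_ir_files(tmp):
            print("missing:", name)
```

An empty list means all three files are present in the directory you pass in.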

Loading onto the DeepLens

The last step is to load the optimized model onto the DeepLens.

  1. Log into the AWS Console and navigate to your S3 buckets. Navigate to your DeepLens bucket; if you don't already have one, just create a new bucket whose name starts with the prefix deeplens. Create a new folder called openvino_optimized_model and upload the files frozen_inference_graph.bin and frozen_inference_graph.xml that we created previously. These two files represent the optimized model that we will load onto the DeepLens.
  2. Once the files are in S3, the next step is to create a new DeepLens model. Click on Import Model and choose Externally trained model. For the Model artifact path, choose the S3 path that you used in step 1 above; it will look something like s3://deeplens-mybucket/openvino_optimized_model. Enter a Model name, choose a Model framework (TensorFlow is okay), and choose Import model.
  3. Create a project that uses the model you defined in step 2. The model that I have uploaded to BitBucket was from my Counting Vehicles project, so you can experiment with that as an option.
  4. Deploy the project you created. The DeepLens will download the contents of the Model artifact path to the directory /opt/awscam/artifacts, so there should be files named /opt/awscam/artifacts/frozen_inference_graph.bin and /opt/awscam/artifacts/frozen_inference_graph.xml, essentially the contents of the S3 folder.
  5. Now, to use those pre-optimized models in your Lambda function, you can use something like:

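    import awscam  # DeepLens device library, only available on the device

    print('Loading optimized model')
    model_path = '/opt/awscam/artifacts/frozen_inference_graph.xml'
    # Load the pre-optimized model and run it on the Intel GPU
    model = awscam.Model(model_path, {'GPU': 1})
    print('Done loading optimized model')
    # model_type (e.g. 'ssd') and frame_resize (the resized camera frame) are defined elsewhere in the Lambda function
    parsed_inference_results = model.parseResult(model_type, model.doInference(frame_resize))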
  6. The DeepLens also has a built-in model tester tool, available if you SSH into the device. If you cd /opt/awscam/test/ (as root), you can see three files: model_test.py, quick_cam.py and test.sh. test.sh is a sample shell script that runs model_test.py on a model that came with the device. If you run that script as-is (with no project deployed to the DeepLens), it will report the average FPS at which the camera can capture content. For my device the average was 15 FPS, so no matter what, I cannot run inference faster than 15 frames per second. If you log into the Ubuntu desktop with a keyboard, mouse and monitor, you can run a more in-depth test of the actual model inference. Once logged into Ubuntu, open a Terminal, navigate to the same directory /opt/awscam/test and execute the script test.sh again. This time it will also measure the inference performance of a sample model. In my case I was able to get 38 FPS for model inference only, so here the model inference is faster than the camera capture. This tool can also be used to test your own models by executing the Python script model_test.py. Here is a sample execution:

    python model_test.py --mjpeg-pipe /opt/awscam/out/ch2_out.mjpeg --artifact-file /opt/awscam/artifacts/frozen_inference_graph.xml --test-name vehicle-detector --iter-num 200 --thresh-num 75 --img-width 429 --img-height 240

    This will test our optimized model and tell us how quickly the Intel GPU can run inference. For my vehicle detector model the average was 12 FPS, so it is running a little slower than the camera.
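To make the relationship between the camera and inference numbers concrete, here is a small arithmetic sketch. It assumes that capture and inference overlap, so that the slower stage bounds the end-to-end rate; the function names are hypothetical, and the inputs are the figures reported above (15 FPS camera, 38 FPS sample model, 12 FPS vehicle detector).

```python
def effective_fps(camera_fps, inference_fps):
    """Return the end-to-end rate, bounded by the slower of the two stages.

    Assumes capture and inference run concurrently (an assumption, not
    something measured in this post).
    """
    return min(camera_fps, inference_fps)


def frame_time_ms(fps):
    """Convert a rate in frames per second to milliseconds per frame."""
    return 1000.0 / fps


if __name__ == "__main__":
    camera = 15
    print(effective_fps(camera, 38))  # sample model: camera-bound, 15
    print(effective_fps(camera, 12))  # vehicle detector: inference-bound, 12
    print(round(frame_time_ms(12), 1), "ms per frame for the vehicle detector")
```

Under that assumption, the sample model is limited by the camera, while the vehicle detector is limited by inference.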


With some work and tinkering we are able to optimize our TensorFlow models before deploying them to the DeepLens device. The advantage is that we can expand our usage of TensorFlow as the Intel OpenVINO toolkit is updated to support more model topologies, one example being TensorFlow's Object Detection API. By default, the software on the DeepLens is unable to optimize this model. However, we are able to optimize it on our own computers and rely on the DeepLens only to load the optimized model. Uploading our optimized model to S3 and loading it properly within Lambda gives us more flexibility and a better deployment strategy than just copying the optimized files over manually. We also saw how we can test our models on the device itself, giving us performance numbers for both the camera and the model inference.

Thank you for reading.




