Counting Vehicles - Collect Training Data - Part 2

In Part 2 we will work together to collect and label our training data.

As with any machine learning algorithm, having training data that represents the problem you are trying to solve is paramount. With deep learning, having many examples (thousands to millions) is important so that the neural network can properly learn the problem and generalize. In our case, since we will be building our model on top of existing models with a technique called transfer learning, we will only need a couple hundred examples to get good enough results.

Collecting Training Data

To collect the training data for our task we will set up the DeepLens with an initial Lambda function that uses a simple motion detection algorithm based on OpenCV. We will combine the motion detection techniques described on the PyImageSearch blog with Amazon's guide to setting up an AWS DeepLens inference Lambda function.
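The core idea behind this kind of motion detection is frame differencing: compare the current frame with a reference frame and flag the pixels whose brightness changed by more than a threshold. Below is a minimal, dependency-free sketch of that idea on tiny grayscale frames stored as nested lists. It is not the code from the repository; the function name detect_motion and the parameters pixel_threshold and min_changed are made up for illustration. On the device the same steps would run on real camera frames with OpenCV (for example cv2.absdiff, cv2.threshold and cv2.findContours).

```python
def detect_motion(previous, current, pixel_threshold=25, min_changed=4):
    """Return a bounding box (x_min, y_min, x_max, y_max) around the pixels
    that changed between two grayscale frames, or None if too few changed.

    Hypothetical helper for illustration only.
    """
    changed = [
        (x, y)
        for y, (prev_row, cur_row) in enumerate(zip(previous, current))
        for x, (prev_px, cur_px) in enumerate(zip(prev_row, cur_row))
        if abs(prev_px - cur_px) > pixel_threshold
    ]
    # Ignore tiny differences such as sensor noise
    if len(changed) < min_changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# An empty 8x6 "street" and the same scene with a bright 3x2 "vehicle" in it
background = [[10] * 8 for _ in range(6)]
frame = [row[:] for row in background]
for y in range(2, 4):
    for x in range(3, 6):
        frame[y][x] = 200

no_motion = detect_motion(background, background)
motion_box = detect_motion(background, frame)
print(no_motion)   # None
print(motion_box)  # (3, 2, 5, 3)
```

The returned box is what we would draw on the frame before showing it on the local display, and a non-None result is the trigger for uploading the frame.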

The Lambda Function

Our Lambda function is responsible for detecting motion in the scene and uploading the captured frame to an S3 bucket. The complete code is available at my deeplens-count-vehicles repository. Within an infinite loop it continually reads frames from the DeepLens through ret, frame = awscam.getLastFrame(). It then passes each frame to the motion detector, which is based on the one from the PyImageSearch blog. If motion is detected, the code hands the frame to the ImageUpload thread. We use a thread for uploading so that network IO doesn't delay image capture. When deployed to my DeepLens I get a fairly consistent rate of 10 frames per second.

There is also another thread, LocalDisplay. This is code from Amazon that lets us inspect the image processing as it happens by passing a modified frame into local_display.set_frame_data(motion_frame). We use it here to show bounding boxes where we detect motion. If you have a monitor connected to the DeepLens device you can view the output in mplayer with the command mplayer -demuxer lavf -lavfdopts format=mjpeg:probesize=32 -fs /tmp/results.mjpeg.
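To show why a separate upload thread keeps the capture loop fast, here is a small sketch of the pattern using only the standard library. It is an illustration under assumptions, not the ImageUpload class from the repository: the class name UploadWorker, its methods, and the fake_upload stand-in are all hypothetical. In a real function the upload callable would wrap an S3 client call.

```python
import queue
import threading


class UploadWorker(threading.Thread):
    """Hypothetical background uploader: the capture loop only enqueues
    frames, and this thread performs the slow network call."""

    def __init__(self, upload_fn):
        super().__init__(daemon=True)
        self._upload_fn = upload_fn
        self._frames = queue.Queue()

    def submit(self, key, data):
        # Returns immediately, so frame capture is never blocked by network IO
        self._frames.put((key, data))

    def run(self):
        while True:
            item = self._frames.get()
            if item is None:  # sentinel that asks the thread to finish
                break
            key, data = item
            try:
                self._upload_fn(key, data)
            except Exception as error:
                # A failed upload should not stop the worker
                print("upload failed for", key, error)

    def stop(self):
        self._frames.put(None)
        self.join()


uploaded = {}


def fake_upload(key, data):
    # Stand-in for a real S3 upload, used only for this demonstration
    uploaded[key] = data


worker = UploadWorker(fake_upload)
worker.start()
worker.submit("motion/frame-1.jpg", b"jpeg bytes 1")
worker.submit("motion/frame-2.jpg", b"jpeg bytes 2")
worker.stop()
print(sorted(uploaded))  # ['motion/frame-1.jpg', 'motion/frame-2.jpg']
```

The queue decouples the two rates: the loop can keep reading frames at camera speed while uploads drain at whatever speed the network allows.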

Vehicle Driving by with a Bounding Box mplayer showing the output of the LocalDisplay thread with a bounding box drawn around a vehicle

Deploying to the DeepLens

To begin, if you haven't already set up your DeepLens, do it now by following Amazon's guide to setting up your DeepLens. Once the device is set up, deploy at least one project to it, both to make sure everything works and because we will need a model when we set up our motion detector. Amazon provides an overview of sample projects. After the sample project is working correctly we can move on to the motion detector project.


Before setting up the Lambda function, create an S3 bucket whose name starts with deeplens. This is the bucket that we will write images to, and later on it will serve as the bucket when we perform inference. We also need to update the AWSDeepLensGreengrassGroupRole IAM role to give the DeepLens write access to the S3 bucket; Amazon describes the different roles in their documentation. Edit the IAM role and attach a new policy with the following contents:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DeepLensS3WriteBuckets",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": ["arn:aws:s3:::deeplens*/*"]
            }
        ]
    }

This policy document allows the DeepLens to call PutObject on any S3 bucket whose name starts with deeplens. In other words, we will be able to upload our captured images.

Setting up the Lambda function

The easiest way that I found to set up the Lambda function is to follow Amazon's tutorial on creating and publishing an AWS DeepLens inference Lambda function. After the Lambda function is set up there are three things we need to do:

  1. Copy the code for our motion detector into the function; get a copy of it at my deeplens-count-vehicles repository
  2. Add an environment variable named MOTION_DETECTION_BUCKET and set its value to the S3 bucket created earlier
  3. Publish a new version of the Lambda function - the published Lambda function will be the one we choose when creating a new DeepLens project

Deploy the DeepLens project

The next step is to create and deploy the DeepLens project. Create a new blank project and fill in the required fields. It doesn't matter which model is chosen, since we won't be using one. If there isn't a model listed, the simplest way to get one is to create a sample project. For the function, choose the Lambda function that we created above. Lastly, click create and be sure to deploy the project to your DeepLens device. If everything went well, then as the DeepLens detects motion the images will be uploaded to the S3 bucket specified by the MOTION_DETECTION_BUCKET environment variable on the Lambda function. The images are stored under the motion directory in a path corresponding to the date and time of the capture. For example, here is the location of my captures: s3://deeplens-anthony-vehicles/motion/2018/11/21/15/43. A sample capture is below.
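A date-and-time layout like the one above can be produced with a single strftime call. The sketch below is only an illustration: the helper name capture_key, the file name pattern, and the fallback bucket name are assumptions, and the repository may build its keys differently.

```python
import os
from datetime import datetime


def capture_key(captured_at, prefix="motion"):
    """Build an object key of the form motion/YYYY/MM/DD/HH/MM/<file>.jpg.

    Hypothetical helper; the seconds_microseconds file name is an assumption.
    """
    folder = captured_at.strftime("%Y/%m/%d/%H/%M")
    filename = captured_at.strftime("%S_%f") + ".jpg"
    return "{}/{}/{}".format(prefix, folder, filename)


# The bucket comes from the environment variable configured on the Lambda
# function; the fallback value here is only a placeholder for local runs.
bucket = os.environ.get("MOTION_DETECTION_BUCKET", "deeplens-example-bucket")

key = capture_key(datetime(2018, 11, 21, 15, 43, 7, 250000))
print(key)  # motion/2018/11/21/15/43/07_250000.jpg
print("s3://{}/{}".format(bucket, key))
```

Grouping captures by minute keeps each S3 "directory" small, which makes it easier to browse a busy stretch of the day or to delete a batch of false detections.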

AWS DeepLens Captured Frame Captured frame of a Police car driving by

DeepLens Tips and Tricks

If you have trouble getting images to show up, you can check the Lambda logs. Since the Lambda function executes locally on the device, we need to extract the logs from the device itself. Amazon provides a list of the locations of DeepLens project logs; the important one for us is located in /opt/awscam/greengrass/ggc/var/log/user.

It is also possible to edit the Lambda function directly on the device in order to make quick changes. Be sure to save them back into the Lambda function within the AWS console so that your changes are not overwritten by subsequent project deploys. The code is located within the directory /opt/awscam/greengrass/ggc/deployment/lambda/ on the device. After making changes and saving the files, the Lambda can be restarted by executing the command systemctl restart greengrassd.service --no-block (requires root).

Labeling the Training Data

Once the data is collected the next step is to label it correctly. To do this we will use the tool LabelImg, which allows for drawing bounding boxes around objects and specifying the label for each bounding box. The output format is the same as Pascal VOC, which will help us later when we need to convert it into a format for the TensorFlow object detection API. In each image we will label the 5 vehicle types (car, SUV, van, truck and motorcycle) going left (up the street) or right (down the street), which gives 10 classes. The 11th class is for people walking on the street (if you happen to capture any). To copy all of the images to your local computer you can use the awscli utility, more specifically the s3 sync command.
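To make the label set and the annotation format concrete, here is a short sketch that builds the 11 class names and reads the boxes out of one Pascal VOC style annotation. The exact label strings, the sample XML, and the helper name read_boxes are assumptions for illustration; use whatever strings you actually typed into the labeling tool.

```python
import xml.etree.ElementTree as ET

VEHICLES = ["Car", "SUV", "Van", "Truck", "Motorcycle"]
DIRECTIONS = ["Left", "Right"]
# 5 vehicle types x 2 directions + Person = 11 classes
CLASSES = ["{} {}".format(v, d) for v in VEHICLES for d in DIRECTIONS] + ["Person"]

# Hand-written sample in the Pascal VOC layout (values are made up)
SAMPLE_ANNOTATION = """
<annotation>
  <filename>capture.jpg</filename>
  <size><width>858</width><height>480</height><depth>3</depth></size>
  <object>
    <name>SUV Right</name>
    <bndbox><xmin>120</xmin><ymin>200</ymin><xmax>430</xmax><ymax>380</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        if label not in CLASSES:
            # Catch typos in labels early, before training
            raise ValueError("unknown label: {}".format(label))
        box = obj.find("bndbox")
        coords = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, coords))
    return boxes


boxes = read_boxes(SAMPLE_ANNOTATION)
print(len(CLASSES))  # 11
print(boxes)         # [('SUV Right', (120, 200, 430, 380))]
```

Validating labels against a fixed class list is a cheap way to catch a misspelled label before it silently becomes a twelfth class.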

LabelIMG Example SUV Right labeled within LabelIMG

Training Data Stats

As part of this project I labeled the images and came up with the following distribution

Label Name          Amount
Car Right              247
Car Left               359
Truck Right             78
Truck Left             121
SUV Right              318
SUV Left               428
Van Left                55
Van Right               56
Motorcycle Left         10
Motorcycle Right         0
Person                  26
Total                 1698
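A tally like the one above can be produced by counting the object names across all of the annotation files. The sketch below assumes one Pascal VOC XML file per image in a single directory; the helper name count_labels and the demo files it writes to a temporary directory are hypothetical.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import Counter


def count_labels(annotation_dir):
    """Count how often each label appears across all *.xml files."""
    counts = Counter()
    for path in glob.glob(os.path.join(annotation_dir, "*.xml")):
        root = ET.parse(path).getroot()
        counts.update(obj.findtext("name") for obj in root.iter("object"))
    return counts


# Demo with two tiny made-up annotation files
with tempfile.TemporaryDirectory() as tmp:
    samples = {"a.xml": ["Car Left", "SUV Right"], "b.xml": ["Car Left"]}
    for filename, labels in samples.items():
        objects = "".join(
            "<object><name>{}</name></object>".format(label) for label in labels
        )
        with open(os.path.join(tmp, filename), "w") as handle:
            handle.write("<annotation>{}</annotation>".format(objects))
    counts = count_labels(tmp)

for label, amount in counts.most_common():
    print(label, amount)
print("Total", sum(counts.values()))
```

Running a count like this after each labeling session makes it easy to spot classes that still have few or no examples.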

Head over to Part 3 to find out how to set up the TensorFlow training.






