Setting up Docker and TensorFlow for Windows 7 / Windows 10 Home

Installing Docker

  • Download and run the docker-toolbox installer here.
  • Reboot into your BIOS/UEFI settings and enable Virtualization
  • Run ‘Docker Quickstart shell’. Once the VM starts up, run the hello-world image to ensure Docker is working properly
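The hello-world test is simply:

    docker run hello-world

If Docker prints its “Hello from Docker!” greeting, the installation is working.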

Installing TensorFlow

  • In the existing Docker shell, pull the TensorFlow Docker image:
  • Test-run the Docker TensorFlow image (sample commands for both steps are shown below):
  • Copy the URL with your Jupyter login token from the Docker Quickstart shell and go to it in your web browser
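The pull and test-run commands are most likely along these lines (the exact image tag used by the course may differ):

    # download the TensorFlow image
    docker pull tensorflow/tensorflow

    # run it, exposing the Jupyter notebook server on port 8888
    docker run -it -p 8888:8888 tensorflow/tensorflow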

If you were able to access the page, Docker and TensorFlow have been installed correctly.

Getting the TensorFlow Tutorials

Note: For this tutorial, we are cloning the deepcars repo to the root of our user directory. You can put it anywhere you like, but the rest of the tutorial will assume it is located directly under your user directory (e.g. C:\Users\<your_username>\deepcars).

  • Clone the GitHub repo https://github.com/lexfridman/deepcars to your user directory
  • In the Docker Quickstart shell, run the TensorFlow Docker image and mount the notebooks (sample commands below).
  • In your browser, navigate to the URL provided by Docker
  • Ensure that the notebooks for the tutorials are available (you should see ‘1_python_perceptron.ipynb’ as the first notebook).
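With Docker Toolbox, C:\Users is shared into the VM as /c/Users, so the clone and run steps probably look roughly like this (the /notebooks mount point is an assumption based on the stock TensorFlow image, which serves that directory with Jupyter):

    cd /c/Users/<your_username>
    git clone https://github.com/lexfridman/deepcars

    docker run -it -p 8888:8888 -v /c/Users/<your_username>/deepcars:/notebooks tensorflow/tensorflow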

Congratulations!  If you were able to access the deepcars Notebooks from within your browser, everything should be working!

Note: We recommend adding the command to run the Docker image and mount the notebooks to a script for easy execution. Simply open Notepad and paste in the lines shown below.

Save the script as ‘start-tensorflow.sh’ in the root of your user directory and run the script within the Docker shell
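A minimal start-tensorflow.sh matching the steps above would be something like:

    #!/bin/bash
    # launch the TensorFlow image with the deepcars notebooks mounted
    docker run -it -p 8888:8888 -v /c/Users/<your_username>/deepcars:/notebooks tensorflow/tensorflow

It can then be started from the Docker Quickstart shell with ‘bash start-tensorflow.sh’.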

Installing OpenCV to the TensorFlow Docker Image

This tutorial will walk you through installing OpenCV into an existing TensorFlow Docker image. OpenCV is a library that provides C/C++, Python, and Java interfaces for computer vision applications. Primarily, we will be using OpenCV to read in images for training and testing networks with TensorFlow.

Download the Required Files and Install OpenCV to Your Docker Image

  • Pull the new version of the deepcars repo from https://github.com/lexfridman/deepcars (if you downloaded the repo as a zip previously, replace your old ‘deepcars-master’ directory with the new one from the zip)
  • Download the Dockerfile for installing OpenCV here (make sure the file saves with no extension; if your browser appends ‘.txt’ to the file, please delete the extension)
  • Open PowerShell (Windows) or a terminal (Mac OS X/Linux) and navigate to the directory where you saved ‘Dockerfile’
  • Rebuild the Docker image with OpenCV and save the image as ‘deepcars’ (sample command below)
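The rebuild step is a standard docker build run from the directory containing the Dockerfile; the tag matches the image name used in the rest of the tutorial:

    docker build -t deepcars .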

Updating Your Docker Script

  • Open the script you created in the previous tutorial for starting the TensorFlow docker image in a text editor.
  • Change the line that runs the Docker image so that it launches the newly built ‘deepcars’ image instead of the original TensorFlow image (an example of the change is shown below), or, if you are not using a script, execute the updated line from now on to launch your Docker image
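For illustration, the change amounts to swapping the image name at the end of the run command, e.g. from

    docker run -it -p 8888:8888 -v <path-to-deepcars>:/notebooks tensorflow/tensorflow

to

    docker run -it -p 8888:8888 -v <path-to-deepcars>:/notebooks deepcars

(the mount path stays whatever you used in the earlier sections).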

Now, when you navigate to the URL given to you by Docker, you should have an additional notebook titled ‘5_tensorflow_traffic_light_classification.ipynb’ that can be run with OpenCV support.

Setting Up Docker and TensorFlow for Linux

Installing Docker

  • Follow the instructions here for your distro
  • Open a terminal and run the Docker hello-world image
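As on the other platforms, the test is:

    docker run hello-world

(If your user is not in the docker group, you may need to prefix Docker commands with sudo.)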

Installing TensorFlow

  • Open a terminal
  • Pull the TensorFlow Docker image:
  • Test-run the Docker TensorFlow image (sample commands for both steps are shown below):
  • Copy the URL with your Jupyter login token from the terminal and go to it in your web browser
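The commands are the same as in the Windows section above, most likely:

    docker pull tensorflow/tensorflow
    docker run -it -p 8888:8888 tensorflow/tensorflow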

If you were able to access the page, Docker and TensorFlow have been installed correctly.

Getting the TensorFlow Tutorials

Note: For this tutorial, we are cloning the deepcars repo to our home directory. You can put it anywhere you like, but the rest of the tutorial will assume it is located at ~/deepcars.

  • Clone the GitHub repo https://github.com/lexfridman/deepcars
  • Open a terminal
  • Run the TensorFlow Docker image and mount the notebooks (sample command below).
  • In your browser, navigate to the URL provided by Docker inside your terminal
  • Ensure that the notebooks for the tutorials are available (you should see ‘1_python_perceptron.ipynb’ as the first notebook).
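Assuming the repo lives at ~/deepcars, the clone and run commands look roughly like this (again, the /notebooks mount point is an assumption based on the stock TensorFlow image):

    git clone https://github.com/lexfridman/deepcars ~/deepcars
    docker run -it -p 8888:8888 -v ~/deepcars:/notebooks tensorflow/tensorflow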

Congratulations!  If you were able to access the deepcars Notebooks from within your browser, everything should be working!

Note: We recommend adding the command to run the Docker image and mount the notebooks to a script for easy execution. Simply open your favorite text editor, paste in the lines shown below, save the script as ‘start-tensorflow.sh’, make it executable, and then run it.
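A minimal version of the script (the mount path assumes the ~/deepcars location from the note above):

    #!/bin/bash
    # launch the TensorFlow image with the deepcars notebooks mounted
    docker run -it -p 8888:8888 -v ~/deepcars:/notebooks tensorflow/tensorflow

Make it executable and run it with:

    chmod +x start-tensorflow.sh
    ./start-tensorflow.sh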

 

 

Setting Up Docker and TensorFlow for Mac OS X

Installing Docker

  • Download the Docker installer here.
  • Mount ‘Docker.dmg’
  • Copy Docker.app to your Applications directory
  • Double-click Docker.app and wait for Docker to finish starting up
  • Open a terminal and run the Docker hello-world image (‘docker run hello-world’, as in the earlier sections)

Installing TensorFlow

  • Open a terminal
  • Pull the TensorFlow Docker image:
  • Test-run the Docker TensorFlow image (the pull and run commands are the same as in the Linux section above):
  • Copy the URL with your Jupyter login token from the terminal and go to it in your web browser

If you were able to access the page, Docker and TensorFlow have been installed correctly.

Getting the TensorFlow Tutorials

Note: For this tutorial, we are cloning the deepcars repo to our home directory. You can put it anywhere you like, but the rest of the tutorial will assume it is located at ~/deepcars.

  • Clone the GitHub repo https://github.com/lexfridman/deepcars
  • Enable sharing of the drive you cloned the deepcars repo to in Docker
    • Right-click the Docker icon at the top of your screen.
    • Click ‘Settings’
    • Go to ‘File Sharing’ and add your home directory to the list of shared directories.
    • Click ‘Apply and Restart’
  • Open a terminal
  • Run the TensorFlow Docker image and mount the notebooks (sample command below).
  • In your browser, navigate to the URL provided by Docker in the terminal
  • Ensure that the notebooks for the tutorials are available (you should see ‘1_python_perceptron.ipynb’ as the first notebook).
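The run command mirrors the Linux one; assuming the repo is at ~/deepcars (and with the /notebooks mount point assumed from the stock TensorFlow image):

    docker run -it -p 8888:8888 -v ~/deepcars:/notebooks tensorflow/tensorflow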

Congratulations!  If you were able to access the deepcars Notebooks from within your browser, everything should be working!

Note: We recommend adding the command to run the Docker image and mount the notebooks to a script for easy execution. Simply open your favorite text editor, paste in the same lines shown in the Linux section above, save the script as ‘start-tensorflow.sh’, make it executable, and then run it in the same way.

 

 

Setting up Docker and TensorFlow for Windows 10 Professional

Installing Docker

  • Download the Docker installer here.
  • Run ‘InstallDocker.msi’
  • Launch Docker when the installer finishes
  • If Docker warns you about Hyper-V not being enabled, allow Docker to enable Hyper-V and automatically restart your machine
  • Open PowerShell or ‘cmd.exe’ and run the Docker hello-world image (‘docker run hello-world’) to ensure Docker is working properly

Installing TensorFlow

  • Open PowerShell
  • Pull the TensorFlow Docker image:
  • Test-run the Docker TensorFlow image (the pull and run commands are the same as in the sections above):
  • Copy the URL with your Jupyter login token from PowerShell and go to it in your web browser

If you were able to access the page, Docker and TensorFlow have been installed correctly.

Getting the TensorFlow Tutorials

Note: For this tutorial, we are cloning the deepcars repo to the root of our C: drive. You can put it anywhere you like, but the rest of the tutorial will assume it is located at C:\deepcars.

  • Clone the GitHub repo https://github.com/lexfridman/deepcars
  • Enable sharing of the drive you cloned the deepcars repo to in Docker
    • Right-click the Docker system tray icon.
    • Click ‘Settings’
    • Go to ‘Shared Drives’ and check the box for the drive deepcars is located on.
    • Click ‘Apply’
  • Open PowerShell
  • Run the TensorFlow Docker image and mount the notebooks (sample command below).
  • In your browser, navigate to the URL provided by Docker inside PowerShell
  • Ensure that the notebooks for the tutorials are available (you should see ‘1_python_perceptron.ipynb’ as the first notebook).
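With the repo at C:\deepcars, the run command from PowerShell probably looks like this (the /notebooks mount point is an assumption based on the stock TensorFlow image; forward slashes in the host path tend to be the least error-prone in PowerShell):

    docker run -it -p 8888:8888 -v C:/deepcars:/notebooks tensorflow/tensorflow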

Congratulations!  If you were able to access the deepcars Notebooks from within your browser, everything should be working!

Note: We recommend adding the command to run the Docker image and mount the notebooks to a script for easy execution. Simply open Notepad and paste in the line shown below.

Save the script as ‘start-tensorflow.PS1’, then right-click the file and click ‘Run with PowerShell’ to start the TensorFlow Docker image.
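A one-line start-tensorflow.PS1 matching the steps above would be something like:

    # launch the TensorFlow image with the deepcars notebooks mounted
    docker run -it -p 8888:8888 -v C:/deepcars:/notebooks tensorflow/tensorflow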

DeepTraffic

DeepTraffic is a gamified simulation of typical highway traffic. Your task is to build a neural agent – more specifically design and train a neural network that performs well on high traffic roads. Your neural network gets to control one of the cars (displayed in red) and has to learn how to navigate efficiently to go as fast as possible. The car already comes with a safety system, so you don’t have to worry about the basic task of driving – the net only has to tell the car if it should accelerate/slow down or change lanes, and it will do so if that is possible without crashing into other cars.

Overview

The page consists of three different areas: on the left you can find a real-time simulation of the road, with different display options, using the current state of the net. On the upper half of the right side of the page there is a coding area where you can change the design of the neural network, and below that you can find some information about the state of the neural network as well as buttons to train and test it.

The simulation area shows some basic information like the current speed of the car and the number of cars that have been passed since you opened the site. It also allows you to change the way the simulation is displayed.

The simulation uses frames as an internal measure of time – so neither a slow computer nor a slow net influences the result. The Simulation Speed setting lets you control how the simulation is displayed to you – using the Normal setting, the simulation tries to draw the frames matching real time, so it waits if the actual calculation is going faster, while Fast displays frames as soon as they are finished, which may be much faster.

Internally the whole game runs on a grid system. You can see it if you change the Road Overlay to Full Map:

For each car, the grid cells below it are filled with the car’s speed; empty cells are filled with a high value to symbolize the potential for speed.

Your car gets a car-centric cutout of that map to use as an input to the neural network. You can have a look at it by changing the Road Overlay to Learning Input:

The following variables control the size of the input the net gets – a larger input area provides more information about the traffic situation, but it also makes it harder to learn the relevant parts and may require longer learning times. (You should definitely change the input size from the one in the starting sample – that one makes the car essentially blind.)
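In the code box these sizes are set by a few variables near the top of the script; the names below follow the usual DeepTraffic starter code, and the values are only illustrative:

    // size of the learning-input cutout around the car
    lanesSide = 1;       // lanes visible on each side of the car
    patchesAhead = 10;   // grid patches visible ahead of the car
    patchesBehind = 0;   // grid patches visible behind the car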

The basic algorithm that powers all the other cars, and that is also the basis of yours, is called the Safety System. You can have a look at it by switching the Road Overlay:

The highlighted cells tell you what it is looking at; if they are red, the Safety System currently blocks going in that direction. The front-facing part of the Safety System makes the car slow down to avoid hitting obstacles. Lane switching is disabled while there is any other car in the highlighted area, or while you are already in the process of switching lanes. The checked area increases depending on how fast you are trying to go – so just flooring it is not always a good idea.

The agent is controlled by a function called learn that receives the current state (provided as a flattened array of the defined learning input cutout) and a reward for the last step (in this case the average speed in mph), and has to return one of a small set of discrete actions: do nothing, accelerate, slow down, or change lanes to the left or right.

The most basic learn function that simply tells the agent to hold its speed and lane would look like:
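A minimal sketch of such a learn function (the action string here is an assumption based on typical DeepTraffic code; check the starter sample for the exact values):

    // do-nothing agent: ignore state and reward, keep the current speed and lane
    function learn(state, lastReward) {
        return 'noAction';  // other actions cover accelerating, braking, and changing lanes
    }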

For the competition you are supposed to use a neural network to control the car – the learn function to make this happen is already provided in the initial code sample and can stay the same; you are of course free to do your own data preprocessing before feeding the state to the net, but don’t spend too much time on it – most (if not all) of the improvements should come from adapting the net (and you can get a fairly decent speed without doing any preprocessing at all – well beyond the required minimum to pass the course).

Training and Evaluation

To train the neural network you have to press the Run Training button:

This will start training the neural network by running the simulation in a separate thread at about 30 times real-time speed, applying the trained net back to the visible simulation from time to time, so you should be able to see immediate improvements (only if your net layout is any good, of course).

The site also provides an evaluation button that is going to run exactly the same evaluation we are using for the competition.

The evaluation run also happens on a separate thread, simulating 10 runs of about 30 minutes each. For each run it computes the average speed, and the final score will be the median speed of the 10 runs.

You have to keep in mind that your local evaluation only gives you an estimate of the actual score, as there is some random component involved in how the other cars behave. The relevant score is the one we compute. (And we will also look at your code to see if there is any kind of cheating involved, which would get you banned – so don’t even try.)

You can find your best speed on your profile page and, if you are really good, in the top 10 leaderboard.

Designing the Neural Network

To change the default neural network layout we provide (which is intentionally set up to perform badly), you have to change the code in the code box on the website.

The apply code button runs the code to create the newly defined neural network (watch out: you will lose the training state you had before).

And the save and load buttons allow you to save your code and the trained net state to your machine and load it back afterwards. Save regularly!

Looking at the Code

The first block defines the most basic settings – for larger inputs you should probably increase the number of train iterations. Looking ahead a few patches, and at least one lane to the side, is probably a good idea as well.

The next block specifies some more details about the input – you don’t need to touch that part (except maybe the temporal window).

The net is defined with an array of layers, starting with the input layer, which you don’t have to change:

We added one basic hidden layer with just one neuron to show you how to do that – you should definitely change that:

And in the end there is the final regression layer that decides on the action, which probably is fine as it is:

There are a lot more options for the Q-Learning part – details on them can be found in the comments of the code at the following link: https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js. These are mostly interesting for more advanced optimisations of your net.

And the last step is creating the brain.
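Putting the pieces described above together, the code box typically contains something along these lines (layer sizes and option values here are illustrative, not the starter values):

    // num_inputs, num_actions and temporal_window come from the settings blocks described earlier
    var layer_defs = [];
    layer_defs.push({type: 'input', out_sx: 1, out_sy: 1, out_depth: num_inputs});
    layer_defs.push({type: 'fc', num_neurons: 30, activation: 'relu'});  // the hidden layer(s) to experiment with
    layer_defs.push({type: 'regression', num_neurons: num_actions});     // final layer that scores the actions

    // deep Q-learning options (see the deepqlearn.js link above for the full list)
    var opt = {};
    opt.temporal_window = temporal_window;
    opt.layer_defs = layer_defs;

    // the last step: creating the brain used by the learn function
    brain = new deepqlearn.Brain(num_inputs, num_actions, opt);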

Submission

To submit your neural network for evaluation press the submit button:

Make sure you run training and a local evaluation first and only submit if you are happy with the performance. You can submit multiple times and we will take your best result, but doing it too often is not a good idea: submission adds your net to the back of our evaluation queue, but you can only have one spot there, so if you resubmit before evaluation is done, you get bumped to the back again. To see the results of our evaluation go to your profile page.

If you are officially registered for this class, you need to perform better than 65 mph to get credit for this assignment.

References:

ConvNetJS

Deep Q-Learning using ConvNetJS

DeepTesla: End-to-End Learning from Human and Autopilot Driving

End-to-end steering describes the driving-related AI task of producing a steering wheel value given an image. To this end we developed DeepTesla, a simple neural network demonstration of both end-to-end driving and in-the-browser neural network training.

Site Overview

This page is meant to give students a simple demonstration of using convolutional neural networks for end-to-end steering. Readers are expected to have a basic understanding of neural networks. When a user loads the page, an end-to-end steering model begins training: downloading packages of images/steering wheel values, doing a forward and backward pass, and visualizing the individual layers of the network.

Below we can see an example of what the demo looks like while training a model.

At the very top of the page you’ll see an area which contains some metrics about our network:

  • Forward/backward pass (ms): the amount of time it took the network to perform a forward or backward pass on a single example.  This metric is important for both training and evaluation.
  • Total examples seen / unique: the number of examples the network has trained on in total, as well as the number of unique examples
  • Network status: the current operation the network is performing, either training, or fetching data.

In the metrics box is also the loss graph. This is a live graph that updates after every 250 training examples. The X-axis shows the number of examples seen by the currently loaded network, and the Y-axis is the loss function value over the last 250 examples. Ideally we should see this decrease over time.

Immediately below the metrics box, you’ll see the editor. This is how the user interacts with the network: by providing layer organization/type and parameters, as well as parameters to the training algorithm. The input to this editor must be a single valid JSON object containing two keys: “network” and “trainer”.

After editing the parameters, you’ll want to reload the network and begin training it.  To do so, click the “Restart Training” button in the lower-left corner of the editor.  This will send the JSON to our training web worker, and it will be parsed and loaded as a ConvNetJS model.

Below the editor is another area, where the visualization takes place.  Upon first opening the page, you’ll see the layer-by-layer visualization.  You’ll see images for each layer in the network showing the activations for an arbitrary training example.  While the network is training, you’ll notice the activations will change – representing the network learning new features.

The very first layer will always be the input layer (and the last layer will always be a single neuron).  For each input example, we show the actual steering angle for that example as well as the currently predicted steering angle.

Depending on the layer type, you’ll see different types of visualization.  For convolutional layers, we create canvas objects containing the actual activations at each neuron – these will look remarkably similar to the input image.  We only visualize the filters that produced the activations if they are bigger than 3×3.

There is one more type of visualization: video validation.  To get there, find the button in the lower-right corner of the network editor.

Upon clicking the video visualization button, the layer visualization will be replaced by a video clip from a Tesla driving.  The currently loaded network evaluates each frame of video and makes a prediction about the steering wheel angle.  The network continues training while the video plays – you can see the network become more and more accurate as more training examples are seen (if your network is working).  When the video finishes, it starts over again.

At the bottom of the video, we visualize some information about the current performance of the model. On the far left, we see the actual value of the steering wheel in blue, the predicted value in white, and the difference between them in red. To the right, we draw two steering wheels representing the actual and predicted values for the current frame. To the right of the steering wheels, we see the current frame number, the forward pass time in milliseconds, and the average error (total difference between actual and predicted values divided by the total number of frames evaluated).

You’ll see a green box around the lower third of the video player.  This box shows the portion of the image being used as input to the network for evaluation (the coordinates and size of this box cannot be changed).

On the far right of the video information box, we see rapidly changing black/white bars. This is a simple 17-bit sign-magnitude barcode, a hack we use to ensure accuracy in determining the frame number and wheel value for each video frame; the barcodes are encoded into the video itself.

Training an End-to-End model

Now let’s look at how we can improve our model. When you first load DeepTesla, your model editor will contain the default network and trainer definition.

The default network has an input layer of size (200, 66, 3) – representing a width of 200, a height of 66, and 3 channels (red-green-blue). Following that, there are three convolutional layers, a pooling layer, and a single output neuron.
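A sketch of the kind of JSON this corresponds to (the layer parameters and trainer settings below are illustrative, not the site’s exact defaults):

    {
      "network": [
        {"type": "input", "out_sx": 200, "out_sy": 66, "out_depth": 3},
        {"type": "conv", "sx": 5, "filters": 16, "stride": 2, "activation": "relu"},
        {"type": "conv", "sx": 5, "filters": 20, "stride": 2, "activation": "relu"},
        {"type": "conv", "sx": 5, "filters": 24, "stride": 2, "activation": "relu"},
        {"type": "pool", "sx": 2, "stride": 2},
        {"type": "regression", "num_neurons": 1}
      ],
      "trainer": {"method": "adadelta", "batch_size": 8, "l2_decay": 0.001}
    }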

For example, we may want to decrease the stride parameter of our convolutional layers – right now it is two, which means every filter takes a two-pixel step. If we decrease that parameter to one, our filters will pass over more of the image. However, this also means our convolutional layers will take more time for each pass. We can offset this by adding additional pooling after each layer.

Below our “network” key, we see that we also supply a training algorithm, along with some parameters to that algorithm. ConvNetJS provides several training algorithms: Adadelta, Adagrad, and standard SGD. Each of these algorithms takes specific parameters as JSON key/value pairs.

More specifics about the available algorithms and parameters can be found in the ConvNetJS documentation ( http://cs.stanford.edu/people/karpathy/convnetjs/docs.html ).

After editing our network/trainer, we can begin training it by pressing the “Restart Training” button.

After allowing our network to train for 5 minutes and evaluating its performance on our test video, we decide that we want to submit our network. Press the “Submit Network” button to complete the assignment.

Additional Info/How it Works

Training

In ConvNetJS there are two important constructs: the network/trainer object, and the Volume object.  Each network in ConvNetJS is specified by a JSON object containing a list of layers.

To construct the images used for training, we use OpenCV. We iterate over each frame of our video examples, extract and crop the frame, pair it with a synchronized wheel value, and push it onto a list – one for training, one for validation. After we do this for all examples, we shuffle our examples and create an image that contains batches of 250 images – one image, flattened, on each row.

For the wheel values, we keep track of the synchronized values and create a JSON object containing the frame ID and the ground truth.

First, we need to load the image into the browser:

When the image has finished loading, we blit it onto a canvas, which is how we can extract the RGB values for that image.

To keep the page responsive, we use multiple threads (via Web Workers) – in our main thread we perform visualization and respond to user input, and in another thread we perform training.  Thus, the final call in our dimg.onload callback is to postMessage, which sends the image batch and wheel values to our training thread.
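A condensed sketch of that loading path; dimg is the only name taken from the description above, and everything else (the worker variable, the batch URL, the message shape) is illustrative:

    // load one training batch image, read its pixels, and hand them to the training worker
    var dimg = new Image();
    dimg.onload = function () {
        var canvas = document.createElement('canvas');
        canvas.width = dimg.width;
        canvas.height = dimg.height;
        var ctx = canvas.getContext('2d');
        ctx.drawImage(dimg, 0, 0);                                        // blit the image onto the canvas
        var pixels = ctx.getImageData(0, 0, canvas.width, canvas.height); // RGBA pixel data
        trainWorker.postMessage({image: pixels.data, wheel: wheelValues}); // send the batch to the training thread
    };
    dimg.src = 'batch_0.png';  // hypothetical batch image URL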

Now that we have our image data, we need to transform it into a ConvNetJS volume – the basic unit of data representation in ConvNetJS.  In the code snippet below, we create a ConvNetJS volume of size (base_input_x, base_input_y, 3) and copy our image data to the volume.

Image data from a canvas is stored as a 1D array with four values for each pixel: red, green, blue, alpha.  Because our volume only contains 3 channels, we have to transform each RGBA value to RGB and set the appropriate value in our volume.
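A sketch of that conversion (a reconstruction, not the site’s exact code); base_input_x and base_input_y are the input dimensions mentioned above, and image_data_to_volume is the helper name used later on this page:

    // copy RGBA canvas data into a 3-channel ConvNetJS volume
    function image_data_to_volume(data) {
        var vol = new convnetjs.Vol(base_input_x, base_input_y, 3, 0.0);
        for (var y = 0; y < base_input_y; y++) {
            for (var x = 0; x < base_input_x; x++) {
                var p = (y * base_input_x + x) * 4;   // 4 bytes (R, G, B, A) per canvas pixel
                vol.set(x, y, 0, data[p]);            // red
                vol.set(x, y, 1, data[p + 1]);        // green
                vol.set(x, y, 2, data[p + 2]);        // blue (the alpha byte is dropped)
            }
        }
        return vol;
    }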

Video Evaluation

Video evaluation is more complex and requires a modern, non-mobile browser that supports HTML5 video.

First, we load a hidden video element:

Next, we create a JavaScript function that runs another function on each repaint of the browser:

In our video_to_canvas function we copy the currently shown video frame to a canvas element.

Finally, we call the same image_data_to_volume function that we used on our training images.
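A sketch of that evaluation loop; video_to_canvas and image_data_to_volume are the names used above, while the element id and the commented-out prediction step are illustrative:

    var video = document.getElementById('validation-video');  // the hidden <video> element
    video.play();

    function video_to_canvas() {
        var canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        var ctx = canvas.getContext('2d');
        ctx.drawImage(video, 0, 0);                            // copy the currently shown video frame
        var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
        var vol = image_data_to_volume(frame.data);            // same conversion as for training images
        // ...run the network forward on vol and draw the predicted steering wheel angle...
        window.requestAnimationFrame(video_to_canvas);         // run again on the next repaint
    }
    window.requestAnimationFrame(video_to_canvas);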

To decode our barcodes, we iterate over each individual “bar” and calculate a pixel value average. If the bar falls below a threshold, we decode it as a “0”; otherwise we decode it as a “1”.

Resources

DeepTesla: http://selfdrivingcars.mit.edu/deepteslajs/

ConvNetJS: http://cs.stanford.edu/people/karpathy/convnetjs/