Author Archives: bjenikmit-edu

DeepTraffic

DeepTraffic is a gamified simulation of typical highway traffic. Your task is to build a neural agent – more specifically, to design and train a neural network that performs well on high-traffic roads. Your neural network gets to control one of the cars (displayed in red) and has to learn how to navigate efficiently to go as fast as possible. The car already comes with a safety system, so you don’t have to worry about the basic task of driving. The net only has to tell the car whether it should accelerate, slow down or change lanes, and the car will do so if that is possible without crashing into other cars.

Overview

The page consists of three different areas. On the left you can find a real-time simulation of the road, with different display options, using the current state of the net. On the upper half of the right side of the page there is a coding area where you can change the design of the neural network. Below that you can find some information about the state of the neural network, as well as buttons to train and test it.

The simulation area shows some basic information like the current speed of the car and the number of cars that have been passed since you opened the site. It also allows you to change the way the simulation is displayed.

The simulation uses frames as an internal measure of time, so neither a slow computer nor a slow net influences the result. The Simulation Speed setting lets you control how the simulation is displayed to you. With the Normal setting the simulation tries to draw the frames in real time, so it waits if the actual calculation is going faster. The Fast setting displays frames as soon as they are finished, which may be much faster.

Internally the whole game runs on a grid system. You can see it if you change the Road Overlay to Full Map:

For each car, the grid cells below it are filled with the car’s speed. Empty cells are filled with a high value to symbolize the potential for speed.

Your car gets a car-centric cutout of that map to use as an input to the neural network. You can have a look at it by changing the Road Overlay to Learning Input:

The following variables control the size of the input the net gets. A larger input area provides more information about the traffic situation, but it also makes it harder to learn the relevant parts and may require longer learning times. (You should definitely change the input size from the one we have in the starting sample, though – that one makes the car essentially blind.)
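To get a feeling for how quickly the input grows, here is a minimal sketch. It assumes the three variables are called lanesSide, patchesAhead and patchesBehind and start at the values shown (check the code box for the real names and values); cutoutSize is a hypothetical helper that is not part of the game.

```javascript
// Assumed names and starting values. In the code box the game may expect
// these to be assigned to already existing variables instead of declared.
var lanesSide = 0;      // lanes visible to each side of the car
var patchesAhead = 1;   // grid cells visible in front of the car
var patchesBehind = 0;  // grid cells visible behind the car

// Hypothetical helper: size of the car-centric cutout in grid cells.
function cutoutSize(side, ahead, behind) {
    var lanes = side * 2 + 1;      // own lane plus the lanes on both sides
    var patches = ahead + behind;  // cells along the direction of travel
    return { lanes: lanes, patches: patches, cells: lanes * patches };
}

console.log(cutoutSize(lanesSide, patchesAhead, patchesBehind).cells); // 1
console.log(cutoutSize(1, 10, 2).cells);                               // 36
```

With the assumed starting values the net sees a single cell, which is why the car is essentially blind. One lane to each side and a dozen cells along the road already gives 36 inputs.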

The basic algorithm that powers all the other cars, and also forms the basis of yours, is called the Safety System. You can have a look at it by switching the Road Overlay:

The highlighted cells tell you what it is looking at. If they are red, the Safety System currently blocks going in that direction. The front-facing part of the Safety System makes the car slow down to avoid hitting obstacles. Lane switching is disabled while there is any other car in the highlighted area, and while you are already in the process of switching lanes. The checked area grows with the speed you are trying to reach, so just flooring it is not always a good idea.

The agent is controlled by a function called learn that receives the current state (provided as a flattened array of the defined learning input cutout), a reward for the last step (in this case the average speed in mph) and has to return one of the following actions:
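The list of actions is not reproduced in this text. As a sketch only, assuming the five actions are encoded as the integers 0 to 4 and that the constant names below match the ones in the code box (please verify both there):

```javascript
// Assumed encoding of the five possible actions.
var noAction = 0;          // keep the current speed and lane
var accelerateAction = 1;  // speed up
var decelerateAction = 2;  // slow down
var goLeftAction = 3;      // change to the lane on the left
var goRightAction = 4;     // change to the lane on the right

var num_actions = 5;       // how many different values learn may return
```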

The most basic learn function that simply tells the agent to hold its speed and lane would look like:
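A minimal sketch of such a function, assuming that the value 0 stands for "keep speed and lane" (the name noAction is an assumption as well):

```javascript
var noAction = 0; // assumed code for "hold speed and lane"

// state: flattened learning input cutout
// lastReward: reward for the previous step (average speed in mph)
var learn = function (state, lastReward) {
    // Ignore both inputs and always keep going as before.
    return noAction;
};
```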

For the competition you are supposed to use a neural network to control the car. The learn function to make this happen is already provided in the initial code sample and can stay the same. You are of course free to do your own data preprocessing before feeding the state to the net, but don’t spend too much time on it: most (if not all) of the improvements should come from adapting the net, and you are able to get a fairly decent speed without doing any preprocessing at all – way beyond the required minimum to pass the course.
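The provided function is not reproduced in this text. The sketch below shows the general shape of a net-driven learn function, using the backward and forward methods of the deepqlearn Brain from ConvNetJS. The brain object here is only a stand-in so that the snippet runs on its own, and the real sample may do additional things such as updating the page.

```javascript
// Stand-in for a deepqlearn.Brain instance (hypothetical, for illustration).
var brain = {
    lastReward: null,
    backward: function (reward) { this.lastReward = reward; },
    forward: function (state) { return 0; } // always "hold speed and lane"
};

var learn = function (state, lastReward) {
    // Let the brain learn from the reward earned by its previous action.
    brain.backward(lastReward);
    // Ask the brain which action to take in the current state.
    var action = brain.forward(state);
    return action;
};
```

Any preprocessing of your own would go between receiving state and calling brain.forward.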

Training and Evaluation

To train the neural network you have to press the Run Training button:

This will start training the neural network by running the simulation in a separate thread at about 30 times real-time speed. The trained net is applied back to the visible simulation from time to time, so you should be able to see improvements right away (only if your net layout is any good, of course).

The site also provides an evaluation button that is going to run exactly the same evaluation we are using for the competition.

The evaluation also happens on a separate thread, simulating 10 runs of about 30 minutes each. For each run it computes the average speed, and the final score is the median of those 10 averages.
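To make the scoring rule concrete, here is a small sketch. The function medianSpeed is a hypothetical helper and the numbers are made up. With an even number of runs it averages the two middle values, which is an assumption about how the median is taken here.

```javascript
// Hypothetical helper: median of the per-run average speeds.
function medianSpeed(runAverages) {
    // Sort a copy numerically so the caller's array stays untouched.
    var sorted = runAverages.slice().sort(function (a, b) { return a - b; });
    var mid = Math.floor(sorted.length / 2);
    // Even count: mean of the two middle values. Odd count: the middle value.
    return sorted.length % 2 === 0
        ? (sorted[mid - 1] + sorted[mid]) / 2
        : sorted[mid];
}

// Made-up average speeds in mph for 10 runs, for illustration only.
var runs = [61.2, 66.8, 64.0, 70.1, 65.5, 63.9, 67.3, 58.4, 66.0, 65.1];
console.log(medianSpeed(runs));
```

Because the median is used, a single unusually good or bad run does not move your score much.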

Keep in mind that your local evaluation only gives you an estimate of the actual score, as there is a random component in how the other cars behave. The relevant score is the one we compute. (We will also look at your code to see if there is any kind of cheating involved, which would get you banned – so don’t even try.)

You can find your best speed on your profile page and, if you are really good, in the top 10 leaderboard.

Designing the Neural Network

To change the default neural network layout we provide (which is intentionally set up to perform badly), you have to change the code in the code box on the website.

The apply code button runs the code to create the newly defined neural network (watch out: you will lose the training state you had before).

And the save and load buttons allow you to save your code and the trained net state to your machine and load it back afterwards. Save regularly!

Looking at the Code

The first part of the code defines the most basic settings: the size of the input and the number of train iterations. For larger inputs you should probably increase the number of train iterations. Actually looking ahead a few patches, and at least one lane to the side, is probably a good idea as well.

The next part specifies some more details about the input. You don’t need to touch it (except maybe the temporal window).
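As a sketch of what these details amount to: the net does not only see the current cutout, it also sees the states and actions of the last few steps. The variable names and example values below are assumptions. The formula for the total input size follows the one used inside deepqlearn.js from ConvNetJS.

```javascript
// Assumed input-size settings (one lane to each side, 10 cells ahead, 2 behind).
var lanesSide = 1;
var patchesAhead = 10;
var patchesBehind = 2;

var num_inputs = (lanesSide * 2 + 1) * (patchesAhead + patchesBehind); // 36 cells
var num_actions = 5;      // assumed number of actions the agent can return
var temporal_window = 3;  // how many previous steps are fed to the net as well

// Every remembered step adds its state and its one-hot encoded action;
// the current state is added on top.
var network_size = num_inputs * temporal_window
                 + num_actions * temporal_window
                 + num_inputs;

console.log(network_size); // 159
```

A larger temporal window lets the net see how the traffic has been moving, at the price of a larger input layer.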

The net is defined with an array of layers starting with the input which you don’t have to change:

We added one basic hidden layer with just one neuron to show you how to do that – you should definitely change that:

And in the end there is the final regression layer that decides on the action, which probably is fine as it is:
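Put together, the three parts of the layer array could look like the following sketch. It uses the ConvNetJS layer definition format. The value of network_size is a placeholder, and the single-neuron hidden layer mirrors the deliberately weak starting point described above.

```javascript
var network_size = 159; // placeholder; normally derived from the input settings
var num_actions = 5;    // assumed number of possible actions

var layer_defs = [];

// Input layer: one flat vector holding the current and remembered inputs.
layer_defs.push({ type: 'input', out_sx: 1, out_sy: 1, out_depth: network_size });

// Hidden layer: a single fully connected neuron. This is the part to change,
// for example by raising num_neurons or pushing more 'fc' layers.
layer_defs.push({ type: 'fc', num_neurons: 1, activation: 'relu' });

// Regression layer: one output value per action.
layer_defs.push({ type: 'regression', num_neurons: num_actions });
```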

There are a lot more options for the Q-Learning part – details on them can be found in the comments of the code at the following link: https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js These are mostly interesting for more advanced optimisations of your net.
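For orientation, here is a sketch of such an options object. The option names are the ones documented in deepqlearn.js. The values are purely illustrative and not necessarily the ones in the starting sample.

```javascript
var layer_defs = []; // placeholder; normally the array of layer definitions

// Options handed to the trainer that fits the Q-values.
var tdtrainer_options = {
    learning_rate: 0.001,
    momentum: 0.0,
    batch_size: 64,
    l2_decay: 0.01
};

var opt = {};
opt.temporal_window = 3;          // past steps included in the net input
opt.experience_size = 3000;       // size of the experience replay memory
opt.start_learn_threshold = 500;  // experiences collected before learning starts
opt.gamma = 0.7;                  // discount factor in [0, 1]: how far the agent plans ahead
opt.learning_steps_total = 10000; // steps over which exploration is annealed
opt.learning_steps_burnin = 1000; // initial steps with purely random actions
opt.epsilon_min = 0.0;            // exploration rate at the end of training
opt.epsilon_test_time = 0.0;      // exploration rate when learning is switched off
opt.layer_defs = layer_defs;
opt.tdtrainer_options = tdtrainer_options;
```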

And the last step is creating the brain.
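A sketch of that step, assuming deepqlearn.js from ConvNetJS has been loaded on the page and exposes the deepqlearn object. The guard only exists so that the snippet also runs where the library is missing, and the values are placeholders.

```javascript
var num_inputs = 36;              // placeholder input size
var num_actions = 5;              // assumed number of actions
var opt = { temporal_window: 3 }; // placeholder; normally the full options object

var brain = null;
if (typeof deepqlearn !== 'undefined') {
    // Brain(num_states, num_actions, opt) builds the net and the Q-learner.
    brain = new deepqlearn.Brain(num_inputs, num_actions, opt);
}
```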

Submission

To submit your neural network for evaluation press the submit button:

Make sure you run training and a local evaluation first and only submit if you are happy with the performance. You can submit multiple times and we will take your best result, but doing it too often is not a good idea: submission adds your net to the back of our evaluation queue, but you can only have one spot there, so if you resubmit before evaluation is done, you get bumped to the back again. To see the results of our evaluation go to your profile page.

If you are officially registered for this class you need to perform better than 65 mph to get credit for this assignment. 

References:

ConvNetJS

Deep Q-Learning using ConvNetJS