DeepTraffic – About

DeepTraffic is a deep reinforcement learning competition that is part of the MIT Deep Learning for Self-Driving Cars course. The goal is to create a neural network that drives a vehicle (or multiple vehicles) as fast as possible through dense highway traffic. An instance of your neural network controls one of the cars (displayed in red) and has to learn how to navigate efficiently to go as fast as possible. The car already comes with a safety system, so you don’t have to worry about the basic task of driving: the net only has to tell the car whether it should accelerate, slow down, or change lanes, and the car will do so if that is possible without crashing into other cars.

Overview

 

The game page consists of four different areas:

  • on the left, you can find a real-time simulation of the road with different display options.
  • on the upper half of the page, you can find (1) a coding area where you can change the design of the neural network which controls the agents and (2) some buttons for applying your changes, saving/loading, and making a submission.
  • below the coding area, you can find (1) a graph showing a moving average of the center red car’s reward, (2) a visualization of the neural network activations, and (3) buttons for training and testing your network.
  • between the simulated roadway and the graphs, you can find the current image of your vehicle, some options to customize it, and a button to create a visualization of your best submission.

The simulation area shows some basic information like the current speed of the car and the number of cars that have been passed since you opened the site. It also allows you to change the way the simulation is displayed.

The simulation uses frames as an internal measure of time, so neither a slow computer nor a slow net influences the result. The Simulation Speed setting controls how the simulation is displayed to you. With the Normal setting, the simulation tries to draw frames in real time, so it waits if the actual calculation runs faster. The Fast setting displays frames as soon as they are finished, which may be much faster.

Internally the whole game runs on a grid system. You can see it if you change the Road Overlay to Full Map:

For each car, the grid cells below it are filled with the car’s speed; empty cells are filled with a high value to symbolize the potential for speed.
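As a rough illustration of this grid, the sketch below fills a small lane-by-patch array. The function name `buildGrid`, the car fields, and the value used for empty cells are all assumptions made for this example; the text does not state the real values.

```javascript
// Hypothetical sketch of the speed grid; not the simulator's actual code.
// EMPTY_CELL_VALUE is an assumed "high value" for free road.
const EMPTY_CELL_VALUE = 100;

function buildGrid(numLanes, numPatches, cars) {
  // Start with every cell marked as free road.
  const grid = [];
  for (let lane = 0; lane < numLanes; lane++) {
    grid.push(new Array(numPatches).fill(EMPTY_CELL_VALUE));
  }
  // Overwrite the cells covered by each car with that car's speed.
  for (const car of cars) {
    for (let i = 0; i < car.length; i++) {
      const patch = car.patch + i;
      if (patch >= 0 && patch < numPatches) {
        grid[car.lane][patch] = car.speed;
      }
    }
  }
  return grid;
}

// Two cars on a 3-lane road that is 8 patches long.
const grid = buildGrid(3, 8, [
  { lane: 0, patch: 2, length: 4, speed: 55 },
  { lane: 2, patch: 5, length: 4, speed: 62 },
]);
```

Cells under a car hold that car's speed, and every other cell keeps the high placeholder value.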

Your car gets a car-centric cutout of that map to use as an input to the neural network. You can have a look at it by changing the Road Overlay to Learning Input:

The following variables control the size of the input the net gets. A larger input area provides more information about the traffic situation, but it also makes it harder to learn the relevant parts and may require longer learning times. (You should definitely change the input size from the one in the starting sample, though, since that one makes the car essentially blind.)
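To make the effect of the input size concrete, here is a minimal sketch of how a car-centric cutout could be flattened into the state array. The variable names `lanesSide`, `patchesAhead`, and `patchesBehind`, their values, the helper `learningInput`, and the choice that "ahead" means lower patch indices are assumptions for illustration.

```javascript
// Assumed variable names and illustrative values.
const lanesSide = 1;      // lanes visible on each side of the car
const patchesAhead = 10;  // grid cells visible in front of the car
const patchesBehind = 2;  // grid cells visible behind the car

// Width of the cutout in lanes times its length in patches.
const numInputs = (lanesSide * 2 + 1) * (patchesAhead + patchesBehind);

// Hypothetical helper: flatten the cutout around (carLane, carPatch).
// Cells outside the road are filled with offRoadValue (an assumption).
function learningInput(grid, carLane, carPatch, offRoadValue) {
  const state = [];
  for (let lane = carLane - lanesSide; lane <= carLane + lanesSide; lane++) {
    for (let patch = carPatch - patchesAhead; patch < carPatch + patchesBehind; patch++) {
      const row = grid[lane];
      const inside = row !== undefined && patch >= 0 && patch < row.length;
      state.push(inside ? row[patch] : offRoadValue);
    }
  }
  return state;
}
```

With these values the net receives 36 numbers per step. Doubling `lanesSide` or `patchesAhead` grows the input accordingly, which is why larger cutouts tend to need more training.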

The basic algorithm that powers all the other cars, and that also forms the basis of yours, is called the Safety System. You can have a look at it by switching the Road Overlay:

The highlighted cells tell you what it is looking at. If they are red, the Safety System currently blocks going in that direction. The front-facing part of the Safety System makes the car slow down to avoid hitting obstacles. Lane switching is disabled while there is any other car in the highlighted area, and while you are already in the process of switching lanes. The checked area increases depending on how fast you are trying to go, so just flooring it is not always a good idea.
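The lane-change rule can be pictured with a toy model. This is not the actual Safety System: the function names and the rule of one extra checked cell per 20 mph are invented for this sketch, since the text only says that the checked area grows with the target speed.

```javascript
// Toy model only; the real distances and rules are not given in the text.
// Assumption: the checked area grows by one cell per 20 mph of target speed.
function cellsToCheck(targetSpeed) {
  return 1 + Math.floor(targetSpeed / 20);
}

// laneCells: cells of the target lane, nearest first;
// true means the cell is occupied by another car.
function laneChangeAllowed(laneCells, targetSpeed, alreadySwitching) {
  if (alreadySwitching) return false; // no new switch during a switch
  const checked = laneCells.slice(0, cellsToCheck(targetSpeed));
  return !checked.some((occupied) => occupied);
}
```

In this model a car four cells away does not block a lane change at 40 mph but does at 60 mph, which mirrors why a higher target speed can leave the car with fewer options.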

The agent is controlled by a function called learn that receives the current state (provided as a flattened array of the defined learning input cutout), a reward for the last step (in this case the average speed in mph) and has to return one of the following actions:

The most basic learn function that simply tells the agent to hold its speed and lane would look like:
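A minimal sketch of such a function is shown here. The action names and their integer encoding are assumptions for this example; the text only says that the net can tell the car to accelerate, slow down, or change lanes.

```javascript
// Assumed action encoding; names and numbers are illustrative.
var noAction = 0;
var accelerateAction = 1;
var decelerateAction = 2;
var goLeftAction = 3;
var goRightAction = 4;

// Ignores the state and the reward and always keeps speed and lane.
var learn = function (state, lastReward) {
  return noAction;
};
```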

For the competition you are supposed to use a neural network to control the car. The learn function that makes this happen is already provided in the initial code sample and can stay the same. You are of course free to do your own data preprocessing before feeding the state to the net, but don’t spend too much time on it: most (if not all) of the improvements should come from adapting the net, and you can get a fairly decent speed without doing any preprocessing at all, way beyond the required minimum to pass the course.
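A learn function driven by a network might look roughly like the sketch below. It assumes `brain` is a `deepqlearn.Brain` from ConvNetJS, which provides `backward(reward)` and `forward(state)`. The wrapper `makeLearn` is a hypothetical helper used only to keep the example self-contained; the provided sample may be structured differently.

```javascript
// Sketch: `brain` is assumed to expose backward(reward) and forward(state),
// as deepqlearn.Brain in ConvNetJS does.
function makeLearn(brain) {
  return function learn(state, lastReward) {
    // Let the net learn from the reward earned by its previous action.
    brain.backward(lastReward);
    // Ask the net for the next action given the current state.
    return brain.forward(state);
  };
}
```

Any preprocessing of `state` would go before the `brain.forward` call.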

Training and Evaluation

To train the neural network you have to press the Run Training button:

This starts training the neural network by running the simulation in a separate thread at about 30 times real-time speed. From time to time the trained net is applied back to the visible simulation, so you should be able to see immediate improvements (only if your net layout is any good, of course).

The site also provides an evaluation button that is going to run exactly the same evaluation we are using for the competition.

The evaluation run also happens on a separate thread and simulates 500 runs of about 30 seconds each. It computes the average speed of each run, and the final score is the median of those 500 average speeds.
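The scoring rule (average within a run, median across runs) can be sketched as follows. The function names and the idea of passing per-run speed samples as arrays are assumptions; how the official evaluation samples speeds is not described here.

```javascript
// Average of one run's speed samples (mph).
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Median of a list of numbers; for an even count, the mean of the two middle values.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// runs: one array of speed samples per run.
function evaluationScore(runs) {
  return median(runs.map(mean));
}
```

Because the median is used, a few unusually bad or unusually lucky runs have little effect on the final score.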

Keep in mind that your local evaluation only gives you an estimate of the actual score, as there is some random component involved in how the other cars behave. The relevant score is the one we compute. (We will also look at your code to see if there is any kind of cheating involved, which would get you banned, so don’t even try.)

You can find your best speed on your profile page and, if you are really good, on the top 10 leaderboard.

Controlling Multiple Vehicles

You can control up to 10 vehicles by changing the following line of code:

Each agent runs its own instance of your algorithm, so the agents cannot plan collectively. To drive fast, your network will have to learn to avoid causing traffic jams. With multiple agents, your score is the average velocity of the agents.
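The multi-agent score described above is a plain average, sketched here. The names `MAX_AGENTS` and `multiAgentScore` are hypothetical; only the limit of 10 vehicles and the averaging rule come from the text.

```javascript
// From the text: up to 10 vehicles can be controlled.
const MAX_AGENTS = 10;

// agentSpeeds: one speed (mph) per controlled vehicle.
function multiAgentScore(agentSpeeds) {
  if (agentSpeeds.length < 1 || agentSpeeds.length > MAX_AGENTS) {
    throw new RangeError("expected between 1 and " + MAX_AGENTS + " agents");
  }
  const total = agentSpeeds.reduce((sum, v) => sum + v, 0);
  return total / agentSpeeds.length;
}
```

One slow agent lowers the score for all of them, so a policy that blocks its own copies is penalized.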

Designing the Neural Network

To change the default neural network layout we provide (which is intentionally set up to perform badly), you have to change the code in the code box on the website.

The apply code button runs the code to create the newly defined neural network (watch out: you will lose the training state you had before).

The save and load buttons allow you to save your code and the trained net state to your machine and load them back afterwards. Save regularly!

Looking at the Code

The first part of the code defines the most basic settings. For larger inputs you should probably increase the number of train iterations. Actually looking ahead a few patches, and at least one lane to the side, is probably a good idea as well.

The next part specifies some more details about the input. You don’t need to touch that part (except maybe the temporal window).
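To show why the temporal window matters for the size of the net, here is a small sketch. The variable names and values are assumptions. The formula follows how `deepqlearn.js` in ConvNetJS builds its network input from the current state plus the previous states and actions.

```javascript
// Illustrative values; the variable names are assumptions.
var num_inputs = 36;      // size of the flattened learning input
var num_actions = 5;      // assumed: hold, accelerate, decelerate, left, right
var temporal_window = 3;  // number of previous time steps the net also sees

// The net sees the current state plus, for each remembered step,
// one past state and one one-hot encoded past action.
var network_size = num_inputs * temporal_window
                 + num_actions * temporal_window
                 + num_inputs;
```

With these numbers the input layer needs 159 values, so a larger temporal window multiplies the effect of a larger cutout.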

The net is defined with an array of layers starting with the input which you don’t have to change:

We added one basic hidden layer with just one neuron to show you how to do that – you should definitely change that:

And in the end there is the final regression layer that decides on the action, which probably is fine as it is:
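The three layers described above can be sketched together in ConvNetJS's layer-definition format. The sizes and the `relu` activation are illustrative assumptions. The layer types `input`, `fc`, and `regression` and their fields are the ones ConvNetJS uses.

```javascript
// Illustrative sizes (assumptions for this sketch).
var network_size = 159;  // total number of values fed to the net
var num_actions = 5;     // one output per possible action

var layer_defs = [];

// Input layer: a 1x1 volume with one depth slice per input value.
layer_defs.push({ type: 'input', out_sx: 1, out_sy: 1, out_depth: network_size });

// A deliberately tiny hidden layer with one neuron; this is the part to change.
layer_defs.push({ type: 'fc', num_neurons: 1, activation: 'relu' });

// Regression layer: one predicted value per action.
layer_defs.push({ type: 'regression', num_neurons: num_actions });
```

Adding capacity usually means raising `num_neurons` or pushing further `fc` layers between the input and regression layers.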

There are many more options for the Q-learning part. Details on them can be found in the comments of the code at the following link: https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js. These are mostly interesting for more advanced optimisations of your net.

And the last step is creating the brain.
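A sketch of how the options and the brain could be put together is shown here. The option names are those read by `deepqlearn.js`, but every value is an illustrative assumption rather than a recommended setting. The `typeof` guard is only there so that the snippet also runs where ConvNetJS is not loaded.

```javascript
// Illustrative sizes (assumptions for this sketch).
var num_inputs = 36;
var num_actions = 5;
var temporal_window = 3;
var network_size = num_inputs * temporal_window
                 + num_actions * temporal_window
                 + num_inputs;

var layer_defs = [
  { type: 'input', out_sx: 1, out_sy: 1, out_depth: network_size },
  { type: 'fc', num_neurons: 1, activation: 'relu' },
  { type: 'regression', num_neurons: num_actions }
];

// Options understood by deepqlearn.Brain; the values are placeholders.
var opt = {};
opt.temporal_window = temporal_window;
opt.experience_size = 3000;        // size of the replay memory
opt.start_learn_threshold = 500;   // experiences to collect before learning
opt.gamma = 0.7;                   // discount factor for future rewards
opt.learning_steps_total = 10000;  // steps over which exploration decays
opt.learning_steps_burnin = 1000;  // initial steps of purely random actions
opt.epsilon_min = 0.0;             // lowest exploration rate during training
opt.epsilon_test_time = 0.0;       // exploration rate when not learning
opt.layer_defs = layer_defs;
opt.tdtrainer_options = {
  learning_rate: 0.001, momentum: 0.0, batch_size: 64, l2_decay: 0.01
};

// Create the brain only if ConvNetJS's deepqlearn is available.
var brain = (typeof deepqlearn !== 'undefined')
  ? new deepqlearn.Brain(num_inputs, num_actions, opt)
  : null;
```

Note that `deepqlearn.Brain` checks that the input layer's depth matches `network_size` and that the regression layer has `num_actions` outputs, so those two values have to stay consistent with the layer definitions.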

Submission

To submit your neural network for evaluation press the submit button:

Make sure you run training and a local evaluation first and only submit if you are happy with the performance. You can submit multiple times and we will take your best result, but doing it too often is not a good idea: submission adds your net to the back of our evaluation queue, but you can only have one spot there, so if you resubmit before evaluation is done, you get bumped to the back again. To see the results of our evaluation go to your profile page.
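The queue behaviour described above can be pictured with a toy model. The function `submit` and the use of plain user names are hypothetical; the actual queue implementation is not shown in the text.

```javascript
// Toy model: each user holds at most one spot in the queue.
// Resubmitting removes the old spot and appends a new one at the back.
function submit(queue, user) {
  const withoutUser = queue.filter((entry) => entry !== user);
  withoutUser.push(user);
  return withoutUser;
}

// "alice" is first in line, then resubmits and moves to the back.
const before = ["alice", "bob", "carol"];
const after = submit(before, "alice");
```

In this example `after` is `["bob", "carol", "alice"]`, which is why resubmitting before your evaluation has finished only delays your result.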

If you are officially registered for this class you need to perform better than 65 mph to get credit for this assignment. 

Customization & Visualization

Between the simulated highway and the graphs is an image of your vehicle, options to customize the look of DeepTraffic, and a button to request a visualization of your best submission.

To upload a custom vehicle image (png files only), click the LOAD CUSTOM IMAGE button. After selecting a png file, you need to crop it to a 268×586 image. You can also customize the color scheme, e.g. the trail of the centered agent, with the drop-down color selector. After you have made a submission, you can request a visualization of your best performance. This visualization is an .mp4 file. When it is ready, you will find a download link on your profile page.

References:

ConvNetJS

Deep Q-Learning using ConvNetJS