The goal of this project was to use computer vision techniques to label road signs. The state of the art in this area is typically machine learning, but for this project I focused on understanding the structural and semantic aspects of an image directly. This includes working with images as 2D and 3D arrays, applying edge detection, identifying lines and basic shapes, and addressing the presence of noise or distortion in an image.

This first section covers some mathematical background that is useful to understand before beginning this project. If you want to skip to the code, head down to Applying the Math.

There is a problem here though. If we calculate the differences stepping in the horizontal direction, vertical edges will be favored over horizontal ones. This is because we are moving across the vertical edges while moving horizontally. Meanwhile, the horizontal edges will be a lot more subtle. The opposite also applies when stepping in the vertical direction. The image below shows this effect in action. On the left is the representation of taking the partial derivative with respect to x, and on the right is the partial derivative with respect to y.
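To make the effect concrete, here is a minimal NumPy sketch (the image and values are made up for illustration): stepping horizontally across a synthetic image with a single vertical edge produces a strong response, while stepping vertically produces none.

```python
import numpy as np

# A tiny synthetic image: dark left half, bright right half (one vertical edge).
img = np.zeros((5, 5), dtype=float)
img[:, 3:] = 1.0

# Finite differences stepping in x (across columns) and in y (across rows).
dx = img[:, 1:] - img[:, :-1]   # responds strongly to the vertical edge
dy = img[1:, :] - img[:-1, :]   # sees nothing: there is no horizontal edge
```

Here `dx` contains a column of 1.0 values right at the edge, while `dy` is zero everywhere, which is exactly the asymmetry described above.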

The solution to our problem is the Sobel operator. The Sobel operator consists of two 3x3 convolution kernels. One kernel responds maximally to vertical edges; the other kernel is rotated 90 degrees and responds maximally to horizontal edges. The result of applying each of these kernels to an image is combined to produce an edge image. The Sobel operator is very computationally efficient, but this comes at the cost of only being an approximation of the true image gradient.
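As a sketch of how this works, here is one way to write the two kernels and combine their responses in NumPy (the naive loop is for clarity, not speed):

```python
import numpy as np

# The two 3x3 Sobel kernels: GX responds to vertical edges, GY to horizontal ones.
GX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
GY = GX.T  # the same kernel rotated 90 degrees

def convolve3x3(img, kernel):
    """Naive 'valid' correlation of a 3x3 kernel over a 2D image."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def sobel_magnitude(img):
    """Combine the two kernel responses into a single edge-strength image."""
    gx = convolve3x3(img, GX)
    gy = convolve3x3(img, GY)
    return np.hypot(gx, gy)  # sqrt(gx^2 + gy^2) at every pixel
```

On the synthetic vertical-edge image from earlier, `gx` carries all the response and `gy` is zero, so the combined magnitude picks up the edge either way it is oriented.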

The next step is to perform something called non-maximum suppression. The gradient magnitudes produced by the Sobel operator form thick bands around each edge; non-maximum suppression thins them by keeping a pixel only if its magnitude is a local maximum along the gradient direction. The surviving pixels are then filtered with hysteresis thresholding: pixels above a high threshold are accepted as edges, pixels below a low threshold are rejected, and pixels in between are kept only if they connect to an accepted edge.
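One of those refinement steps, non-maximum suppression, thins the thick gradient ridges down to single-pixel edges. A simplified NumPy sketch, with the gradient direction quantized into the usual four bins:

```python
import numpy as np

def non_max_suppression(mag, angle):
    """Keep only pixels that are local maxima along the gradient direction.

    mag:   gradient magnitude image
    angle: gradient direction at each pixel, in degrees
    """
    h, w = mag.shape
    out = np.zeros_like(mag)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            a = angle[i, j] % 180
            if a < 22.5 or a >= 157.5:        # gradient points left/right
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                    # one diagonal
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                   # gradient points up/down
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                             # the other diagonal
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            # Suppress the pixel unless it beats both neighbors along the gradient.
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]
    return out
```

Given a blurry three-pixel-wide ridge, only the center column survives, which is the thinning effect the detector relies on.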

The process I have just described is referred to as the Canny edge detector. Later, we will be using OpenCV to apply this to an image.

Lastly, we need to find lines from these edge pixels. Even after running an edge detection algorithm, there is likely to be a lot of noise in an image, with erroneous edge pixels and unneeded lines. To find the lines, a voting algorithm will be implemented where each pixel votes for its possible lines. The polar representation of a line identifies a line by two parameters: ρ, the perpendicular distance from the origin to the line, and θ, the angle that perpendicular makes with the x-axis. Every point (x, y) on the line satisfies ρ = x·cos(θ) + y·sin(θ), so each edge pixel votes for every (ρ, θ) pair whose line passes through it, and the cells that collect the most votes correspond to the detected lines.
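As a sketch, a minimal accumulator for this voting scheme might look like the following, where each edge pixel votes along ρ = x·cos(θ) + y·sin(θ) for every candidate angle:

```python
import numpy as np

def hough_accumulator(edge_pixels, shape, n_theta=180):
    """Accumulate line votes in (rho, theta) space.

    edge_pixels: iterable of (y, x) coordinates of edge pixels
    shape:       (height, width) of the image, used to bound rho
    """
    h, w = shape
    diag = int(np.ceil(np.hypot(h, w)))            # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta))        # angles 0..179 degrees
    acc = np.zeros((2 * diag, n_theta), dtype=int) # rows index rho in [-diag, diag)
    for y, x in edge_pixels:
        # Each pixel votes once per candidate angle.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1
    return acc, diag
```

Five pixels stacked on the vertical line x = 3 all land in the same accumulator cell at θ = 0, ρ = 3, so that cell ends up with the maximum vote count. In practice OpenCV's `cv2.HoughLines` does this same accumulation far more efficiently.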

The first step is to pre-process the image. The input image (left) has had some noise added to more closely resemble the type of image a self-driving car would be working with. The image on the right is after I applied cv2.fastNlMeansDenoisingColored(). It's not a perfect outcome, but it helps with the later steps. I know stop signs are going to be some variation of the color red, so the next step was to filter out all non-red colors from the image. After the non-reds were filtered out, I converted the image to grayscale to prepare it for the Canny edge detector.

There were still a lot of unwanted pixels from the background and many small holes in the stop sign. Despite that, the bulk of the stop sign was intact, and the background was mostly cleared out. To finish the job, I applied a median filter across the entire image to smooth out those irregularities.

At this point, the image was ready for the Canny edge detector. The result can be seen below.

Now it is time to determine where the lines are in the image. Using the line detection method described in the math foundations, I was able to acquire a list of all the lines detected in the image. I discarded horizontal lines, since images often contain horizontal lines unrelated to road signs, such as the horizon or buildings. A stop sign, however, has two pairs of parallel lines at the 45- and 135-degree angles. I calculated the angle of every line I found and searched for those two pairs. With those pairs found, I calculated where they intersected, which gave me a diamond-shaped box located directly over the stop sign. Locating the center of the box yielded the center of the stop sign.
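The intersection step can be sketched by solving each pair of polar-form lines as a small 2x2 linear system; the ρ values below are hypothetical, just to show the geometry of the diamond and its center:

```python
import numpy as np

def intersection(l1, l2):
    """Intersect two lines given in polar form (rho, theta in radians).

    Each line satisfies x*cos(theta) + y*sin(theta) = rho, so two lines
    form a 2x2 linear system in (x, y).
    """
    (r1, t1), (r2, t2) = l1, l2
    A = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    x, y = np.linalg.solve(A, np.array([r1, r2]))
    return x, y

# Hypothetical pairs of 45- and 135-degree lines bounding a stop sign.
lines_45  = [(10.0, np.deg2rad(45)),  (30.0, np.deg2rad(45))]
lines_135 = [(5.0,  np.deg2rad(135)), (25.0, np.deg2rad(135))]

# Every 45-degree line crosses every 135-degree line: four diamond corners.
corners = [intersection(a, b) for a in lines_45 for b in lines_135]
center = tuple(np.mean(corners, axis=0))
```

Averaging the four corner coordinates gives the diamond's center, which is the estimate used for the center of the stop sign.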