License Plate Number Recognition Using Artificial Neural Network
Final project by
Tor Ivry and Shahar Michal
Motivation
Many computer science problems can be solved with techniques from computational biology. Computational biology has many aspects, including evolutionary algorithms, genetic programming, cellular computing and artificial neural networks (ANNs). During the course we learned a little about neurons in the brain and what can be computed with weighted sums. This led us to the idea of combining ANNs and computer vision.
We present our program, LPSee, which is capable of recognizing
Israeli license plate numbers.
Introduction
General
LPSee addresses the problem of automatic pattern recognition. Pattern recognition is concerned with the identification of visual or audio patterns by computers. A pattern-recognizing program must convert the patterns into digital signals and compare them to patterns already stored in memory. Uses of pattern recognition include character recognition (e.g. handwriting), voice recognition and face recognition. It is a hard problem for a computer, but humans are very good at it, so it is only natural to use a system similar to our brain to solve it.
In our approach, the patterns are not stored in the computer's
memory. The ANN learns how to classify the patterns, and this
knowledge is saved in a file.
Artificial Neural Networks
McCulloch and Pitts are generally recognized as the designers of the first neural network, in 1943. An ANN is a simplified weighted-graph model of the brain: the nodes represent neurons arranged in layers, and the edges represent synapses, with the strength of a synapse given by the weight of its edge. Each neuron has many inputs from the previous layer and one output connected to the following layer. When the ANN is executed, each perceptron computes the sum of its weighted inputs and passes the result to a hard-limit threshold function (more complex perceptrons, including those in our approach, use other activation functions). The weights on the "synapses" change according to the learning procedure, in which the outputs (the neurons of the output layer) are mapped to learned targets. The specific architecture is discussed in the next section.
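As a minimal sketch, the computation performed by a single sigmoid perceptron (the variant used in our approach, rather than a hard-limit unit) can be written as follows; the function names here are illustrative, not taken from LPSee:

```python
import math

def sigmoid(x):
    # Logistic activation: maps the weighted sum to the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def perceptron(inputs, weights, bias):
    # Weighted sum of the inputs from the previous layer,
    # passed through the sigmoid instead of a hard-limit threshold.
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(s)
```

A hard-limit unit would instead return 1 when the weighted sum exceeds a threshold and 0 otherwise; the sigmoid gives a smooth value, which is what makes gradient-based learning possible.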
       
Approach and Method
Algorithm
The main stages of the algorithm are:
1. Extraction: cropping the license plate from the captured image.
2. Image manipulation: several filters convert the image into the input required by the network.
3. Using the network: two options - learn the example, or return the complete license plate number.
       
License Plate Extraction
In real-life applications, and specifically in license plate recognition systems (like the one used on Road 6), the camera is mounted in a fixed position, so the license plate appears in roughly the same position in different images (up to variations in the car's height and in the shape and location of the plate). Since an accurate and reliable license plate extraction procedure depends on the image-capturing method, we decided not to develop such an algorithm. Instead, we use manual cropping to extract the license plate; images that contain a license plate in the same position as the previously displayed image do not require further cropping.
License Plate Image Manipulation
The extracted license plate is passed through four filters to
convert the input image to a set of binary matrices that is given to
the ANN as input:
1. Thresholding: a user-defined threshold converts the image to black and white (numbers in black).
2. Segmentation: the image is divided into atomic images, each containing one symbol from the license plate.
3. Resizing: each atomic image is resized to a fixed, pre-defined size; the number of pixels in the resized image matches the number of inputs to the network.
4. Final conversion: each resized atomic image is converted to a binary matrix of ones and zeros.
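The filter pipeline above can be sketched as follows. This is a minimal NumPy illustration: the function names, the column-projection segmentation, and the nearest-neighbour resize are our own simplifications for exposition, not the exact filters used by LPSee:

```python
import numpy as np

def threshold(image, t):
    # Step 1: user-defined threshold -> black and white.
    # Dark pixels (the digits) become 1, the background becomes 0.
    return (image < t).astype(np.uint8)

def segment_columns(binary):
    # Step 2 (sketch): split the plate on empty columns between symbols.
    ink = binary.sum(axis=0) > 0
    segments, start = [], None
    for i, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = i
        elif not has_ink and start is not None:
            segments.append(binary[:, start:i])
            start = None
    if start is not None:
        segments.append(binary[:, start:])
    return segments

def resize_nearest(binary, out_h=20, out_w=20):
    # Step 3: nearest-neighbour resize of one atomic (single-symbol)
    # image to the fixed 20x20 size expected by the input layer.
    # Step 4 is implicit: the result is already a 0/1 matrix.
    h, w = binary.shape
    rows = (np.arange(out_h) * h) // out_h
    cols = (np.arange(out_w) * w) // out_w
    return binary[rows][:, cols]
```

Each 20x20 matrix returned by `resize_nearest` is then flattened into the 400 inputs of the network.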
In the learning stage, the returned set of binary matrices represents one example in the training set. In the execution stage, this set is given as input, and the result is the complete license number.
ANN Architecture
The input layer consists of 400 neurons, representing a 20×20-pixel image. We used one hidden layer of 20 neurons with a sigmoid activation function. As opposed to a linear threshold function, the neurons always fire, i.e., the result of the sigmoid on the weighted sum is passed on to the output neurons. The output layer consists of 11 neurons: one for each digit and one for the "-" symbol. The output neurons also use the sigmoid function. Because only one output is wanted, the output layer's values are converted to an output vector O that contains all zeros except for the k'th element, where k is the index of the neuron with the highest value.
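This winner-take-all conversion can be sketched as follows; the `SYMBOLS` string is an assumed ordering of the 11 output neurons (digits first, then "-"), not something specified by LPSee:

```python
SYMBOLS = "0123456789-"  # assumed: one output neuron per symbol

def to_output_vector(activations):
    # Winner-take-all: zeros everywhere except the k'th element,
    # where neuron k has the highest activation.
    k = max(range(len(activations)), key=lambda i: activations[i])
    vec = [1 if i == k else 0 for i in range(len(activations))]
    return vec, SYMBOLS[k]
```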
We used the feed-forward architecture, where all of the neurons are
connected to each of the neurons in the next layer, and there are no cycles. This final layout
gave us the best results, in terms of learning time, storage space
and ability to generalize.
We experimented with different architectures (number of hidden
layers and layer sizes) before reaching this final layout.
Training the ANN - Learning procedure
The input to the learning procedure is a set of license plate images that serve as the training set, accompanied by the correct classification given by the user. Each example is transformed into a set of binary matrices. The classification information is converted into a set of target vectors (the size of a target vector equals the size of the output layer). Each target vector represents one symbol in the license plate and consists of a single one and zeros elsewhere. For example, for the digit 5, the target vector is [0 0 0 0 0 1 0 0 0 0 0].
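Building a target vector from the user's classification can be sketched as follows, again assuming the digits 0-9 followed by "-" as the neuron ordering:

```python
SYMBOLS = "0123456789-"  # assumed neuron ordering: digits, then "-"

def target_vector(symbol):
    # A single 1 at the symbol's index, zeros elsewhere;
    # e.g. "5" -> [0 0 0 0 0 1 0 0 0 0 0].
    vec = [0] * len(SYMBOLS)
    vec[SYMBOLS.index(symbol)] = 1
    return vec
```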
The purpose of the learning procedure is to learn to generalize the problem and to recognize features that are common across the examples.
The training is done according to the back-propagation method, a training procedure that allows multi-layer feed-forward neural networks to be trained and can, in theory, perform any input-output mapping. A detailed algorithm is given in the full report.
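A minimal illustration of back-propagation for a one-hidden-layer feed-forward network (batch gradient descent on the squared error, with sigmoid activations) is given below. This is a generic sketch under those assumptions, not LPSee's exact implementation; all names and hyperparameters here are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(X, T, hidden=20, lr=0.5, epochs=2000, seed=0):
    # X: examples as rows; T: one-hot target vectors as rows.
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))   # input -> hidden
    W2 = rng.normal(scale=0.5, size=(hidden, T.shape[1]))   # hidden -> output
    for _ in range(epochs):
        H = sigmoid(X @ W1)             # forward pass: hidden activations
        O = sigmoid(H @ W2)             # forward pass: output activations
        dO = (O - T) * O * (1 - O)      # output-layer delta (squared error)
        dH = (dO @ W2.T) * H * (1 - H)  # delta propagated back to hidden layer
        W2 -= lr * H.T @ dO             # gradient-descent weight updates
        W1 -= lr * X.T @ dH
    return W1, W2
```

In LPSee the rows of X would be the flattened 400-pixel binary matrices and the rows of T the 11-element target vectors.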
Each new example is added to the previously learned examples, so that the ANN improves its knowledge without specializing in the recently added examples. This data is not needed when the ANN is executed. At the end of the training procedure, the user can save the weight data, which represents the knowledge of the ANN. This data has a fixed size and does not grow with added examples or added knowledge.
Testing the Network
Before using the network for license plate recognition, the ANN must be trained, as described in the previous section; alternatively, an existing knowledge file from previous training can be loaded. As in training, the input to the ANN is a license plate image transformed into a set of binary matrices. This is in fact a step of the back-propagation algorithm: "Input the instance (X1,…,Xn) to the network and compute the network outputs Ok". Here, the output vector O is converted to a digit or a symbol (specified by the 1 in the vector). After each matrix in the set has been given to the ANN, the returned digits (or symbols) are joined to form the complete license number.
Results
We experimented with different network architectures (as described in the full report). With the final ANN layout described in the architecture section, the trained network was able to recognize new license plates that were not part of the training set. The recognition was not completely accurate, however, because the extraction procedure relies on cropping, which introduces shifts and size variations in the digits. To improve the network's ability to handle these size variations and shifts, the same training set was presented with different cropping coordinates. After this, the ANN was able to deal with all the license plate images we tested.
Conclusions
We have shown that the problem of license plate number recognition is easily solved with a simple ANN. The results were better with one hidden layer than in our initial experiments with a network that contained two hidden layers.
Our network can easily be adapted to other license plates that include letters. This will require adding output neurons and changing the network configuration (size of the input, number of hidden layers and number of neurons in each hidden layer). In addition, an improved license plate extraction mechanism would add value to the application.
We found that choosing the right configuration of the network
requires extensive experiments. The network should be able to learn
the training set in a short time. In addition, the network should be
able to classify patterns never seen before, as well as the learned
examples (i.e., generalize the problem). There is a trade-off between the size of the network and the learning time. Large networks with several hidden layers and many inputs can cope with harder problems, but the time required to learn sufficiently large data sets is long. Smaller networks learn a given training set much faster, but can only handle simpler problems, and there is a risk that they will not be able to generalize.
We propose using genetic programming to evolve the network architecture (i.e., the number of layers, their sizes, and the connections between the neurons in the network). Fitness would be assigned according to how well the network learns a given training set.
Additional Information
References
An Introduction to Back-Propagation Neural Networks
The BackPropagation Network: Learning by Example
An Introduction to Neural Networks, Leslie S. Smith
Neural Networks for Pattern Recognition, C. M. Bishop, 1996
"A logical calculus of the ideas immanent in nervous activity", W. S. McCulloch and W. H. Pitts, 1943