# CNN Plates Classification
Author: Weronika Skowrońska, s444523

As my individual project, I decided to perform classification of plate images using a Convolutional Neural Network (CNN). The goal of the project is to classify a photo of a client's plate as empty, full or dirty, and assign the corresponding value to the given instance of the "Table" class.
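To make the goal concrete, here is a minimal sketch of what such a "Table" class could look like. This is an assumption for illustration only; the actual class in the project may differ, and the mapping of class indices to plate states depends on how the training data was labeled.

```python
# Hypothetical sketch of the "Table" class mentioned above; the real
# project class may differ. The plate state is stored as a class index.
# The 0/1/2 ordering here is an assumption, not taken from the project.
PLATE_STATES = {0: "dirty", 1: "empty", 2: "full"}

class Table:
    def __init__(self, number):
        self.number = number
        self.state = None  # set after the CNN classifies a plate photo

    def set_state(self, class_index):
        self.state = class_index

    def describe(self):
        if self.state is None:
            return f"Table {self.number}: not yet inspected"
        return f"Table {self.number}: plate is {PLATE_STATES[self.state]}"
```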
# Architecture
The architecture of my CNN is very simple. I decided to use two convolutional layers, each with 32 feature detectors of size 3 by 3, followed by the ReLU activation function and max pooling of size 2 by 2.
```python
classifier = Sequential()

classifier.add(Convolution2D(32, (3, 3), input_shape = (256, 256, 3), activation = "relu"))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Convolution2D(32, (3, 3), activation = "relu"))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Flatten())
```
After flattening, I added a fully connected layer of 128 units (again with the ReLU activation function). The output layer consists of 3 neurons with a softmax activation function, as I am using the network for multiclass classification (3 possible outcomes).
```python
classifier.add(Dense(units = 128, activation = "relu"))
classifier.add(Dense(units = 3, activation = "softmax"))
```
The network is compiled with the Adam optimizer, and categorical cross-entropy was my choice of loss function.
```python
classifier.compile(optimizer = "adam", loss = "categorical_crossentropy", metrics = ["accuracy"])
```
# Library
I used Keras to implement the network. It let me add some useful features, such as early stopping and several methods of data augmentation.
```python
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    width_shift_range=0.2,
    height_shift_range=0.1,
    fill_mode='nearest')
```
Data augmentation was especially important to me, as I did not have many photos to train the network with (approximately 1200 altogether).
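For illustration, the two simplest of these augmentations (rescaling and horizontal flipping) can be reproduced with plain NumPy. ImageDataGenerator additionally applies shears, zooms and shifts on the fly, so each epoch sees slightly different variants of the same photos; this toy sketch is not Keras's implementation.

```python
import numpy as np

def rescale(img):
    """Map 0..255 pixel values into [0, 1], like rescale=1./255."""
    return img.astype('float32') / 255.0

def horizontal_flip(img):
    """Mirror the image left-right, like horizontal_flip=True."""
    return img[:, ::-1, :]

# A tiny 2x3 "RGB image" stands in for a 256x256 plate photo.
img = np.arange(18, dtype='uint8').reshape(2, 3, 3)
flipped = horizontal_flip(img)
print(flipped[0, 0, 0])  # 6 — the first row's last pixel moved to the front
```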
# Project implementation
After training the network, I saved the model that gave me the best results to a file named "best_model.h5" (two Keras callbacks, EarlyStopping and ModelCheckpoint, were very useful here).
```python
# callbacks:
es = EarlyStopping(monitor='val_loss', mode='min', baseline=1, patience = 10)
mc = ModelCheckpoint('best_model.h5', monitor='val_loss', mode='min', save_best_only=True, verbose = 1, period = 10)
```
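The patience mechanism behind EarlyStopping can be illustrated in a few lines of plain Python. This is a simplified model of the callback's logic, not Keras's actual implementation: training stops once the monitored validation loss has failed to improve for `patience` consecutive epochs.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training would stop,
    or None if it runs through all recorded epochs.
    Toy model of EarlyStopping(monitor='val_loss', mode='min')."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# Loss improves for 3 epochs, then plateaus: training stops 10 epochs later.
losses = [0.9, 0.7, 0.5] + [0.6] * 20
print(early_stopping_epoch(losses))  # 13
```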
Then, I imported the model into our project (The Waiter) using the `load_model` utility of `keras.models`:
```python
from keras.models import load_model
...
saved_model = load_model('best_model.h5')
```
After coming to each table, the Agent (the waiter) evaluates a randomly selected photo of a plate using the saved model, and assigns the predicted class number to the "state" attribute of the given table. This information allows further actions to be taken, based on the predicted outcome.
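This last step boils down to an argmax over the model's three softmax outputs. A sketch of the idea follows; the prediction vector is hard-coded here because the trained model and plate photos are not part of this README, and both the dict-based table and the class ordering are assumptions for illustration.

```python
import numpy as np

def assign_state(table, prediction):
    """Store the index of the most probable class (0, 1 or 2) as the
    table's state. In the project, `prediction` would be one row of
    saved_model.predict(...); here it is faked."""
    table['state'] = int(np.argmax(prediction))
    return table['state']

# What predict() might return for one photo, e.g. [P(dirty), P(empty), P(full)]
# (the actual class order depends on how the training data was labeled):
fake_prediction = np.array([0.05, 0.15, 0.80])
table = {'number': 7, 'state': None}
print(assign_state(table, fake_prediction))  # 2
```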