
Report - Individual Project Kamila Bobkowska

General information

Whenever our agent (the garbage truck) visits a dumpster, it needs to recognize what kind of garbage is in it before taking it in. Even though there are dumpsters of different kinds, we assume that people make mistakes. Per dumpster we assume that:

  • there are from 1 to 3 pieces of correctly sorted trash
  • there are from 0 to 2 pieces of incorrectly sorted trash (a small illustrative sketch of this follows)
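
Purely as an illustration of the assumption above (fill_dumpster and its arguments are hypothetical names, not the project's actual dumpster.py code), the contents of a dumpster could be drawn roughly like this:

import random

def fill_dumpster(dumpster_type, all_types):
    # 1 to 3 items matching the dumpster's own type (correctly sorted)
    contents = [dumpster_type] * random.randint(1, 3)
    # 0 to 2 items of some other type (incorrectly sorted)
    other_types = [t for t in all_types if t != dumpster_type]
    contents += [random.choice(other_types) for _ in range(random.randint(0, 2))]
    return contents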

Before running the program it is necessary to unzip the "Garbage classifier.rar" file in the same place as the rest of the files. The assessment of correctness is performed by a Convolutional Neural Network. The types of trash changed from the initial idea; the garbage truck will now distinguish between the following five kinds of debris:

  • paper
  • plastic
  • glass
  • cardboard
  • metal

Implementation

As mentioned above, I used a CNN to solve the sorting problem, implemented mostly with Keras and TensorFlow. I kept the architecture fairly basic, with three 2D convolution layers. Since there are more than two output nodes, I used softmax as the output activation function. I decided to decrease the size of the images to (110, 110) and to use (2, 2) pooling, as I have read that it is the most widely used and works well in most cases.

classifier = Sequential()
classifier.add(Conv2D(32, (3, 3), input_shape=(110, 110, 3), activation="relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(64, (3, 3), activation="relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
# this layer was added in ver 4
classifier.add(Conv2D(32, (3, 3), activation="relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
# -----------------
classifier.add(Flatten())
classifier.add(Dense(units=64, activation="relu"))
classifier.add(Dense(units=5, activation="softmax"))
classifier.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

After that I did some preprocessing of the images I was working with: I made sure that the size is still (110, 110), set a batch size, and allowed for flips and similar augmentations.

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.1,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=True,
)
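
The train_generator and test_generator used for training below are not shown in the report; they would typically be created with flow_from_directory roughly as follows (the batch size and class_mode are assumptions; the folder names match the data set section further down):

test_datagen = ImageDataGenerator(rescale=1./255)

# assumed layout: one subfolder per category inside trainset\ and testset\
train_generator = train_datagen.flow_from_directory(
    "Garbage classification\\trainset",
    target_size=(110, 110),   # matches the input_shape of the network
    batch_size=32,            # assumed batch size
    class_mode="categorical",
)
test_generator = test_datagen.flow_from_directory(
    "Garbage classification\\testset",
    target_size=(110, 110),
    batch_size=32,
    class_mode="categorical",
)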

Later I trained my classifier and saved its weights for future use in the project. I attempted this about 5 times with different numbers of epochs and different steps per epoch.

UPDATE: I tried running it once again with more epochs and steps per epoch and got a better result, and I think a better validation loss than before.

UPDATE: I fixed my mistake of choosing an incorrect loss function, which had produced a misleading accuracy.

classifier.fit_generator(train_generator, steps_per_epoch=165, epochs=32, validation_data=test_generator)
classifier.save_weights('model_ver_6.h5')

Training: Example

Changes in the common part

I mostly worked with the classes Garbagetruck.py and dumpster.py. In dumpster.py I added a short function that chooses random images of trash and adds them to a list. In Garbagetruck.py I added a function that uses the saved model and returns a list of what trash is actually in the dumpster according to the CNN. Then I enhanced the function that collects garbage so that it reports when trash was incorrectly sorted. Beyond that, it is also possible to run the program in such a way that at each step we can see which images the CNN is looking at. At the end of examining each dumpster, a general report of what is in the garbage truck is displayed.
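
A minimal sketch of how the saved weights can be reused for a single prediction (classify_trash is a hypothetical helper name, not necessarily the function in Garbagetruck.py; the alphabetical class order is the one flow_from_directory assigns by default):

import numpy as np
from keras.preprocessing import image

# class indices as flow_from_directory assigns them (alphabetical folder order)
CLASSES = ["cardboard", "glass", "metal", "paper", "plastic"]

classifier.load_weights("model_ver_6.h5")

def classify_trash(path):
    # load and preprocess one image the same way as during training
    img = image.load_img(path, target_size=(110, 110))
    arr = image.img_to_array(img) / 255.0
    arr = np.expand_dims(arr, axis=0)
    # pick the class with the highest softmax score
    return CLASSES[int(np.argmax(classifier.predict(arr)))]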

Data set

I got my data set from Kaggle: https://www.kaggle.com/asdasdasasdas/garbage-classification . Before starting, I removed the "trash" category completely and created folders for the test set. Later I used the following function:

# separating the data set into training and testing data; the test-set folders were
# created by hand, and about 75 images were later removed from the paper category
# for a more even distribution
def sepperate(type):
    for i in type:
        folder = "Garbage classification\\Garbage classification\\" + i
        destination = "Garbage classification\\testset\\" + i
        howmany = len(os.listdir(folder))
        # move roughly 20% of the images of this category into the test set
        for j in range(int(howmany * 0.2)):
            move1 = random.choice(os.listdir(folder))
            source = "Garbage classification\\Garbage classification\\" + i + "\\" + move1
            shutil.move(source, destination, copy_function=shutil.copytree)


types = ["cardboard", "glass", "metal", "paper", "plastic"]
sepperate(types)
os.rename("Garbage classification\\Garbage classification", "Garbage classification\\trainset")

This randomly picks images for the test set, with the remainder becoming the training set. Later, after about two trials, I also removed about 75 pictures from the paper category, as I noticed there was some imbalance in the number of images per category.

Additional information

The weights are saved and later reused, as it would be too time-consuming to retrain the model every time. For this project I used the following libraries:

import os
import numpy as np
import random
import shutil
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, MaxPooling2D, Dense
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt # optional