minor adjustments

Commit 367be396a7 by Kamila Bobkowska, 2020-05-24 14:10:14 +00:00 (parent 98d3b3ff2f)

# Report - Individual Project Klaudia Przybylska
## General information
In our project, our agent, a garbage truck, collects trash from dumpsters on the grid and then brings it to the garbage dump. However, to make sure that nothing was sorted incorrectly or got mixed up on the way because the road was bumpy, the waste is checked again before the truck is emptied and is sorted accordingly.
The program uses a Random Forest Classifier to recognize five types of rubbish:
* cardboard
* glass
* metal
* paper
* plastic

Before running the program it is obligatory to unpack "Garbage classifier.rar" and "ClassificationGarbage.rar".
## Extracting information from images
In order to use the Random Forest Classifier to classify pictures, I used three global feature descriptors:
* Hu Moments - responsible for capturing information about shapes, because they encode the intensity and position of pixels. They are invariant to image transformations (unlike raw moments or central moments).
```
def hu_moments(image):
    # Hu moments are computed from the moments of the grayscale image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    moments = cv2.moments(gray)
    huMoments = cv2.HuMoments(moments).flatten()
    return huMoments
```
* Color histogram - a representation of the distribution of colors in an image.
```
def histogram(image, mask=None):
    # the histogram is computed in HSV space over all three channels
    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([image], [0, 1, 2], mask, [8, 8, 8], [0, 256, 0, 256, 0, 256])
    cv2.normalize(hist, hist)
    histogram = hist.flatten()
    return histogram
```
* Haralick Texture - used to quantify an image based on texture (the consistency of patterns and colors in an image).
```
def haralick(image):
    # average the Haralick statistics over the four GLCM directions
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    haralick = mahotas.features.haralick(gray).mean(axis=0)
    return haralick
```
* All three features are then stacked into one vector per image and used to train the classifier; the same stacking is applied when testing it.
```
allFeatures = np.hstack([histo, hara, huMoments])
```
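With the sizes used above - an 8x8x8 color histogram (512 values), 13 Haralick statistics and 7 Hu moments - the stacked vector has 532 entries per image. A quick sketch with dummy arrays standing in for the real descriptor outputs:

```python
import numpy as np

# Dummy stand-ins with the shapes the three descriptors produce:
# 8*8*8 histogram bins, 13 Haralick statistics, 7 Hu moments.
histo = np.zeros(8 * 8 * 8)
hara = np.zeros(13)
huMoments = np.zeros(7)

# One row of the feature matrix: all descriptors side by side.
allFeatures = np.hstack([histo, hara, huMoments])
print(allFeatures.shape)  # (532,)
```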
## Creating test and training sets
Data is divided between two sets: the training set contains 80% of all data and the test set the remaining 20%. Images are randomly shuffled before the split.
```
allFileNames = os.listdir(sourceDir)
np.random.shuffle(allFileNames)
trainingFileNames, testFileNames = np.split(np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
```
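For example, with ten (made-up) file names and testRatio = 0.2, np.split cuts the shuffled array at index 8, leaving 8 training and 2 test images:

```python
import numpy as np

np.random.seed(0)  # fixed seed so the example is reproducible
allFileNames = ['img%02d.jpg' % i for i in range(10)]  # made-up names
testRatio = 0.2

np.random.shuffle(allFileNames)
trainingFileNames, testFileNames = np.split(
    np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
print(len(trainingFileNames), len(testFileNames))  # 8 2
```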
## Implementation
Functions in garbageDumpSorting.py:
* createSets - divides images between the test and training sets. This function should be run only once, unless the folders with the training and test sets are removed,
```
trainingFileNames, testFileNames = np.split(np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
```
* huMoments, haralick, histogram - calculate the global feature descriptors,
* processTrainData, processTestData - both work in the same way: they iterate over the files in the train or test directory, save the features as a matrix and then save the results to an h5 file. It is recommended to run them only once, as they take some time to finish.
```
allFeatures = np.hstack([histo, hara, huMoments])
```
* trainAndTest - creates the classifier, trains it and scores it,
```
clf = RandomForestClassifier(n_estimators=100, max_depth=15, random_state=9)
```
* classifyImage - predicts what kind of garbage is visible in a single image,
```
prediction = clf.predict(features)[0]
```
* sortDump - checks what kinds of trash are inside the garbage truck and in what quantities, empties the garbage truck and sorts its contents on the garbage dump.
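Taken together, trainAndTest and classifyImage boil down to fitting the forest on the stacked feature rows and predicting single rows. A self-contained sketch on random stand-in features (the real program reads its features from the h5 files):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(9)
classes = ['cardboard', 'glass', 'metal', 'paper', 'plastic']

# Stand-in training data: 100 "images", 532 stacked descriptor values each.
trainFeatures = rng.rand(100, 532)
trainLabels = rng.choice(classes, size=100)

# Same hyperparameters as trainAndTest uses.
clf = RandomForestClassifier(n_estimators=100, max_depth=15, random_state=9)
clf.fit(trainFeatures, trainLabels)

# classifyImage: predict the class of one stacked feature vector.
features = rng.rand(1, 532)
prediction = clf.predict(features)[0]
print(prediction)  # one of the five class names
```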
## Changes in common part
I created the class garbageDump, in which I store information about the quantity of each kind of trash present on the garbage dump. I had to add a small function to the Garbagetruck class in order to remove waste from the garbage truck. In main I initialize the garbage dump and at the end I display its contents.
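A minimal sketch of what such garbage-dump bookkeeping can look like - the class, method names and dictionary layout here are my own illustration, not the project's actual API:

```python
class GarbageDump:
    """Tracks how many pieces of each waste type are on the dump."""

    def __init__(self):
        # quantity of each recognized waste type (hypothetical layout)
        self.contents = {'cardboard': 0, 'glass': 0, 'metal': 0,
                         'paper': 0, 'plastic': 0}

    def add_waste(self, waste_type, amount=1):
        # called while the emptied truck load is re-sorted
        self.contents[waste_type] += amount

    def show_contents(self):
        for waste_type, count in self.contents.items():
            print(waste_type, count)


dump = GarbageDump()
dump.add_waste('glass')
dump.add_waste('paper', 3)
dump.show_contents()
```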
## Libraries
The following libraries are required to run the program:
```
import os
import numpy as np
import shutil
import cv2
import mahotas
import h5py
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import RandomForestClassifier
import random
```
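os, shutil and random ship with Python; the rest are available from PyPI. Assuming a standard pip setup, an install along these lines should cover them (cv2 is distributed as the opencv-python package):

```shell
pip install numpy opencv-python mahotas h5py scikit-learn
```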