From 367be396a705726be6126a4be0e95bf71e481869 Mon Sep 17 00:00:00 2001
From: Kamila Bobkowska
Date: Sun, 24 May 2020 14:10:14 +0000
Subject: [PATCH] minor adjustments

---
 Report_Klaudia_Przybylska.md | 172 ++++++++++++++++++-----------------
 1 file changed, 87 insertions(+), 85 deletions(-)

diff --git a/Report_Klaudia_Przybylska.md b/Report_Klaudia_Przybylska.md
index e93f373..f769860 100644
--- a/Report_Klaudia_Przybylska.md
+++ b/Report_Klaudia_Przybylska.md
@@ -1,85 +1,87 @@
-# Report - Individual Project Klaudia Przybylska
-## General information
-In our project, our agent - garbage truck is collecting trash from dumpsters on the grid and then bringing it to the garbage dump. However to make sure that it wasn't sorted incorrectly or mixed on the way because the road was bumpy, wastes is checked again before the truck is emptied and is sorted accordingly.
-The program uses Random Forest Classifier to recognize five types of rubbish:
-* cardboard
-* glass
-* metal
-* paper
-* plastic
-Before running the program it is obligatory to unpack "Garbage classifier.rar" and "ClassificationGarbage.rar".
-## Extracting information from images
-In order to use Random Forest Classifier to classify pictures, I used three global feature descriptors:
-* Hu Moments - responsible for capturing information about shapes because they have information about intensity and position of pixels. They are invariant to image transformations (unlike moments or central moments).
-```
-def hu_moments(image):
-    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
-    moments = cv2.moments(gray)
-    huMoments = cv2.HuMoments(moments).flatten()
-    return huMoments
-```
-* Color histogram - representation of the distribution of colors in an image.
-```
-def histogram(image, mask=None):
-    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
-    hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
-    cv2.normalize(hist, hist)
-    histogram = hist.flatten()
-    return histogram
-```
-* Haralick Texture is used to quantify an image based on texture (the consistency of patterns and colors in an image).
-```
-def haralick(image):
-    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
-    haralick = mahotas.features.haralick(gray).mean(axis=0)
-    return haralick
-```
-* All three features are then stacked into one matrix and used in training the classifier, and in the same way for testing it.
-```
-allFeatures = np.hstack([histo, hara, huMoments])
-```
-##Creating test and training sets
-Data is divided between two sets, where training set contains 80% of all data and test set only 20%. Images are randomly shuffled.
-```
-allFileNames = os.listdir(sourceDir)
-np.random.shuffle(allFileNames)
-trainingFileNames, testFileNames = np.split(np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
-```
-##Implementation
-Functions in garbageDumpSorting.py:
-* createSets - divides images between test and training set. This function should be run only once, unless the folders with training and test set are removed,
-```
-trainingFileNames, testFileNames = np.split(np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
-```
-* huMoments, haralick, histogram - calculate global feature descriptors,
-* processTrainData, processTestData - both work in the same way, they iterate over files in train or test directory, saves features as a matrix and then saves results to h5 file, it is recommended to run it only once as it takes some time to finish.
-```
-allFeatures = np.hstack([histo, hara, huMoments])
-```
-* trainAndTest - creates classifier, trains it and scores it,
-```
-clf = RandomForestClassifier(n_estimators=100, max_depth=15, random_state=9)
-```
-* classifyImage - predicts what kind of garbage is visible on a single image,
-```
-prediction = clf.predict(features)[0]
-```
-* sortDump - checks what kinds of trash are inside the garbage truck and their quantity, empties the garbage truck and sorts its contents on the garbage dump.
-
-##Changes in common part
-I created class garbageDump in which I store information about the quantity of trash present on the garbage dump. I had to add a small function to Garbagetruck class in order to remove wastes from the garbage truck. In main I initialize garbage dump and at the end I display its contents.
-
-##Libraries
-The following libraries are required to run the program:
-```
-import os
-import numpy as np
-import shutil
-import cv2
-import mahotas
-import h5py
-from sklearn.preprocessing import LabelEncoder
-from sklearn.preprocessing import MinMaxScaler
-from sklearn.ensemble import RandomForestClassifier
-import random
-```
+# Report - Individual Project Klaudia Przybylska
+## General information
+In our project, our agent - garbage truck is collecting trash from dumpsters on the grid and then bringing it to the garbage dump. However to make sure that it wasn't sorted incorrectly or mixed on the way because the road was bumpy, wastes is checked again before the truck is emptied and is sorted accordingly.
+The program uses Random Forest Classifier to recognize five types of rubbish:
+* cardboard
+* glass
+* metal
+* paper
+* plastic
+
+
+Before running the program it is obligatory to unpack "Garbage classifier.rar" and "ClassificationGarbage.rar".
+## Extracting information from images
+In order to use Random Forest Classifier to classify pictures, I used three global feature descriptors:
+* Hu Moments - responsible for capturing information about shapes because they have information about intensity and position of pixels. They are invariant to image transformations (unlike moments or central moments).
+```
+def hu_moments(image):
+    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+    moments = cv2.moments(gray)
+    huMoments = cv2.HuMoments(moments).flatten()
+    return huMoments
+```
+* Color histogram - representation of the distribution of colors in an image.
+```
+def histogram(image, mask=None):
+    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
+    hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
+    cv2.normalize(hist, hist)
+    histogram = hist.flatten()
+    return histogram
+```
+* Haralick Texture is used to quantify an image based on texture (the consistency of patterns and colors in an image).
+```
+def haralick(image):
+    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+    haralick = mahotas.features.haralick(gray).mean(axis=0)
+    return haralick
+```
+* All three features are then stacked into one matrix and used in training the classifier, and in the same way for testing it.
+```
+allFeatures = np.hstack([histo, hara, huMoments])
+```
+## Creating test and training sets
+Data is divided between two sets, where training set contains 80% of all data and test set only 20%. Images are randomly shuffled.
+```
+allFileNames = os.listdir(sourceDir)
+np.random.shuffle(allFileNames)
+trainingFileNames, testFileNames = np.split(np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
+```
+## Implementation
+Functions in garbageDumpSorting.py:
+* createSets - divides images between test and training set. This function should be run only once, unless the folders with training and test set are removed,
+```
+trainingFileNames, testFileNames = np.split(np.array(allFileNames), [int(len(allFileNames) * (1 - testRatio))])
+```
+* huMoments, haralick, histogram - calculate global feature descriptors,
+* processTrainData, processTestData - both work in the same way, they iterate over files in train or test directory, saves features as a matrix and then saves results to h5 file, it is recommended to run it only once as it takes some time to finish.
+```
+allFeatures = np.hstack([histo, hara, huMoments])
+```
+* trainAndTest - creates classifier, trains it and scores it,
+```
+clf = RandomForestClassifier(n_estimators=100, max_depth=15, random_state=9)
+```
+* classifyImage - predicts what kind of garbage is visible on a single image,
+```
+prediction = clf.predict(features)[0]
+```
+* sortDump - checks what kinds of trash are inside the garbage truck and their quantity, empties the garbage truck and sorts its contents on the garbage dump.
+
+## Changes in common part
+I created class garbageDump in which I store information about the quantity of trash present on the garbage dump. I had to add a small function to Garbagetruck class in order to remove wastes from the garbage truck. In main I initialize garbage dump and at the end I display its contents.
+
+## Libraries
+The following libraries are required to run the program:
+```
+import os
+import numpy as np
+import shutil
+import cv2
+import mahotas
+import h5py
+from sklearn.preprocessing import LabelEncoder
+from sklearn.preprocessing import MinMaxScaler
+from sklearn.ensemble import RandomForestClassifier
+import random
+```
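
The report's "Creating test and training sets" section shuffles the file names and cuts the list at `int(len(allFileNames) * (1 - testRatio))`. The snippet below is a minimal sketch of that same shuffle-and-cut step, not the project's actual `createSets`. It swaps NumPy's `np.random.shuffle` and `np.split` for the standard-library `random` module so it runs without extra dependencies. The function name `split_file_names`, its `seed` parameter, and the example file names are hypothetical and do not come from the repository.

```python
import random


def split_file_names(file_names, test_ratio=0.2, seed=None):
    """Shuffle file names and split them into (training, test) lists.

    Hypothetical helper: mirrors the split index used in the report,
    int(len(names) * (1 - test_ratio)), but uses the stdlib `random`
    module instead of NumPy.
    """
    if not 0.0 <= test_ratio <= 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = list(file_names)            # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    split_index = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:split_index], shuffled[split_index:]


# Example file names are made up for illustration.
names = [f"image_{i}.jpg" for i in range(10)]
training, test = split_file_names(names, test_ratio=0.2, seed=9)
print(len(training), len(test))
```

Passing a fixed `seed` makes the split repeatable, which may help when the training and test folders have to be regenerated.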