diff --git a/Report_Klaudia_Przybylska.md b/Report_Klaudia_Przybylska.md
index f769860..d5a3dfb 100644
--- a/Report_Klaudia_Przybylska.md
+++ b/Report_Klaudia_Przybylska.md
@@ -13,6 +13,7 @@ Before running the program it is obligatory to unpack "Garbage classifier.rar" a
 ## Extracting information from images
 In order to use Random Forest Classifier to classify pictures, I used three global feature descriptors:
 * Hu Moments - responsible for capturing information about shapes because they have information about intensity and position of pixels. They are invariant to image transformations (unlike moments or central moments).
+
 ```
 def hu_moments(image):
     gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
@@ -20,7 +21,9 @@ def hu_moments(image):
     huMoments = cv2.HuMoments(moments).flatten()
     return huMoments
 ```
+
 * Color histogram - representation of the distribution of colors in an image.
+
 ```
 def histogram(image, mask=None):
     image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
@@ -30,15 +33,19 @@ def histogram(image, mask=None):
     return histogram
 ```
 * Haralick Texture is used to quantify an image based on texture (the consistency of patterns and colors in an image).
+
 ```
 def haralick(image):
     gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
     haralick = mahotas.features.haralick(gray).mean(axis=0)
     return haralick
+
 ```
 * All three features are then stacked into one matrix and used in training the classifier, and in the same way for testing it.
+
 ```
 allFeatures = np.hstack([histo, hara, huMoments])
+
 ```
 ## Creating test and training sets
 Data is divided between two sets, where training set contains 80% of all data and test set only 20%. Images are randomly shuffled.
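
The splitting code itself is not part of this hunk, so the following is only a minimal sketch of the shuffled 80/20 split described in the last context line. The function name `split_train_test`, the `seed` parameter, and the placeholder samples are hypothetical, and it uses the standard-library `random` module rather than whatever the project actually calls.

```python
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle samples and split them into (train, test) lists.

    Hypothetical helper: the report's real splitting code is not shown.
    """
    shuffled = list(samples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible shuffle
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder (feature_vector, label) pairs standing in for the stacked
# features; the values and labels here are arbitrary.
samples = [([float(i)], i % 3) for i in range(100)]
train, test = split_train_test(samples)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible between runs, which makes classifier scores comparable; passing a different seed gives a different shuffle.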