Kamila Bobkowska 2020-05-24 14:15:54 +00:00
parent 367be396a7
commit cd70306d93

@ -13,6 +13,7 @@ Before running the program it is obligatory to unpack "Garbage classifier.rar" a
## Extracting information from images
To classify pictures with a Random Forest Classifier, I used three global feature descriptors:
* Hu Moments - capture shape information, because they are computed from the intensity and position of pixels. They are invariant to translation, scale, and rotation (unlike raw moments or central moments).
```
def hu_moments(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
@ -20,7 +21,9 @@ def hu_moments(image):
    huMoments = cv2.HuMoments(moments).flatten()
    return huMoments
```
* Color histogram - representation of the distribution of colors in an image.
```
def histogram(image, mask=None):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
@ -30,15 +33,19 @@ def histogram(image, mask=None):
    return histogram
```
* Haralick Texture - quantifies an image based on its texture (the consistency of patterns and colors in an image).
```
def haralick(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    haralick = mahotas.features.haralick(gray).mean(axis=0)
    return haralick
```
* All three descriptors are then stacked into a single feature vector per image. The same stacking is used both when training the classifier and when testing it.
```
allFeatures = np.hstack([histo, hara, huMoments])
```
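To make the stacking step concrete, here is a minimal sketch that uses placeholder arrays instead of real descriptors. The lengths are assumptions for illustration: 7 values for the Hu moments, 13 for the averaged Haralick features, and a hypothetical 512-bin histogram (the actual bin count depends on how `histogram` is configured).

```python
import numpy as np

# Placeholder descriptors standing in for real outputs (lengths are assumptions).
histo = np.zeros(512)     # hypothetical: 8 x 8 x 8 HSV bins, flattened
hara = np.zeros(13)       # Haralick features averaged over directions
huMoments = np.zeros(7)   # the seven Hu moments

# np.hstack concatenates 1-D arrays end to end into one 1-D vector.
allFeatures = np.hstack([histo, hara, huMoments])
print(allFeatures.shape)

# One such vector per image can then be collected row by row into a 2-D matrix.
featureMatrix = np.vstack([allFeatures, allFeatures])
print(featureMatrix.shape)
```

Each image contributes one row, so the matrix has one row per image and one column per feature value.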
## Creating test and training sets
The data is divided into two sets: the training set contains 80% of the data and the test set the remaining 20%. Images are shuffled randomly.
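
A shuffled 80/20 split can be sketched as follows. This is only an illustration using NumPy, not necessarily how the project does it; the helper name `split_dataset`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def split_dataset(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and test parts."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    # Random ordering of sample indices, shared by features and labels.
    order = rng.permutation(len(features))
    n_test = int(round(len(features) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]

# Toy data: 10 samples with 2 feature values each (made up for the example).
X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = split_dataset(X, y)
print(len(X_train), len(X_test))
```

Because the same index permutation is applied to both arrays, every feature row stays paired with its label. If scikit-learn is available, `train_test_split(features, labels, test_size=0.2)` performs the same kind of shuffled split.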