forked from s425077/PotatoPlan

Fixing documentation

BOTLester 2020-05-10 23:29:40 +02:00
parent a90602b9c6
commit aa2ec07c35


@ -7,7 +7,7 @@ Its decision is mostly based on the nutrients in the soil, but also on a few other properties
The dataset is very small; it contains only 100 entries.
There are 7 types of fertilizers, each of which adds a specific amount of nutrients to the soil.
Example:
```csharp
FertilizerType[6] = new Fertilizer
{
ID = 5,
@ -16,14 +16,12 @@ Example:
Phosphorus = 1.77f / 5,
Potassium = 9.5f / 5
};
```
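For context, the `Fertilizer` type used above is presumably a simple container for an ID and the amounts of nutrients added to the soil. A minimal sketch (the actual class in the project may have more members):
```csharp
// Hypothetical sketch of the Fertilizer type; the real class may contain additional members.
public class Fertilizer
{
    public int ID;
    public float Nitrogen;    // amount of Nitrogen added to the soil
    public float Phosphorus;  // amount of Phosphorus added to the soil
    public float Potassium;   // amount of Potassium added to the soil
}
```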
Unfortunately, the nutrient values are not based on real-world data.
That is because, even though the dataset was intended by its creator to be used for classifying fertilizers, it actually records what fertilizer WAS used and what the results of using that fertilizer on a given field will be.
E.g. Urea contains 46% Nitrogen and nothing else, yet in the dataset it was classified as the best fertilizer for fields that already have very high Nitrogen levels. Using it there would lead to oversaturation with Nitrogen and a lack of other nutrients.
So I did some calculations, and Urea now looks like this:
```csharp
FertilizerType[7] = new Fertilizer
{
ID = 6,
@ -33,7 +31,7 @@ So I did some calculations, and Urea now looks like this:
Potassium = 9.5f / 5
};
// an "inverted" and slightly modified counterpart of the real-world version of this fertilizer.
```
## Implementation
@ -41,15 +39,13 @@ I used the Gradient Boosting Decision Tree algorithm for this task due to many features
First, a CSV file is loaded:
```csharp
IDataView trainingDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
path: path,
hasHeader: true,
separatorChar: ',',
allowQuoting: true,
allowSparse: false);
```
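`ModelInput` is presumably a plain class whose properties are mapped to the CSV columns via `LoadColumn` attributes. A minimal sketch, assuming column names and positions (the actual class and column order may differ):
```csharp
using Microsoft.ML.Data;

// Hypothetical sketch of ModelInput; property names, types and column indices are assumptions.
public class ModelInput
{
    [LoadColumn(0)] public float Nitrogen { get; set; }
    [LoadColumn(1)] public float Phosphorus { get; set; }
    [LoadColumn(2)] public float Potassium { get; set; }
    [LoadColumn(3)] public string Soil_Type { get; set; }
    [LoadColumn(4)] public string Crop_Type { get; set; }
    [LoadColumn(5)] public string Fertilizer_Name { get; set; }  // assumed label column
}
```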
Then it is passed to the next function, which will train, evaluate and build the model.
The trainer parameters are also fine-tuned here to prevent overfitting as much as possible by:
- limiting the number of leaves,
@ -59,7 +55,6 @@ while maintaining high accuracy by:
- a low learning rate, combined with
- a high number of iterations.
```csharp
var options = new LightGbmMulticlassTrainer.Options
{
MaximumBinCountPerFeature = 8,
@ -74,11 +69,9 @@ while maintaining high accuracy by:
MaximumTreeDepth = 10
}
};
```
Creating the pipeline for the model:
```csharp
var pipeline = mlContext.Transforms
.Text.FeaturizeText("Soil_TypeF", "Soil_Type")
.Append(mlContext.Transforms.Text.FeaturizeText("Crop_TypeF", "Crop_Type"))
@ -87,16 +80,13 @@ Creating the pipeline for the model:
.AppendCacheCheckpoint(mlContext)
.Append(mlContext.MulticlassClassification.Trainers.LightGbm(options))
.Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));
```
Evaluation of the pipeline is done with 10-fold cross-validation.
The results are as follows:
```
Micro Accuracy: 0.95829
LogLoss Average: 0.100171
LogLoss Reduction: 0.933795
```
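For reference, a minimal sketch of how such 10-fold cross-validation can be run in ML.NET (the label column name and variable names are assumptions):
```csharp
// Hypothetical sketch: 10-fold cross-validation of the pipeline on the loaded data.
// "Label" is an assumed label column name; System.Linq is needed for Average().
var cvResults = mlContext.MulticlassClassification.CrossValidate(
    data: trainingDataView,
    estimator: pipeline,
    numberOfFolds: 10,
    labelColumnName: "Label");

// Average the per-fold metrics, e.g. micro accuracy and log-loss.
double microAccuracy = cvResults.Average(r => r.Metrics.MicroAccuracy);
double logLoss = cvResults.Average(r => r.Metrics.LogLoss);
```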
The model is created and saved for later use, to skip long training and evaluation times.
Later, that model is loaded and a prediction engine is created when the program starts.
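A minimal sketch of what this could look like with the standard ML.NET save/load APIs (the file name, the `ModelOutput` type and the sample input are assumptions):
```csharp
// Fit the pipeline and persist the trained model, so training is not repeated on every run.
ITransformer trainedModel = pipeline.Fit(trainingDataView);
mlContext.Model.Save(trainedModel, trainingDataView.Schema, "FertilizerModel.zip");  // hypothetical file name

// Later, at program start-up: load the model and create a prediction engine.
ITransformer loadedModel = mlContext.Model.Load("FertilizerModel.zip", out DataViewSchema schema);
var predictionEngine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(loadedModel);

// Single predictions can then be made from in-memory inputs describing the current field.
ModelOutput prediction = predictionEngine.Predict(new ModelInput { /* current field values */ });
```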
@ -111,5 +101,5 @@ The production rate value is shown in the UI, and it is also represented by the colour
At 100% the bar will be pure **Green**. Any value below will make the bar more **Red**, while any value above will add **Blue**, eventually turning the bar colour into cyan.
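A rough illustration of how such a colour mapping could be implemented (this is a sketch, not the project's actual code):
```csharp
// Hypothetical illustration of the colour mapping described above.
// productionRate is a fraction, where 1.0f corresponds to 100%.
static (float R, float G, float B) BarColour(float productionRate)
{
    float green = 1.0f;                                           // pure green at exactly 100%
    float red   = Math.Clamp(1.0f - productionRate, 0.0f, 1.0f);  // values below 100% shift towards red
    float blue  = Math.Clamp(productionRate - 1.0f, 0.0f, 1.0f);  // values above 100% add blue, tending towards cyan
    return (red, green, blue);
}
```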
Example:
![Progression Bar](https://git.wmi.amu.edu.pl/s425077/PotatoPlan/src/af0a2bbefa3f4c9d9ba95e9162aca048c2f8072d/example_img.jpg)
![Progression Bar](https://git.wmi.amu.edu.pl/s425077/PotatoPlan/raw/Oskar-ML/example_img.jpg)