MLflow


  • https://mlflow.org/
  • A tool similar to Sacred, which was covered in the previous class
  • A slightly different approach: less intrusion into existing code
  • A more comprehensive solution: 4 components, the first of which offers functionality similar to Sacred
  • Works with "any" language. In practice: Python, R, Java + CLI API + REST API
  • Popular with employers - job-offer search results: 20 offers (https://pl.indeed.com/), 36 offers (LinkedIn). Sacred: 0
  • Integration with numerous libraries and cloud platforms

Components

MLflow consists of four independent components:

  • MLflow Tracking - lets you track changes to parameters, code, and the environment, and their effect on metrics. This functionality is very close to what Sacred provides

  • MLflow Projects - lets you "package" experiment code so that others can easily reproduce it

  • MLflow Models - makes it easier to "package" machine learning models

  • MLflow Model Registry - provides a central place for storing and sharing models, along with tools for versioning them and tracking their lineage.

    These components can be used together or separately.

MLflow Tracking - example

(the training code examples below come from the MLflow tutorial: https://mlflow.org/docs/latest/tutorials-and-examples/tutorial.html)

%%capture null
!pip install mlflow
!pip install scikit-learn
%%writefile IUM_08/examples/sklearn_elasticnet_wine/train.py
# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import os
import warnings
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn

import logging

logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)


def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, check your internet connection. Error: %s", e
        )

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    
    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    #alpha = 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5
    #l1_ratio = 0.5

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)

        tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme

        # Model registry does not work with file store
        if tracking_url_type_store != "file":

            # Register the model
            # There are other ways to use the Model Registry, which depends on the use case,
            # please refer to the doc for more information:
            # https://mlflow.org/docs/latest/model-registry.html#api-workflow
            mlflow.sklearn.log_model(lr, "model", registered_model_name="ElasticnetWineModel")
        else:
            mlflow.sklearn.log_model(lr, "model")
Overwriting IUM_08/examples/sklearn_elasticnet_wine/train.py
### Let's train the model with the default parameter values
! cd ./IUM_08/examples/; python sklearn_elasticnet_wine/train.py
Elasticnet model (alpha=0.500000, l1_ratio=0.500000):
  RMSE: 0.7931640229276851
  MAE: 0.6271946374319586
  R2: 0.10862644997792614
### And once more, this time with different parameter values
! cd ./IUM_08/examples/; for l in {1..9}; do for a in {1..9}; do python sklearn_elasticnet_wine/train.py 0.$a 0.$l; done; done
Elasticnet model (alpha=0.100000, l1_ratio=0.100000):
  RMSE: 0.7128829045893679
  MAE: 0.5462202174984664
  R2: 0.2799376066653344
Elasticnet model (alpha=0.200000, l1_ratio=0.100000):
  RMSE: 0.7268133518615142
  MAE: 0.5586842416161892
  R2: 0.251521166881557
Elasticnet model (alpha=0.300000, l1_ratio=0.100000):
  RMSE: 0.7347397539240514
  MAE: 0.5657315547549873
  R2: 0.23510678899596094
Elasticnet model (alpha=0.400000, l1_ratio=0.100000):
  RMSE: 0.7410782793160982
  MAE: 0.5712718681984227
  R2: 0.22185255063708875
Elasticnet model (alpha=0.500000, l1_ratio=0.100000):
  RMSE: 0.7460550348172179
  MAE: 0.576381895873763
  R2: 0.21136606570632266
Elasticnet model (alpha=0.600000, l1_ratio=0.100000):
  RMSE: 0.7510866447955419
  MAE: 0.5815681289333974
  R2: 0.20069264568704714
Elasticnet model (alpha=0.700000, l1_ratio=0.100000):
  RMSE: 0.7560654760040749
  MAE: 0.5868129921328281
  R2: 0.19006056603695476
Elasticnet model (alpha=0.800000, l1_ratio=0.100000):
  RMSE: 0.7609263702116827
  MAE: 0.5919470003487062
  R2: 0.17961256649282442
Elasticnet model (alpha=0.900000, l1_ratio=0.100000):
  RMSE: 0.7656313758553691
  MAE: 0.5969367233859049
  R2: 0.16943586313742276
Elasticnet model (alpha=0.100000, l1_ratio=0.200000):
  RMSE: 0.7201489594275661
  MAE: 0.5525324524014098
  R2: 0.26518433811823017
Elasticnet model (alpha=0.200000, l1_ratio=0.200000):
  RMSE: 0.7336400911821402
  MAE: 0.5643841279275428
  R2: 0.23739466063584158
Elasticnet model (alpha=0.300000, l1_ratio=0.200000):
  RMSE: 0.7397486012946922
  MAE: 0.5704931175017443
  R2: 0.22464242411894242
Elasticnet model (alpha=0.400000, l1_ratio=0.200000):
  RMSE: 0.7468093030485085
  MAE: 0.5777243300021722
  R2: 0.2097706278632726
Elasticnet model (alpha=0.500000, l1_ratio=0.200000):
  RMSE: 0.7543919979968401
  MAE: 0.5857669727382302
  R2: 0.19364204365178095
Elasticnet model (alpha=0.600000, l1_ratio=0.200000):
  RMSE: 0.7622123676513404
  MAE: 0.5938629318868578
  R2: 0.17683724501340814
Elasticnet model (alpha=0.700000, l1_ratio=0.200000):
  RMSE: 0.7700845840888665
  MAE: 0.6024685725504659
  R2: 0.15974600028150265
Elasticnet model (alpha=0.800000, l1_ratio=0.200000):
  RMSE: 0.7778880968569085
  MAE: 0.6105907461474273
  R2: 0.14263059582492588
Elasticnet model (alpha=0.900000, l1_ratio=0.200000):
  RMSE: 0.7855450337039626
  MAE: 0.6182359127922239
  R2: 0.1256689455181047
Elasticnet model (alpha=0.100000, l1_ratio=0.300000):
  RMSE: 0.7260299544064643
  MAE: 0.5571534327625295
  R2: 0.2531337966130104
Elasticnet model (alpha=0.200000, l1_ratio=0.300000):
  RMSE: 0.7357092639331829
  MAE: 0.5667609266233857
  R2: 0.23308686049079996
Elasticnet model (alpha=0.300000, l1_ratio=0.300000):
  RMSE: 0.7443224557281489
  MAE: 0.5754825491733004
  R2: 0.2150247343683439
Elasticnet model (alpha=0.400000, l1_ratio=0.300000):
  RMSE: 0.7545302211047864
  MAE: 0.5862255018460154
  R2: 0.19334652749043568
Elasticnet model (alpha=0.500000, l1_ratio=0.300000):
  RMSE: 0.7657094552843393
  MAE: 0.597876674089536
  R2: 0.16926645189778677
Elasticnet model (alpha=0.600000, l1_ratio=0.300000):
  RMSE: 0.7774287676055035
  MAE: 0.6102458961382884
  R2: 0.14364282001967787
Elasticnet model (alpha=0.700000, l1_ratio=0.300000):
  RMSE: 0.7876149030178985
  MAE: 0.6208628759605734
  R2: 0.12105524358911324
Elasticnet model (alpha=0.800000, l1_ratio=0.300000):
  RMSE: 0.7972426725990548
  MAE: 0.6310633254738363
  R2: 0.09943554388738107
Elasticnet model (alpha=0.900000, l1_ratio=0.300000):
  RMSE: 0.806653553139972
  MAE: 0.6407940021176486
  R2: 0.07804901733081859
Elasticnet model (alpha=0.100000, l1_ratio=0.400000):
  RMSE: 0.7301757756825391
  MAE: 0.5603782497631705
  R2: 0.24457984004307665
Elasticnet model (alpha=0.200000, l1_ratio=0.400000):
  RMSE: 0.7383379454127179
  MAE: 0.5696920200435643
  R2: 0.22759672468382497
Elasticnet model (alpha=0.300000, l1_ratio=0.400000):
  RMSE: 0.7501603725852
  MAE: 0.5818749078280213
  R2: 0.2026629101382652
Elasticnet model (alpha=0.400000, l1_ratio=0.400000):
  RMSE: 0.7644619587468349
  MAE: 0.5966303605775048
  R2: 0.17197111491474282
Elasticnet model (alpha=0.500000, l1_ratio=0.400000):
  RMSE: 0.7794144864140182
  MAE: 0.6125287339702588
  R2: 0.1392625955410326
Elasticnet model (alpha=0.600000, l1_ratio=0.400000):
  RMSE: 0.7928446872861473
  MAE: 0.626666444473971
  R2: 0.10934405701835759
Elasticnet model (alpha=0.700000, l1_ratio=0.400000):
  RMSE: 0.8064523157995205
  MAE: 0.6407990295001776
  R2: 0.07850896155515663
Elasticnet model (alpha=0.800000, l1_ratio=0.400000):
  RMSE: 0.8200264141399415
  MAE: 0.6539313398770489
  R2: 0.04722706260889009
Elasticnet model (alpha=0.900000, l1_ratio=0.400000):
  RMSE: 0.8317936823364004
  MAE: 0.6647839366878934
  R2: 0.01968654319755092
Elasticnet model (alpha=0.100000, l1_ratio=0.500000):
  RMSE: 0.7308996187375898
  MAE: 0.5615486628017713
  R2: 0.2430813606733676
Elasticnet model (alpha=0.200000, l1_ratio=0.500000):
  RMSE: 0.7415652207304311
  MAE: 0.573067857646195
  R2: 0.22082961765864062
Elasticnet model (alpha=0.300000, l1_ratio=0.500000):
  RMSE: 0.7573787958793151
  MAE: 0.5893143148791096
  R2: 0.18724431943947983
Elasticnet model (alpha=0.400000, l1_ratio=0.500000):
  RMSE: 0.7759342885655987
  MAE: 0.6090076377075831
  R2: 0.14693206734185604
Elasticnet model (alpha=0.500000, l1_ratio=0.500000):
  RMSE: 0.7931640229276851
  MAE: 0.6271946374319586
  R2: 0.10862644997792614
Elasticnet model (alpha=0.600000, l1_ratio=0.500000):
  RMSE: 0.8112953030727291
  MAE: 0.645693705089251
  R2: 0.06740807086129252
Elasticnet model (alpha=0.700000, l1_ratio=0.500000):
  RMSE: 0.8298921852578498
  MAE: 0.6629780128961713
  R2: 0.024163452726365775
Elasticnet model (alpha=0.800000, l1_ratio=0.500000):
  RMSE: 0.8320198635059106
  MAE: 0.6657357030427604
  R2: 0.019153337439844154
Elasticnet model (alpha=0.900000, l1_ratio=0.500000):
  RMSE: 0.8323808561832262
  MAE: 0.6669472047761406
  R2: 0.0183020229672054
Elasticnet model (alpha=0.100000, l1_ratio=0.600000):
  RMSE: 0.7317723392279818
  MAE: 0.5627373693033669
  R2: 0.24127270524006605
Elasticnet model (alpha=0.200000, l1_ratio=0.600000):
  RMSE: 0.7454324777911233
  MAE: 0.5772117261484206
  R2: 0.21268169183406394
Elasticnet model (alpha=0.300000, l1_ratio=0.600000):
  RMSE: 0.7661028672396263
  MAE: 0.5984406933733759
  R2: 0.16841259155853305
Elasticnet model (alpha=0.400000, l1_ratio=0.600000):
  RMSE: 0.787179486885359
  MAE: 0.6210967388389844
  R2: 0.12202678676193257
Elasticnet model (alpha=0.500000, l1_ratio=0.600000):
  RMSE: 0.809739471626647
  MAE: 0.6442565454817458
  R2: 0.07098152823463388
Elasticnet model (alpha=0.600000, l1_ratio=0.600000):
  RMSE: 0.8317884179944764
  MAE: 0.6647524814105722
  R2: 0.019698951776764728
Elasticnet model (alpha=0.700000, l1_ratio=0.600000):
  RMSE: 0.8321519738036909
  MAE: 0.6662086037874676
  R2: 0.018841829895677176
Elasticnet model (alpha=0.800000, l1_ratio=0.600000):
  RMSE: 0.8326350511178233
  MAE: 0.6676630843299566
  R2: 0.01770234373563795
Elasticnet model (alpha=0.900000, l1_ratio=0.600000):
  RMSE: 0.8332048101440411
  MAE: 0.6690717294644856
  R2: 0.016357542209390563
Elasticnet model (alpha=0.100000, l1_ratio=0.700000):
  RMSE: 0.7327938109945942
  MAE: 0.5640101718105491
  R2: 0.23915303116151632
Elasticnet model (alpha=0.200000, l1_ratio=0.700000):
  RMSE: 0.7499835110445395
  MAE: 0.5819389930665501
  R2: 0.20303883413454027
Elasticnet model (alpha=0.300000, l1_ratio=0.700000):
  RMSE: 0.7747136483567111
  MAE: 0.6079678532556209
  R2: 0.14961391810397695
Elasticnet model (alpha=0.400000, l1_ratio=0.700000):
  RMSE: 0.8004478857657858
  MAE: 0.6350378679245181
  R2: 0.09217977708630032
Elasticnet model (alpha=0.500000, l1_ratio=0.700000):
  RMSE: 0.829586285479097
  MAE: 0.6627028304266674
  R2: 0.024882710417618137
Elasticnet model (alpha=0.600000, l1_ratio=0.700000):
  RMSE: 0.8321502650365332
  MAE: 0.6662000872414003
  R2: 0.018845859373919027
Elasticnet model (alpha=0.700000, l1_ratio=0.700000):
  RMSE: 0.832725785743381
  MAE: 0.667898097502809
  R2: 0.017488244494447747
Elasticnet model (alpha=0.800000, l1_ratio=0.700000):
  RMSE: 0.8331825395236181
  MAE: 0.6692175076829847
  R2: 0.016410124803194592
Elasticnet model (alpha=0.900000, l1_ratio=0.700000):
  RMSE: 0.8331069437643933
  MAE: 0.6697424890266508
  R2: 0.016588601539516357
Elasticnet model (alpha=0.100000, l1_ratio=0.800000):
  RMSE: 0.7339712501091269
  MAE: 0.5654097809725043
  R2: 0.23670603806205326
Elasticnet model (alpha=0.200000, l1_ratio=0.800000):
  RMSE: 0.7552646505492441
  MAE: 0.5873472009739388
  R2: 0.19177543499093674
Elasticnet model (alpha=0.300000, l1_ratio=0.800000):
  RMSE: 0.7836957692333741
  MAE: 0.6176788505535867
  R2: 0.12978065429593022
Elasticnet model (alpha=0.400000, l1_ratio=0.800000):
  RMSE: 0.8160164529135189
  MAE: 0.650349905850893
  R2: 0.05652247327326554
Elasticnet model (alpha=0.500000, l1_ratio=0.800000):
  RMSE: 0.8320145539945119
  MAE: 0.6657081587004348
  R2: 0.019165855890777572
Elasticnet model (alpha=0.600000, l1_ratio=0.800000):
  RMSE: 0.8326325509502465
  MAE: 0.6676500690618903
  R2: 0.01770824285088779
Elasticnet model (alpha=0.700000, l1_ratio=0.800000):
  RMSE: 0.8331830329685253
  MAE: 0.6692142378162035
  R2: 0.016408959758236752
Elasticnet model (alpha=0.800000, l1_ratio=0.800000):
  RMSE: 0.8330972295348316
  MAE: 0.669813814205792
  R2: 0.016611535037920344
Elasticnet model (alpha=0.900000, l1_ratio=0.800000):
  RMSE: 0.8330208354420413
  MAE: 0.6704133670619602
  R2: 0.016791878033996177
Elasticnet model (alpha=0.100000, l1_ratio=0.900000):
  RMSE: 0.735314956888905
  MAE: 0.566974647785579
  R2: 0.23390870203034675
Elasticnet model (alpha=0.200000, l1_ratio=0.900000):
  RMSE: 0.7613249071370938
  MAE: 0.593613372674502
  R2: 0.1787529818606436
Elasticnet model (alpha=0.300000, l1_ratio=0.900000):
  RMSE: 0.7940027723712206
  MAE: 0.6284316436541582
  R2: 0.10674024649047587
Elasticnet model (alpha=0.400000, l1_ratio=0.900000):
  RMSE: 0.831784893250733
  MAE: 0.6647313794016759
  R2: 0.019707259905588637
Elasticnet model (alpha=0.500000, l1_ratio=0.900000):
  RMSE: 0.8323747376136406
  MAE: 0.6669171677143245
  R2: 0.018316455219614114
Elasticnet model (alpha=0.600000, l1_ratio=0.900000):
  RMSE: 0.8332063354920289
  MAE: 0.6690618761753936
  R2: 0.01635394069773599
Elasticnet model (alpha=0.700000, l1_ratio=0.900000):
  RMSE: 0.8331078270287657
  MAE: 0.6697360518827573
  R2: 0.016586516302516174
Elasticnet model (alpha=0.800000, l1_ratio=0.900000):
  RMSE: 0.8330212125502486
  MAE: 0.6704102143580977
  R2: 0.016790987837928095
Elasticnet model (alpha=0.900000, l1_ratio=0.900000):
  RMSE: 0.8329464950658837
  MAE: 0.6710843636018047
  R2: 0.01696735695860563
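The nested shell loop above can also be written in Python, which may be easier to extend (for example with a different step size). This is a minimal sketch that is not part of the original tutorial: the helper name `grid_commands` is hypothetical, and it only builds the argument lists, in the same order as the shell loop, without running them.

```python
import itertools


def grid_commands(script="sklearn_elasticnet_wine/train.py", steps=range(1, 10)):
    """Build one `python train.py <alpha> <l1_ratio>` argument list per grid point.

    l1_ratio is the outer loop and alpha the inner one, as in the shell version.
    """
    commands = []
    for l, a in itertools.product(steps, steps):
        commands.append(["python", script, f"0.{a}", f"0.{l}"])
    return commands


commands = grid_commands()
print(len(commands))  # number of grid points (9 x 9)
print(commands[1])    # second run: alpha=0.2, l1_ratio=0.1

# To execute the runs (assumption: same working directory as in the shell cell):
#   import subprocess
#   for argv in commands:
#       subprocess.run(argv, check=True, cwd="IUM_08/examples")
```

Each invocation still starts its own MLflow run, because `train.py` calls `mlflow.start_run()` itself.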
### Information about the experiment runs has been saved in the mlruns directory
! ls -l IUM_08/examples/mlruns/0
total 16
drwxrwxr-x 6 tomek tomek 4096 maj  2 17:07 15918a3901854356933736dfc0935807
drwxrwxr-x 6 tomek tomek 4096 maj  2 16:36 23ae1069b29e4955ac9f3536c71e7ac2
drwxrwxr-x 6 tomek tomek 4096 maj  2 17:07 b7ddb17a37404d7898e105afa5c20287
-rw-rw-r-- 1 tomek tomek  151 maj  2 16:36 meta.yaml
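Each directory listed above corresponds to one run. As a rough illustration of what is stored there, the sketch below reads the parameters and metrics of a single run straight from disk. It rests on assumptions about the local file store layout used by MLflow 1.x: a `params/` subdirectory with one file per parameter, and a `metrics/` subdirectory with one file per metric whose lines look like `<timestamp> <value> <step>`. The helper name `read_run` is hypothetical; for real work the MLflow UI or API is the safer choice.

```python
from pathlib import Path


def read_run(run_dir):
    """Return (params, metrics) dictionaries for one run directory."""
    run_dir = Path(run_dir)

    params = {}
    for param_file in sorted((run_dir / "params").glob("*")):
        # Assumed: the file name is the parameter name, the content its value.
        params[param_file.name] = param_file.read_text().strip()

    metrics = {}
    for metric_file in sorted((run_dir / "metrics").glob("*")):
        lines = metric_file.read_text().splitlines()
        if lines:
            # Assumed line format: "<timestamp> <value> <step>"; keep the last value.
            metrics[metric_file.name] = float(lines[-1].split()[1])

    return params, metrics


# Example call (the run ID is taken from the listing above):
#   read_run("IUM_08/examples/mlruns/0/15918a3901854356933736dfc0935807")
```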
### We can view them in a browser by starting the web interface:
### (this should be run in a regular console; in Jupyter it would block the kernel)
! cd IUM_08/examples/; mlflow ui
[2021-05-10 12:21:16 +0200] [20029] [INFO] Starting gunicorn 20.1.0
[2021-05-10 12:21:16 +0200] [20029] [INFO] Listening at: http://127.0.0.1:5000 (20029)
[2021-05-10 12:21:16 +0200] [20029] [INFO] Using worker: sync
[2021-05-10 12:21:16 +0200] [20030] [INFO] Booting worker with pid: 20030
^C
[2021-05-10 12:22:32 +0200] [20029] [INFO] Handling signal: int
[2021-05-10 12:22:32 +0200] [20030] [INFO] Worker exiting (pid: 20030)

Web interface appearance

Comparing results

Logging

  • Metrics and parameters can be logged, among other ways, through calls to the Python API: mlflow.log_param() and mlflow.log_metric(). More available functions: link

  • These calls should be made within an active run started by mlflow.start_run(), preferably inside a with block:

    with mlflow.start_run():
         
         #[...]
    
         mlflow.log_param("alpha", alpha)
         mlflow.log_param("l1_ratio", l1_ratio)
    
  • Automatic logging is also available for selected libraries: https://mlflow.org/docs/latest/tracking.html#automatic-logging
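The explicit logging calls described above can also be batched. The sketch below uses `mlflow.log_params()` and `mlflow.log_metrics()`, the dictionary-based counterparts of `mlflow.log_param()` and `mlflow.log_metric()`. The wrapper name `log_run` is hypothetical and only illustrates the pattern; it is not taken from the tutorial.

```python
def log_run(params, metrics):
    """Log all parameters and metrics of one experiment in a single MLflow run."""
    # Imported lazily so that this helper can be defined without MLflow installed.
    import mlflow

    with mlflow.start_run():
        mlflow.log_params(params)    # dictionary of parameter name -> value
        mlflow.log_metrics(metrics)  # dictionary of metric name -> float


# Example call with the default-parameter results shown earlier:
#   log_run(
#       {"alpha": 0.5, "l1_ratio": 0.5},
#       {"rmse": 0.7931640229276851, "mae": 0.6271946374319586, "r2": 0.10862644997792614},
#   )
```

Keeping the calls inside the `with` block means the run is closed even when the training code raises an exception.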

MLflow Projects

  • MLflow Projects is a set of conventions and a few tools
  • They make it easier to run experiments

Project configuration

  • The project configuration (specification) is stored in the MLproject file
  • It contains:
    • a reference to the environment in which the experiment is to be run (details):
      • a Docker image name
      • or a path to a conda.yaml file defining the Conda execution environment
    • the parameters with which the experiment can be invoked
    • the commands used to invoke the experiment
%%writefile IUM_08/examples/sklearn_elasticnet_wine/MLproject
name: tutorial

conda_env: conda.yaml # path to the conda.yaml file that defines the environment
    
#docker_env:
#  image: mlflow-docker-example-environment

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"
  test:
    parameters:
      cutoff: {type: float, default: 0}
    command: "python test.py {cutoff}"
Overwriting IUM_08/examples/sklearn_elasticnet_wine/MLproject
%%writefile IUM_08/examples/sklearn_elasticnet_wine/conda.yaml
name: tutorial
channels:
  - defaults
dependencies:
  - python=3.6 # These dependencies will be installed with conda install
  - pip
  - pip: # And these with pip install
    - scikit-learn==0.23.2
    - mlflow>=1.0
Overwriting IUM_08/examples/sklearn_elasticnet_wine/conda.yaml

Docker environment

  • Instead of a Conda environment, we can also give the name of a Docker image in which the experiment should be run.
  • The image is looked up locally first and then on Docker Hub or in another Docker registry
  • The image reference syntax is the same as in Docker commands, e.g. docker pull link
  • You can also specify directories to mount inside the container and values of environment variables to set in the container:
    docker_env:
     image: mlflow-docker-example-environment
     volumes: ["/local/path:/container/mount/path"]
     environment: [["NEW_ENV_VAR", "new_var_value"], "VAR_TO_COPY_FROM_HOST_ENVIRONMENT"]
    

Parameters

  • Specifying parameters in the MLproject file enables their validation and the use of default values

  • Available types:

    • String
    • Float - any number (MLflow validates that the given value is a number)
    • Path - allows passing relative paths (which are converted to absolute ones) to local files, or paths to remote files (e.g. s3://), which are then downloaded locally
    • URI - similar to path, but for distributed file systems
  • Syntax

    parameter_name: {type: data_type, default: value}  # Short syntax
    
    parameter_name:     # Long syntax
       type: data_type
       default: value
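To make the role of defaults and validation more concrete, here is a simplified illustration of how a parameter specification could be resolved and substituted into an entry point command. This is not MLflow's actual implementation: the function name `resolve_parameters` is hypothetical, and only the float type is checked.

```python
def resolve_parameters(spec, user_values):
    """Fill in defaults and validate float parameters (illustration only)."""
    resolved = {}
    for name, info in spec.items():
        if name in user_values:
            value = user_values[name]
        elif "default" in info:
            value = info["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
        if info.get("type") == "float":
            float(value)  # raises ValueError when the value is not a number
        resolved[name] = value
    return resolved


# Specification of the "main" entry point from the MLproject file above.
spec = {
    "alpha": {"type": "float", "default": 0.5},
    "l1_ratio": {"type": "float", "default": 0.1},
}
resolved = resolve_parameters(spec, {"alpha": "0.42"})
command = "python train.py {alpha} {l1_ratio}".format(**resolved)
print(command)  # python train.py 0.42 0.1
```

Here `alpha` comes from the caller, while `l1_ratio` falls back to the default declared in the specification.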
    

Running the project

  • A project can be run with the mlflow run command (documentation)
  • This prepares the environment and runs the experiment inside it
  • By default, the command defined in the main entry point is run. To run a different entry point, use the -e parameter, e.g.:
    mlflow run sklearn_elasticnet_wine -e test
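When such invocations are scripted, it can help to assemble the command line programmatically. The sketch below builds the argument list for `mlflow run` using only the `-e` and `-P` options that appear in this section. The helper name `mlflow_run_command` is hypothetical.

```python
import shlex


def mlflow_run_command(project_uri, entry_point=None, parameters=None):
    """Build the argument list for an `mlflow run` invocation."""
    argv = ["mlflow", "run", project_uri]
    if entry_point is not None:
        argv += ["-e", entry_point]
    for name, value in (parameters or {}).items():
        argv += ["-P", f"{name}={value}"]
    return argv


argv = mlflow_run_command("sklearn_elasticnet_wine", parameters={"alpha": 0.42})
print(shlex.join(argv))  # mlflow run sklearn_elasticnet_wine -P alpha=0.42

# To execute it (assumption: MLflow is installed and IUM_08/examples exists):
#   import subprocess
#   subprocess.run(argv, check=True, cwd="IUM_08/examples")
```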
    
!cd IUM_08/examples/; mlflow run sklearn_elasticnet_wine -P alpha=0.42
2021/05/10 12:39:32 INFO mlflow.utils.conda: === Creating conda environment mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29 ===
Collecting package metadata (repodata.json): done
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: / Ran pip subprocess with arguments:
['/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/tomek/AITech/repo/aitech-ium-private/IUM_08/examples/sklearn_elasticnet_wine/condaenv.xf9x7i2v.requirements.txt']
Pip subprocess output:
Collecting scikit-learn==0.23.2
  Using cached scikit_learn-0.23.2-cp36-cp36m-manylinux1_x86_64.whl (6.8 MB)
Collecting mlflow>=1.0
  Downloading mlflow-1.17.0-py3-none-any.whl (14.2 MB)
Collecting joblib>=0.11
  Using cached joblib-1.0.1-py3-none-any.whl (303 kB)
Collecting scipy>=0.19.1
  Using cached scipy-1.5.4-cp36-cp36m-manylinux1_x86_64.whl (25.9 MB)
Requirement already satisfied: numpy>=1.13.3 in /home/tomek/.local/lib/python3.6/site-packages (from scikit-learn==0.23.2->-r /home/tomek/AITech/repo/aitech-ium-private/IUM_08/examples/sklearn_elasticnet_wine/condaenv.xf9x7i2v.requirements.txt (line 1)) (1.15.4)
Collecting threadpoolctl>=2.0.0
  Using cached threadpoolctl-2.1.0-py3-none-any.whl (12 kB)
Collecting pandas
  Using cached pandas-1.1.5-cp36-cp36m-manylinux1_x86_64.whl (9.5 MB)
Collecting pyyaml
  Using cached PyYAML-5.4.1-cp36-cp36m-manylinux1_x86_64.whl (640 kB)
Collecting gunicorn
  Using cached gunicorn-20.1.0-py3-none-any.whl (79 kB)
Collecting Flask
  Using cached Flask-1.1.2-py2.py3-none-any.whl (94 kB)
Collecting alembic<=1.4.1
  Using cached alembic-1.4.1-py2.py3-none-any.whl
Collecting prometheus-flask-exporter
  Downloading prometheus_flask_exporter-0.18.2.tar.gz (22 kB)
Collecting entrypoints
  Using cached entrypoints-0.3-py2.py3-none-any.whl (11 kB)
Collecting databricks-cli>=0.8.7
  Using cached databricks_cli-0.14.3-py3-none-any.whl
Collecting requests>=2.17.3
  Using cached requests-2.25.1-py2.py3-none-any.whl (61 kB)
Collecting docker>=4.0.0
  Using cached docker-5.0.0-py2.py3-none-any.whl (146 kB)
Collecting sqlalchemy
  Downloading SQLAlchemy-1.4.14-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)
Collecting cloudpickle
  Using cached cloudpickle-1.6.0-py3-none-any.whl (23 kB)
Collecting pytz
  Using cached pytz-2021.1-py2.py3-none-any.whl (510 kB)
Collecting protobuf>=3.6.0
  Downloading protobuf-3.16.0-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
Collecting click>=7.0
  Using cached click-7.1.2-py2.py3-none-any.whl (82 kB)
Collecting sqlparse>=0.3.1
  Using cached sqlparse-0.4.1-py3-none-any.whl (42 kB)
Collecting querystring-parser
  Using cached querystring_parser-1.2.4-py2.py3-none-any.whl (7.9 kB)
Collecting gitpython>=2.1.0
  Using cached GitPython-3.1.14-py3-none-any.whl (159 kB)
Collecting Mako
  Using cached Mako-1.1.4-py2.py3-none-any.whl (75 kB)
Collecting python-editor>=0.3
  Using cached python_editor-1.0.4-py3-none-any.whl (4.9 kB)
Collecting python-dateutil
  Using cached python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Collecting tabulate>=0.7.7
  Using cached tabulate-0.8.9-py3-none-any.whl (25 kB)
Requirement already satisfied: six>=1.10.0 in /home/tomek/.local/lib/python3.6/site-packages (from databricks-cli>=0.8.7->mlflow>=1.0->-r /home/tomek/AITech/repo/aitech-ium-private/IUM_08/examples/sklearn_elasticnet_wine/condaenv.xf9x7i2v.requirements.txt (line 2)) (1.12.0)
Collecting websocket-client>=0.32.0
  Downloading websocket_client-0.59.0-py2.py3-none-any.whl (67 kB)
Collecting gitdb<5,>=4.0.1
  Using cached gitdb-4.0.7-py3-none-any.whl (63 kB)
Collecting smmap<5,>=3.0.1
  Using cached smmap-4.0.0-py2.py3-none-any.whl (24 kB)
Collecting idna<3,>=2.5
  Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting chardet<5,>=3.0.2
  Using cached chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.4-py2.py3-none-any.whl (153 kB)
Requirement already satisfied: certifi>=2017.4.17 in /media/tomek/Linux_data/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/site-packages (from requests>=2.17.3->mlflow>=1.0->-r /home/tomek/AITech/repo/aitech-ium-private/IUM_08/examples/sklearn_elasticnet_wine/condaenv.xf9x7i2v.requirements.txt (line 2)) (2020.12.5)
Collecting greenlet!=0.4.17
  Downloading greenlet-1.1.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (155 kB)
Collecting importlib-metadata
  Using cached importlib_metadata-4.0.1-py3-none-any.whl (16 kB)
Collecting itsdangerous>=0.24
  Using cached itsdangerous-1.1.0-py2.py3-none-any.whl (16 kB)
Collecting Werkzeug>=0.15
  Using cached Werkzeug-1.0.1-py2.py3-none-any.whl (298 kB)
Collecting Jinja2>=2.10.1
  Using cached Jinja2-2.11.3-py2.py3-none-any.whl (125 kB)
Collecting MarkupSafe>=0.23
  Using cached MarkupSafe-1.1.1-cp36-cp36m-manylinux2010_x86_64.whl (32 kB)
Requirement already satisfied: setuptools>=3.0 in /media/tomek/Linux_data/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/site-packages (from gunicorn->mlflow>=1.0->-r /home/tomek/AITech/repo/aitech-ium-private/IUM_08/examples/sklearn_elasticnet_wine/condaenv.xf9x7i2v.requirements.txt (line 2)) (52.0.0.post20210125)
Collecting typing-extensions>=3.6.4
  Using cached typing_extensions-3.10.0.0-py3-none-any.whl (26 kB)
Collecting zipp>=0.5
  Using cached zipp-3.4.1-py3-none-any.whl (5.2 kB)
Collecting prometheus_client
  Using cached prometheus_client-0.10.1-py2.py3-none-any.whl (55 kB)
Building wheels for collected packages: prometheus-flask-exporter
  Building wheel for prometheus-flask-exporter (setup.py): started
  Building wheel for prometheus-flask-exporter (setup.py): finished with status 'done'
  Created wheel for prometheus-flask-exporter: filename=prometheus_flask_exporter-0.18.2-py3-none-any.whl size=17399 sha256=84da5903cdaabc8f667b7b2e3d5f63a3021cab3d4f4fc1981d9d2a3ab5264738
  Stored in directory: /home/tomek/.cache/pip/wheels/15/77/e8/3ca90b66243b0b58d5a5323a3da02cc8c5daf1de7a65141701
Successfully built prometheus-flask-exporter
Installing collected packages: zipp, typing-extensions, MarkupSafe, Werkzeug, urllib3, smmap, Jinja2, itsdangerous, importlib-metadata, idna, greenlet, click, chardet, websocket-client, tabulate, sqlalchemy, requests, pytz, python-editor, python-dateutil, prometheus-client, Mako, gitdb, Flask, threadpoolctl, sqlparse, scipy, querystring-parser, pyyaml, protobuf, prometheus-flask-exporter, pandas, joblib, gunicorn, gitpython, entrypoints, docker, databricks-cli, cloudpickle, alembic, scikit-learn, mlflow
Successfully installed Flask-1.1.2 Jinja2-2.11.3 Mako-1.1.4 MarkupSafe-1.1.1 Werkzeug-1.0.1 alembic-1.4.1 chardet-4.0.0 click-7.1.2 cloudpickle-1.6.0 databricks-cli-0.14.3 docker-5.0.0 entrypoints-0.3 gitdb-4.0.7 gitpython-3.1.14 greenlet-1.1.0 gunicorn-20.1.0 idna-2.10 importlib-metadata-4.0.1 itsdangerous-1.1.0 joblib-1.0.1 mlflow-1.17.0 pandas-1.1.5 prometheus-client-0.10.1 prometheus-flask-exporter-0.18.2 protobuf-3.16.0 python-dateutil-2.8.1 python-editor-1.0.4 pytz-2021.1 pyyaml-5.4.1 querystring-parser-1.2.4 requests-2.25.1 scikit-learn-0.23.2 scipy-1.5.4 smmap-4.0.0 sqlalchemy-1.4.14 sqlparse-0.4.1 tabulate-0.8.9 threadpoolctl-2.1.0 typing-extensions-3.10.0.0 urllib3-1.26.4 websocket-client-0.59.0 zipp-3.4.1

done
#
# To activate this environment, use
#
#     $ conda activate mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29
#
# To deactivate an active environment, use
#
#     $ conda deactivate

2021/05/10 12:40:17 INFO mlflow.projects.utils: === Created directory /tmp/tmpgvcpfml8 for downloading remote URIs passed to arguments of type 'path' ===
2021/05/10 12:40:17 INFO mlflow.projects.backend.local: === Running command 'source /home/tomek/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29 1>&2 && python train.py 0.42 0.1' in run with ID 'b9b3795a2898495d95c650bafc0dcc76' === 
ERROR:__main__:Unable to download training & test CSV, check your internet connection. Error: <urlopen error [Errno 110] Connection timed out>
Traceback (most recent call last):
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 1349, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/http/client.py", line 1287, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/http/client.py", line 1333, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/http/client.py", line 1282, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/http/client.py", line 1042, in _send_output
    self.send(msg)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/http/client.py", line 980, in send
    self.connect()
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/http/client.py", line 952, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/socket.py", line 724, in create_connection
    raise err
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/socket.py", line 713, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 40, in <module>
    data = pd.read_csv(csv_url, sep=";")
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/site-packages/pandas/io/parsers.py", line 437, in _read
    filepath_or_buffer, encoding, compression
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/site-packages/pandas/io/common.py", line 183, in get_filepath_or_buffer
    req = urlopen(filepath_or_buffer)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/site-packages/pandas/io/common.py", line 137, in urlopen
    return urllib.request.urlopen(*args, **kwargs)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 1377, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/home/tomek/miniconda3/envs/mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29/lib/python3.6/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>
Traceback (most recent call last):
  File "train.py", line 47, in <module>
    train, test = train_test_split(data)
NameError: name 'data' is not defined
2021/05/10 12:42:29 ERROR mlflow.cli: === Run (ID 'b9b3795a2898495d95c650bafc0dcc76') failed ===
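The run above failed for two linked reasons. The download of the CSV file timed out, and because train.py only logs that error, execution continued to train_test_split(data) and ended with a NameError. A fail-fast loader is one possible way to surface the root cause directly. The following is a minimal sketch, with the hypothetical helper name `load_data`, that logs the problem and then re-raises it.

```python
import logging

logger = logging.getLogger(__name__)


def load_data(read_fn, url):
    """Load the training data or abort; never continue without it."""
    try:
        return read_fn(url)
    except Exception:
        logger.exception("Unable to download training & test CSV from %s", url)
        raise  # propagate the original error instead of continuing without data


# In train.py this could be used as follows (assuming pandas is imported as pd):
#   data = load_data(lambda u: pd.read_csv(u, sep=";"), csv_url)
```

With this change the run would stop at the URLError, so the log would end with the actual network problem rather than a secondary NameError.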

Tasks [10 pts] (due 16 May, 12:00)

  1. Add logging of parameters and metrics to your project using MLflow (the mlflow.log_param and mlflow.log_metric calls)
  2. Add an MLproject file defining the commands for training and testing, their invocation parameters, and the environment (use the previously defined Docker image)