ium/IUM_08.MLFlow.ipynb
2024-04-23 14:55:50 +02:00

Machine Learning Engineering

24 April 2024

8. MLflow

MLflow


  • https://mlflow.org/
  • A tool similar to Sacred, which was covered in the previous class
  • A slightly different approach: less intrusion into existing code
  • A more comprehensive solution: 4 components, the first of which offers functionality similar to Sacred
  • Works with "any" language. In practice: Python, R, Java + CLI API + REST API
  • Popular with employers - job-offer search results: 20 offers (https://pl.indeed.com/), 36 offers (LinkedIn). Sacred: 0
  • Integrates with numerous libraries / clouds
  • An open-source solution, created by Databricks
  • Several editions / installation options are available:
    • paid:
      • Databricks Customers
    • free:
      • Databricks Community Edition
      • Self-managed MLflow
      • Local Tracking Server
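
Because the REST API is plain HTTP, any language with an HTTP client can talk to a tracking server. The minimal sketch below only builds the URL for looking up an experiment by name; the endpoint path and the experiment name `s123456` are the ones that appear in the client traceback later in this notebook, while the helper name `experiment_lookup_url` is a hypothetical one chosen for illustration.

```python
from urllib.parse import urlencode


def experiment_lookup_url(tracking_uri: str, experiment_name: str) -> str:
    """Build the REST URL that looks up an experiment by its name."""
    # Endpoint path as it appears in the MLflow client traceback in this notebook.
    endpoint = "/api/2.0/mlflow/experiments/get-by-name"
    query = urlencode({"experiment_name": experiment_name})
    return f"{tracking_uri.rstrip('/')}{endpoint}?{query}"


url = experiment_lookup_url("http://localhost:5000", "s123456")
print(url)
```

Sending a GET request to this URL (for example with curl) should return the experiment as JSON, provided a tracking server is actually listening at that address.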

Components

MLflow consists of four independent components:

  • MLflow Tracking - tracks changes to parameters, code and environment, and their effect on metrics. This functionality is very close to what Sacred provides

  • MLflow Projects - "packages" experiment code so that others can easily reproduce it

  • MLflow Models - makes it easier to "package" machine learning models

  • MLflow Model Registry - provides a central place to store and share models, along with tools for versioning them and tracking their lineage.

    These components can be used together or separately.

MLflow Tracking - example

(The training code examples below come from the MLflow tutorial.)

%%capture null
!pip install mlflow
!pip install scikit-learn
!mkdir -p IUM_08/examples/sklearn_elasticnet_wine/
%%writefile IUM_08/examples/sklearn_elasticnet_wine/train.py
# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import os
import warnings
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn

import logging

logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("s123456")

def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, check your internet connection. Error: %s", e
        )
        # Without the data there is nothing to train on, so stop here
        raise

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    
    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    #alpha = 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5
    #l1_ratio = 0.5

    with mlflow.start_run() as run:
        print("MLflow run experiment_id: {0}".format(run.info.experiment_id))
        print("MLflow run artifact_uri: {0}".format(run.info.artifact_uri))

        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)
        
        # Infer model signature to log it
        # More on signatures: https://mlflow.org/docs/latest/models.html?highlight=signature#model-signature
        signature = mlflow.models.signature.infer_signature(train_x, lr.predict(train_x))

        tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme

        # Model registry does not work with file store
        if tracking_url_type_store != "file":

            # Register the model
            # There are other ways to use the Model Registry, which depends on the use case,
            # please refer to the doc for more information:
            # https://mlflow.org/docs/latest/model-registry.html#api-workflow
            mlflow.sklearn.log_model(lr, "wines-model", registered_model_name="ElasticnetWineModel", signature=signature)
        else:
            mlflow.sklearn.log_model(lr, "model", signature=signature)
Writing IUM_08/examples/sklearn_elasticnet_wine/train.py
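
Once a run has logged a model, it is usually addressed through a model URI. The sketch below builds the two common forms, `runs:/<run_id>/<artifact_path>` and `models:/<name>/<version>`, using the artifact path and registered model name from the script above. The helper names and the run ID are hypothetical placeholders; a real run ID comes from `run.info.run_id`.

```python
def run_model_uri(run_id: str, artifact_path: str) -> str:
    """URI of a model logged as an artifact of a specific run."""
    return f"runs:/{run_id}/{artifact_path}"


def registry_model_uri(name: str, version: int) -> str:
    """URI of a specific version of a model in the Model Registry."""
    return f"models:/{name}/{version}"


# "0123456789abcdef" is a made-up run ID, used only for illustration.
print(run_model_uri("0123456789abcdef", "wines-model"))
print(registry_model_uri("ElasticnetWineModel", 1))
```

Either string can then be passed to a loader such as `mlflow.pyfunc.load_model`, assuming the tracking server that stores the model is reachable.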
! ls -l /tmp/mlruns
### Let's train the model with the default parameter values
! cd ./IUM_08/examples/; python sklearn_elasticnet_wine/train.py
ls: cannot access '/tmp/mlruns': No such file or directory
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=4, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=3, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7580>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7730>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b78e0>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7a90>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 400, in request
    self.endheaders()
  File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 238, in connect
    self.sock = self._new_conn()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 213, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  [Previous line repeated 2 more times]
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 128, in http_request
    return _get_http_response_with_retries(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/request_utils.py", line 228, in _get_http_response_with_retries
    return session.request(method, url, allow_redirects=allow_redirects, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module>
    mlflow.set_experiment("s123456")
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/fluent.py", line 143, in set_experiment
    experiment = client.get_experiment_by_name(experiment_name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/client.py", line 544, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 286, in call_endpoint
    response = http_request(**call_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 151, in http_request
    raise MlflowException(f"API request to {url} failed with exception {e}")
mlflow.exceptions.MlflowException: API request to http://localhost:5000/api/2.0/mlflow/experiments/get-by-name failed with exception HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused'))
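
The traceback above ends in `Connection refused`: nothing was listening on `localhost:5000` when `mlflow.set_experiment` was called. One possible guard, sketched here with the standard library only, is to probe the address first and fall back to a local file store when the server is down. The function names and the fallback URI `file:///tmp/mlruns` are assumptions made for this illustration and are not part of the original script.

```python
import socket
from urllib.parse import urlparse


def tracking_server_reachable(uri: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the tracking URI's host can be opened."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("http", "https"):
        # Non-HTTP stores (e.g. a local file store) need no running server.
        return True
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and DNS failures.
        return False


def choose_tracking_uri(preferred: str, fallback: str = "file:///tmp/mlruns") -> str:
    """Pick the preferred URI if its server answers, otherwise the fallback."""
    return preferred if tracking_server_reachable(preferred) else fallback


print(choose_tracking_uri("http://localhost:5000"))
```

In the training script, the result could be passed to `mlflow.set_tracking_uri` in place of the hard-coded address. Note that the script checks the URI scheme, so with a `file` URI it logs the model without registering it.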
### And once more, this time with different parameter values
! cd ./IUM_08/examples/; for l in {1..9}; do for a in {1..9}; do python sklearn_elasticnet_wine/train.py 0.$a 0.$l; done; done
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=4, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8f130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=3, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8f580>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8f730>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8f8e0>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8fa90>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 400, in request
    self.endheaders()
  File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 238, in connect
    self.sock = self._new_conn()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 213, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fa7bbd8fc40>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  [Previous line repeated 2 more times]
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8fc40>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 128, in http_request
    return _get_http_response_with_retries(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/request_utils.py", line 228, in _get_http_response_with_retries
    return session.request(method, url, allow_redirects=allow_redirects, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8fc40>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module>
    mlflow.set_experiment("s123456")
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/fluent.py", line 143, in set_experiment
    experiment = client.get_experiment_by_name(experiment_name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/client.py", line 544, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 286, in call_endpoint
    response = http_request(**call_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 151, in http_request
    raise MlflowException(f"API request to {url} failed with exception {e}")
mlflow.exceptions.MlflowException: API request to http://localhost:5000/api/2.0/mlflow/experiments/get-by-name failed with exception HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8fc40>: Failed to establish a new connection: [Errno 111] Connection refused'))
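
The nested shell loop above launches one training run per (alpha, l1_ratio) pair, 9 × 9 = 81 runs in total. As a rough sketch, the same grid can be built in Python; the variable names are illustrative, and the commands are only constructed here, not executed.

```python
from itertools import product

# The values 0.1 ... 0.9, kept as strings exactly as the shell loop passes them.
values = [f"0.{i}" for i in range(1, 10)]

# Outer loop over l1_ratio, inner loop over alpha, matching the shell loop order.
commands = [
    ["python", "sklearn_elasticnet_wine/train.py", alpha, l1_ratio]
    for l1_ratio, alpha in product(values, values)
]

print(len(commands))
print(commands[0])
print(commands[1])
```

Each entry could be handed to `subprocess.run(cmd, check=True, cwd="IUM_08/examples")`; with `check=True` the sweep would stop at the first failing run instead of repeating the same error for every combination.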
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=4, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=3, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63580>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63730>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b638e0>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63a90>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 400, in request
    self.endheaders()
  File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 238, in connect
    self.sock = self._new_conn()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 213, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f46d5b63c40>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  [Previous line repeated 2 more times]
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63c40>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 128, in http_request
    return _get_http_response_with_retries(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/request_utils.py", line 228, in _get_http_response_with_retries
    return session.request(method, url, allow_redirects=allow_redirects, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63c40>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module>
    mlflow.set_experiment("s123456")
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/fluent.py", line 143, in set_experiment
    experiment = client.get_experiment_by_name(experiment_name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/client.py", line 544, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 286, in call_endpoint
    response = http_request(**call_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 151, in http_request
    raise MlflowException(f"API request to {url} failed with exception {e}")
mlflow.exceptions.MlflowException: API request to http://localhost:5000/api/2.0/mlflow/experiments/get-by-name failed with exception HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f46d5b63c40>: Failed to establish a new connection: [Errno 111] Connection refused'))
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=4, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda74a13130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
^C
Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 400, in request
    self.endheaders()
  File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 238, in connect
    self.sock = self._new_conn()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 213, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fda74a13580>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module>
    mlflow.set_experiment("s123456")
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/fluent.py", line 143, in set_experiment
    experiment = client.get_experiment_by_name(experiment_name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/client.py", line 544, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 286, in call_endpoint
    response = http_request(**call_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 128, in http_request
    return _get_http_response_with_retries(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/request_utils.py", line 228, in _get_http_response_with_retries
    return session.request(method, url, allow_redirects=allow_redirects, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 850, in urlopen
    retries.sleep()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 359, in sleep
    self._sleep_backoff()
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 343, in _sleep_backoff
    time.sleep(backoff)
KeyboardInterrupt
### Information about the experiment runs has been saved in the mlruns directory
! ls -l IUM_08/examples/mlruns/0 | head
total 16
drwxrwxr-x 6 tomek tomek 4096 maj 17 08:43 375cde31bdd44a45a91fd7cee92ebcda
drwxrwxr-x 6 tomek tomek 4096 maj 17 10:38 b395b55b47fc43de876b67f5a4a5dae9
drwxrwxr-x 6 tomek tomek 4096 maj 17 09:15 b3ead42eca964113b29e7e5f8bcb7bb7
-rw-rw-r-- 1 tomek tomek  151 maj 17 08:43 meta.yaml
! ls -l IUM_08/examples/mlruns/0/375cde31bdd44a45a91fd7cee92ebcda
total 20
drwxrwxr-x 3 tomek tomek 4096 maj 17 08:43 artifacts
-rw-rw-r-- 1 tomek tomek  423 maj 17 08:43 meta.yaml
drwxrwxr-x 2 tomek tomek 4096 maj 17 08:43 metrics
drwxrwxr-x 2 tomek tomek 4096 maj 17 08:43 params
drwxrwxr-x 2 tomek tomek 4096 maj 17 08:43 tags
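
The run directory listed above can also be read back without MLflow. The following is a minimal sketch, assuming the local file store layout shown in the listing. It further assumes that each file in params/ holds the parameter value as plain text and that each line of a file in metrics/ has the form "timestamp value step". The helper name read_run is hypothetical.

```python
from pathlib import Path


def read_run(run_dir):
    """Read params and the last logged value of each metric from a run directory."""
    run_dir = Path(run_dir)
    # Assumption: one file per parameter, containing the value as text.
    params = {
        p.name: p.read_text().strip()
        for p in sorted((run_dir / "params").iterdir())
        if p.is_file()
    }
    metrics = {}
    for m in sorted((run_dir / "metrics").iterdir()):
        if not m.is_file():
            continue
        lines = [ln for ln in m.read_text().splitlines() if ln.strip()]
        if lines:
            # Assumption: each line is "<timestamp> <value> <step>".
            metrics[m.name] = float(lines[-1].split()[1])
    return params, metrics
```

For the run shown above this could be called as read_run("IUM_08/examples/mlruns/0/375cde31bdd44a45a91fd7cee92ebcda").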
### We can view them in a browser by starting the web interface:
### (run this in a regular terminal; in Jupyter it would block the kernel)
! cd IUM_08/examples/; mlflow ui
[2021-05-16 17:58:43 +0200] [118029] [INFO] Starting gunicorn 20.1.0
[2021-05-16 17:58:43 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:43 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:44 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:44 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:45 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:45 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:46 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:46 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:47 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:47 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:48 +0200] [118029] [ERROR] Can't connect to ('127.0.0.1', 5000)
Running the mlflow server failed. Please see the logs above for details.

Instance on our server: http://tzietkiewicz.vm.wmi.amu.edu.pl:5000/#/

Web interface appearance

Comparing results

Logging

  • metrics and parameters can be logged, among other ways, through the Python API calls mlflow.log_param() and mlflow.log_metric(). More available functions: link

  • these calls should be made after mlflow.start_run() has been called, preferably inside a with block:

    with mlflow.start_run():
         
         #[...]
    
         mlflow.log_param("alpha", alpha)
         mlflow.log_param("l1_ratio", l1_ratio)
    
  • automatic logging is also available for selected libraries: https://mlflow.org/docs/latest/tracking.html#automatic-logging
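
Logging several values at once can be wrapped in a small helper. This is only a sketch: the helper name log_run and its tracker argument are hypothetical. The tracker is expected to be the mlflow module itself, and only start_run(), log_param() and log_metric() from the text above are called on it.

```python
def log_run(tracker, params, metrics):
    """Log all params and metrics inside a single run.

    `tracker` is expected to be the `mlflow` module; it is passed in as an
    argument so the helper can also be tried out with a stand-in object.
    """
    with tracker.start_run():
        for name, value in params.items():
            tracker.log_param(name, value)
        for name, value in metrics.items():
            tracker.log_metric(name, value)
```

With MLflow installed, this could be called as log_run(mlflow, {"alpha": 0.5, "l1_ratio": 0.1}, {"rmse": 0.74}) after import mlflow.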

MLflow Projects

  • MLflow Projects is a set of conventions and a few tools
  • they make it easier to run experiments

Project configuration

  • The project configuration (specification) is stored in the MLproject file
  • It contains:
    • a reference to the environment in which the experiment is to be run (details):
      • the name of a Docker image
      • or the path to a conda.yaml file defining the Conda execution environment
    • the parameters with which the experiment can be invoked
    • the commands used to run the experiment
%%writefile IUM_08/examples/sklearn_elasticnet_wine/MLproject
name: tutorial

conda_env: conda.yaml  # path to the conda.yaml file defining the environment
    
#docker_env:
#  image: mlflow-docker-example-environment

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"
  test:
    parameters:
      cutoff: {type: float, default: 0}
    command: "python test.py {cutoff}"
Overwriting IUM_08/examples/sklearn_elasticnet_wine/MLproject
%%writefile IUM_08/examples/sklearn_elasticnet_wine/conda.yaml
name: tutorial
channels:
  - defaults
dependencies:
  - python=3.6 # These dependencies will be installed with conda install
  - pip
  - pip: # And these with pip install
    - scikit-learn==0.23.2
    - mlflow>=1.0
Overwriting IUM_08/examples/sklearn_elasticnet_wine/conda.yaml

Docker environment

  • instead of a Conda environment, we can also give the name of a Docker image in which the experiment is to be run
  • the image is looked up locally first, and then on Docker Hub or in another Docker registry
  • the image reference syntax is the same as for Docker commands, e.g. docker pull link
  • We can also specify directories to mount inside the container and values of environment variables to set in the container:
    docker_env:
     image: mlflow-docker-example-environment
     volumes: ["/local/path:/container/mount/path"]
     environment: [["NEW_ENV_VAR", "new_var_value"], "VAR_TO_COPY_FROM_HOST_ENVIRONMENT"]
    

Parameters

  • Specifying parameters in the MLproject file allows them to be validated and given default values

  • Available types:

    • String
    • Float - any number (MLflow validates that the given value is a number)
    • Path - accepts relative paths (converted to absolute ones) to local files, or paths to remote files (e.g. s3://), which are then downloaded locally
    • URI - similar to path, but for distributed file systems
  • Syntax

    parameter_name: {type: data_type, default: value}  # Short syntax
    
    parameter_name:     # Long syntax
       type: data_type
       default: value
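
To illustrate how declared parameters end up in an entry point's command, here is a rough sketch that fills the placeholders of a command with defaults and with values supplied by the user at run time. It only imitates the idea with Python's str.format and is not MLflow's implementation; build_command and its arguments are hypothetical names.

```python
def build_command(template, declared, overrides=None):
    """Fill `{name}` placeholders using declared defaults and user overrides."""
    overrides = overrides or {}
    values = {}
    for name, spec in declared.items():
        value = overrides.get(name, spec.get("default"))
        if value is None:
            raise ValueError(f"missing value for parameter {name!r}")
        if spec.get("type") == "float":
            value = float(value)  # raises ValueError for non-numeric input
        values[name] = value
    return template.format(**values)


# Parameter declarations mirroring the `main` entry point of the MLproject file.
declared = {
    "alpha": {"type": "float", "default": 0.5},
    "l1_ratio": {"type": "float", "default": 0.1},
}
print(build_command("python train.py {alpha} {l1_ratio}", declared, {"alpha": "0.42"}))
```

A non-numeric value for a float parameter fails early, which is the kind of validation that the type declaration is meant to provide.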
    

Running a project

  • A project can be run with the mlflow run command (documentation)
  • This prepares the environment and runs the experiment inside it
  • by default, the command defined in the main entry point is run. To run a different entry point, use the -e option, e.g.:
    mlflow run sklearn_elasticnet_wine -e test
    
  • Parameters are passed to the command with the -P flag
!cd IUM_08/examples/; mlflow run sklearn_elasticnet_wine -P alpha=0.42
2021/05/16 17:59:10 INFO mlflow.projects.utils: === Created directory /tmp/tmprq4mdosv for downloading remote URIs passed to arguments of type 'path' ===
2021/05/16 17:59:10 INFO mlflow.projects.backend.local: === Running command 'source /home/tomek/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29 1>&2 && python train.py 0.42 0.1' in run with ID '1860d321ea1545ff8866e4ba199d1712' === 
Elasticnet model (alpha=0.420000, l1_ratio=0.100000):
  RMSE: 0.7420620899060748
  MAE: 0.5722846717246247
  R2: 0.21978513651550236
2021/05/16 17:59:19 INFO mlflow.projects: === Run (ID '1860d321ea1545ff8866e4ba199d1712') succeeded ===

Tasks [10 pts]

  1. Add parameter and metric logging with MLflow to your project (the mlflow.log_param and mlflow.log_metric calls)
  2. Add an MLproject file defining the commands for training and testing, their invocation parameters, and the environment (Conda or Docker)

MLflow Models

MLflow Models is a convention for saving models that makes them easier to load and use later

Model types ("flavors") supported by MLflow:

  • Python Function (python_function)
  • PyTorch (pytorch)
  • TensorFlow (tensorflow)
  • Keras (keras)
  • Scikit-learn (sklearn)
  • spaCy (spacy)
  • ONNX (onnx)
  • R Function (crate)
  • H2O (h2o)
  • MLeap (mleap)
  • Spark MLlib (spark)
  • MXNet Gluon (gluon)
  • XGBoost (xgboost)
  • LightGBM (lightgbm)
  • CatBoost (catboost)
  • Fastai (fastai)
  • Statsmodels (statsmodels)

Saving a model

An ML model can be saved in MLflow with one of two functions from the package corresponding to the library we use:

  • save_model() - saves the model to disk
  • log_model() - saves the model together with other run information (metrics, parameters). Depending on the "tracking_uri" setting, this can be a local folder under mlruns/ or a path on a remote MLflow server
        mlflow.sklearn.save_model(lr, "my_model")
        mlflow.keras.save_model(lr, "my_model")

Calling save_model() creates a "my_model" directory containing:

  • an MLmodel file with information about the ways the model can be loaded ("flavors") and paths to the files associated with the model, such as:
    • conda.yaml - a description of the environment needed to load the model
    • model.pkl - a file with the serialized model

Only the MLmodel file is MLflow-specific; the rest depends on the particular flavor

ls IUM_08/examples/my_model
conda.yaml  MLmodel  model.pkl
! ls -l IUM_08/examples/mlruns/0/b395b55b47fc43de876b67f5a4a5dae9/artifacts/model
total 12
-rw-rw-r-- 1 tomek tomek 153 maj 17 10:38 conda.yaml
-rw-rw-r-- 1 tomek tomek 958 maj 17 10:38 MLmodel
-rw-rw-r-- 1 tomek tomek 641 maj 17 10:38 model.pkl
# %load IUM_08/examples/mlruns/0/b395b55b47fc43de876b67f5a4a5dae9/artifacts/model/MLmodel
artifact_path: model
flavors:
  python_function:
    env: conda.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    python_version: 3.9.1
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 0.24.2
run_id: b395b55b47fc43de876b67f5a4a5dae9
signature:
  inputs: '[{"name": "fixed acidity", "type": "double"}, {"name": "volatile acidity",
    "type": "double"}, {"name": "citric acid", "type": "double"}, {"name": "residual
    sugar", "type": "double"}, {"name": "chlorides", "type": "double"}, {"name": "free
    sulfur dioxide", "type": "double"}, {"name": "total sulfur dioxide", "type": "double"},
    {"name": "density", "type": "double"}, {"name": "pH", "type": "double"}, {"name":
    "sulphates", "type": "double"}, {"name": "alcohol", "type": "double"}]'
  outputs: '[{"type": "tensor", "tensor-spec": {"dtype": "float64", "shape": [-1]}}]'
utc_time_created: '2021-05-17 08:38:41.749670'
# %load IUM_08/examples/my_model/conda.yaml
channels:
- defaults
- conda-forge
dependencies:
- python=3.9.1
- pip
- pip:
  - mlflow
  - scikit-learn==0.24.2
  - cloudpickle==1.6.0
name: mlflow-env

Additional fields in MLmodel

  • utc_time_created - timestamp of when the model was created
  • run_id - ID of the run that created the model, if the model was saved using MLflow Tracking
  • signature - a description of the model's input and output data in JSON format
  • input_example - an example input accepted by the model. It can be supplied through the input_example parameter of log_model
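
The signature makes it possible to check input data before calling the model. Below is a sketch using only the standard library, assuming the inputs field is a JSON string in the format shown in the MLmodel listing above. The helper missing_columns is hypothetical and not part of MLflow.

```python
import json


def missing_columns(signature_inputs, record):
    """Return the signature column names that are absent from `record`."""
    expected = [col["name"] for col in json.loads(signature_inputs)]
    return [name for name in expected if name not in record]


# Shortened signature in the same format as the `inputs` field of MLmodel.
inputs = '[{"name": "pH", "type": "double"}, {"name": "alcohol", "type": "double"}]'
print(missing_columns(inputs, {"pH": 3.51}))
```

Here the record lacks the alcohol column, so that name is reported as missing.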
import mlflow
import pandas as pd
model = mlflow.sklearn.load_model("IUM_08/examples/mlruns/0/b395b55b47fc43de876b67f5a4a5dae9/artifacts/model")
csv_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
data = pd.read_csv(csv_url, sep=";")
model.predict(data.drop(["quality"], axis=1).head())
array([5.57688397, 5.50664777, 5.52550482, 5.50431125, 5.57688397])

Serving models

!cd IUM_08/examples/; mlflow models --help
Usage: mlflow models [OPTIONS] COMMAND [ARGS]...

  Deploy MLflow models locally.

  To deploy a model associated with a run on a tracking server, set the
  MLFLOW_TRACKING_URI environment variable to the URL of the desired server.

Options:
  --help  Show this message and exit.

Commands:
  build-docker  **EXPERIMENTAL**: Builds a Docker image whose default...
  predict       Generate predictions in json format using a saved MLflow...
  prepare-env   **EXPERIMENTAL**: Performs any preparation necessary to...
  serve         Serve a model saved with MLflow by launching a webserver on...
!cd IUM_08/examples/; mlflow models serve --help
Usage: mlflow models serve [OPTIONS]

  Serve a model saved with MLflow by launching a webserver on the specified
  host and port. The command supports models with the ``python_function`` or
  ``crate`` (R Function) flavor. For information about the input data
  formats accepted by the webserver, see the following documentation:
  https://www.mlflow.org/docs/latest/models.html#built-in-deployment-tools.

  You can make requests to ``POST /invocations`` in pandas split- or record-
  oriented formats.

  Example:

  .. code-block:: bash

      $ mlflow models serve -m runs:/my-run-id/model-path &

      $ curl http://127.0.0.1:5000/invocations -H 'Content-Type:
      application/json' -d '{         "columns": ["a", "b", "c"],
      "data": [[1, 2, 3], [4, 5, 6]]     }'

Options:
  -m, --model-uri URI  URI to the model. A local path, a 'runs:/' URI, or a
                       remote storage URI (e.g., an 's3://' URI). For more
                       information about supported remote URIs for model
                       artifacts, see
                       https://mlflow.org/docs/latest/tracking.html#artifact-
                       stores  [required]

  -p, --port INTEGER   The port to listen on (default: 5000).
  -h, --host HOST      The network address to listen on (default: 127.0.0.1).
                       Use 0.0.0.0 to bind to all addresses if you want to
                       access the tracking server from other machines.

  -w, --workers TEXT   Number of gunicorn worker processes to handle requests
                       (default: 4).

  --no-conda           If specified, will assume that MLmodel/MLproject is
                       running within a Conda environment with the necessary
                       dependencies for the current project instead of
                       attempting to create a new conda environment.

  --install-mlflow     If specified and there is a conda environment to be
                       activated mlflow will be installed into the environment
                       after it has been activated. The version of installed
                       mlflow will be the same asthe one used to invoke this
                       command.

  --help               Show this message and exit.
import pandas as pd
csv_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
data = pd.read_csv(csv_url, sep=";").drop(["quality"], axis=1).head(1).to_json(orient='split')
print(data)
{"columns":["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"],"index":[0],"data":[[7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4]]}
!curl http://127.0.0.1:5003/invocations -H 'Content-Type: application/json' -d '{\
        "columns":[\
            "fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"],\
            "index":[0],\
            "data":[[7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4]]}'
[5.576883967129615]
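
The same request can be prepared from Python. This is a sketch using only the standard library, assuming a model server is listening on the port used in the curl call above; build_request is a hypothetical helper. The payload uses the pandas split format from this example. Newer MLflow releases may expect it wrapped under a "dataframe_split" key, so the exact shape should be treated as version-dependent.

```python
import json
import urllib.request

COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def build_request(columns, rows, url="http://127.0.0.1:5003/invocations"):
    """Build a POST request carrying data in pandas split orientation."""
    payload = json.dumps({"columns": columns, "data": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(
    COLUMNS, [[7.4, 0.7, 0.0, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4]]
)
# Sending requires a running model server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```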
$ cd IUM_08/examples/
$ mlflow models serve -m my_model -p 5003
2021/05/17 08:52:07 INFO mlflow.models.cli: Selected backend for flavor 'python_function'
2021/05/17 08:52:07 INFO mlflow.pyfunc.backend: === Running command 'source /home/tomek/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-503f0c7520a32f054a9d168bd099584a9439de9d 1>&2 && gunicorn --timeout=60 -b 127.0.0.1:5003 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app'
[2021-05-17 08:52:07 +0200] [291217] [INFO] Starting gunicorn 20.1.0
[2021-05-17 08:52:07 +0200] [291217] [INFO] Listening at: http://127.0.0.1:5003 (291217)
[2021-05-17 08:52:07 +0200] [291217] [INFO] Using worker: sync
[2021-05-17 08:52:07 +0200] [291221] [INFO] Booting worker with pid: 291221

MLflow Registry

  • lets us save models to and load them from a central registry
  • Models can also be served directly from the registry:
#!/usr/bin/env sh

# Set environment variable for the tracking URL where the Model Registry resides
export MLFLOW_TRACKING_URI=http://localhost:5000

# Serve the production model from the model registry
mlflow models serve -m "models:/sk-learn-random-forest-reg-model/Production"
  • For this to work, an MLflow server must be running
  • The registry also lets us manage model versions and mark them with stages such as "Staging" and "Production"