Machine Learning Engineering
April 24, 2024
8. MLflow
MLflow
- https://mlflow.org/
- A tool similar to Sacred, which was covered in the previous class
- A slightly different approach: less intrusion into existing code
- A more comprehensive solution: 4 components, the first of which offers functionality similar to Sacred
- Works with "any" language. In practice: Python, R, Java + CLI API + REST API
- Popular among employers - job-offer search results: 20 offers (https://pl.indeed.com/), 36 offers (LinkedIn). Sacred: 0
- Integrates with numerous libraries / clouds
- An open-source solution, created by Databricks
- Several editions / installation options are available:
  - paid:
    - Databricks Customers
  - free:
    - Databricks Community Edition
    - Self-managed MLflow
    - Local Tracking Server
Components
MLflow consists of four independent components:
MLflow Tracking - lets you track changes to parameters, code, and environment, and their effect on metrics. This functionality is very close to what Sacred provides
MLflow Projects - lets you "package" experiment code so that others can easily reproduce it
MLflow Models - makes it easier to "package" machine learning models
MLflow Registry - provides a central place for storing and sharing models, along with tools for versioning them and tracking their lineage.
These components can be used together or separately.
MLflow Tracking - example
(The training code examples below come from the MLflow tutorial.)
%%capture null
!pip install mlflow
!pip install scikit-learn
!mkdir -p IUM_08/examples/sklearn_elasticnet_wine/
%%writefile IUM_08/examples/sklearn_elasticnet_wine/train.py
# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
import os
import warnings
import sys
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn
import logging
logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("s123456")
def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2
if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)
    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, check your internet connection. Error: %s", e
        )
        # Without the data there is nothing to train on
        sys.exit(1)
    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)
    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]
    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    #alpha = 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5
    #l1_ratio = 0.5
    with mlflow.start_run() as run:
        print("MLflow run experiment_id: {0}".format(run.info.experiment_id))
        print("MLflow run artifact_uri: {0}".format(run.info.artifact_uri))
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)
        predicted_qualities = lr.predict(test_x)
        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)
        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print(" RMSE: %s" % rmse)
        print(" MAE: %s" % mae)
        print(" R2: %s" % r2)
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)
        # Infer model signature to log it
        # More on signatures: https://mlflow.org/docs/latest/models.html?highlight=signature#model-signature
        signature = mlflow.models.signature.infer_signature(train_x, lr.predict(train_x))
        tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme
        # Model registry does not work with file store
        if tracking_url_type_store != "file":
            # Register the model
            # There are other ways to use the Model Registry, which depends on the use case,
            # please refer to the doc for more information:
            # https://mlflow.org/docs/latest/model-registry.html#api-workflow
            mlflow.sklearn.log_model(lr, "wines-model", registered_model_name="ElasticnetWineModel", signature=signature)
        else:
            mlflow.sklearn.log_model(lr, "model", signature=signature)
Writing IUM_08/examples/sklearn_elasticnet_wine/train.py
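The script decides whether to register the model by looking at the scheme of the tracking URI. A minimal sketch of that check in isolation, using only the standard library; the helper name `registry_supported` is hypothetical and not part of MLflow:

```python
from urllib.parse import urlparse


def registry_supported(tracking_uri: str) -> bool:
    """Mirror the check in train.py: any scheme other than "file" passes."""
    return urlparse(tracking_uri).scheme != "file"


# An HTTP tracking server, as configured in train.py
print(registry_supported("http://localhost:5000"))  # True
# A local file store
print(registry_supported("file:///tmp/mlruns"))     # False
```

Note that a bare path such as `/tmp/mlruns` parses to an empty scheme, so a check written this way treats it like a remote server rather than a file store.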
! ls -l /tmp/mlruns
### Let's train the model with the default parameter values
! cd ./IUM_08/examples/; python sklearn_elasticnet_wine/train.py
ls: cannot access '/tmp/mlruns': No such file or directory
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=4, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
[...]
Traceback (most recent call last):
[...]
ConnectionRefusedError: [Errno 111] Connection refused
[...]
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused
[...]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused'))
[...]
Traceback (most recent call last):
  File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module>
    mlflow.set_experiment("s123456")
[...]
  File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 151, in http_request
    raise MlflowException(f"API request to {url} failed with exception {e}")
mlflow.exceptions.MlflowException: API request to http://localhost:5000/api/2.0/mlflow/experiments/get-by-name failed with exception HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd9988b7c40>: Failed to establish a new connection: [Errno 111] Connection refused'))
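The traceback above ends in `Connection refused` for `localhost:5000`, which suggests that no tracking server was listening at the address passed to `mlflow.set_tracking_uri`. One way to confirm this before training is a plain TCP probe. The sketch below uses only the standard library; the helper name `tracking_server_reachable` is hypothetical and not part of MLflow:

```python
import socket
from urllib.parse import urlparse


def tracking_server_reachable(uri: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URI's host and port succeeds."""
    parsed = urlparse(uri)
    # Fall back to the default port of the scheme when none is given
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


if __name__ == "__main__":
    uri = "http://localhost:5000"  # the address used in train.py
    print(uri, "reachable:", tracking_server_reachable(uri))
```

If the probe fails, either start a tracking server first (for example with the `mlflow server` command) or point the script at a different tracking URI.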
### And once more, this time with different parameter values
! cd ./IUM_08/examples/; for l in {1..9}; do for a in {1..9}; do python sklearn_elasticnet_wine/train.py 0.$a 0.$l; done; done
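The same parameter sweep can be driven from Python instead of a shell loop. A minimal sketch, assuming it is launched from the `IUM_08/examples` directory like the command above; the function names `grid_commands` and `run_grid` are hypothetical:

```python
import itertools
import subprocess
import sys


def grid_commands(script="sklearn_elasticnet_wine/train.py"):
    """Build one command per (alpha, l1_ratio) pair, each from 0.1 to 0.9."""
    steps = range(1, 10)
    commands = []
    # Outer loop: l1_ratio, inner loop: alpha, matching the shell version
    for l1, a in itertools.product(steps, steps):
        commands.append([sys.executable, script, f"0.{a}", f"0.{l1}"])
    return commands


def run_grid():
    """Run every command in turn; a failed run does not stop the sweep."""
    for cmd in grid_commands():
        subprocess.run(cmd, check=False)


if __name__ == "__main__":
    commands = grid_commands()
    print(len(commands), "runs planned; first arguments:", commands[0][2:])
```

Calling `run_grid()` starts the 81 training runs one after another; building the command list separately makes it easy to inspect the sweep before launching it.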
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=4, read=5, redirect=5, status=5)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8f130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456
[...]
Traceback (most recent call last):
  File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module>
    mlflow.set_experiment("s123456")
[...]
mlflow.exceptions.MlflowException: API request to http://localhost:5000/api/2.0/mlflow/experiments/get-by-name failed with exception HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa7bbd8fc40>: Failed to establish a new connection: [Errno 111] Connection refused'))
[... the same warnings and traceback repeat for each subsequent run ...]
'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda74a13130>: Failed to establish a new connection: [Errno 111] Connection refused')': /api/2.0/mlflow/experiments/get-by-name?experiment_name=s123456 ^C Traceback (most recent call last): File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn sock = connection.create_connection( File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection raise err File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection sock.connect(sa) ConnectionRefusedError: [Errno 111] Connection refused The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen response = self._make_request( File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request conn.request( File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 400, in request self.endheaders() File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output self.send(msg) File "/usr/lib/python3.10/http/client.py", line 976, in send self.connect() File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 238, in connect self.sock = self._new_conn() File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connection.py", line 213, in _new_conn raise NewConnectionError( urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fda74a13580>: Failed to establish a new connection: [Errno 111] Connection refused During handling of the above exception, another exception occurred: Traceback (most 
recent call last): File "/home/pawel/ium/IUM_08/examples/sklearn_elasticnet_wine/train.py", line 24, in <module> mlflow.set_experiment("s123456") File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/fluent.py", line 143, in set_experiment experiment = client.get_experiment_by_name(experiment_name) File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/client.py", line 544, in get_experiment_by_name return self._tracking_client.get_experiment_by_name(name) File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name return self.store.get_experiment_by_name(name) File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name response_proto = self._call_endpoint(GetExperimentByName, req_body) File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto) File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 286, in call_endpoint response = http_request(**call_kwargs) File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/rest_utils.py", line 128, in http_request return _get_http_response_with_retries( File "/home/pawel/ium/venv/lib/python3.10/site-packages/mlflow/utils/request_utils.py", line 228, in _get_http_response_with_retries return session.request(method, url, allow_redirects=allow_redirects, **kwargs) File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request resp = self.send(prep, **send_kwargs) File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send r = adapter.send(request, **kwargs) File "/home/pawel/ium/venv/lib/python3.10/site-packages/requests/adapters.py", line 486, in send resp = conn.urlopen( File 
"/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen return self.urlopen( File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 850, in urlopen retries.sleep() File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 359, in sleep self._sleep_backoff() File "/home/pawel/ium/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 343, in _sleep_backoff time.sleep(backoff) KeyboardInterrupt
### Information about the experiment runs has been saved in the mlruns directory
! ls -l IUM_08/examples/mlruns/0 | head
total 16
drwxrwxr-x 6 tomek tomek 4096 maj 17 08:43 375cde31bdd44a45a91fd7cee92ebcda
drwxrwxr-x 6 tomek tomek 4096 maj 17 10:38 b395b55b47fc43de876b67f5a4a5dae9
drwxrwxr-x 6 tomek tomek 4096 maj 17 09:15 b3ead42eca964113b29e7e5f8bcb7bb7
-rw-rw-r-- 1 tomek tomek  151 maj 17 08:43 meta.yaml
! ls -l IUM_08/examples/mlruns/0/375cde31bdd44a45a91fd7cee92ebcda
total 20
drwxrwxr-x 3 tomek tomek 4096 maj 17 08:43 artifacts
-rw-rw-r-- 1 tomek tomek  423 maj 17 08:43 meta.yaml
drwxrwxr-x 2 tomek tomek 4096 maj 17 08:43 metrics
drwxrwxr-x 2 tomek tomek 4096 maj 17 08:43 params
drwxrwxr-x 2 tomek tomek 4096 maj 17 08:43 tags
### We can view them in a browser by starting the web interface:
### (this should be run in a regular console; in Jupyter it would block the kernel)
! cd IUM_08/examples/; mlflow ui
[2021-05-16 17:58:43 +0200] [118029] [INFO] Starting gunicorn 20.1.0
[2021-05-16 17:58:43 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:43 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:44 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:44 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:45 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:45 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:46 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:46 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:47 +0200] [118029] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2021-05-16 17:58:47 +0200] [118029] [ERROR] Retrying in 1 second.
[2021-05-16 17:58:48 +0200] [118029] [ERROR] Can't connect to ('127.0.0.1', 5000)
Running the mlflow server failed. Please see the logs above for details.
Instance on our server: http://tzietkiewicz.vm.wmi.amu.edu.pl:5000/#/
Logging
- Metrics and parameters can be logged, among other ways, through calls to the Python API: mlflow.log_param() and mlflow.log_metric(). More available functions: link
- These calls should be made inside an active run started with mlflow.start_run(), preferably inside a with block:
    with mlflow.start_run():
        #[...]
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
- Automatic logging is also available for selected libraries: https://mlflow.org/docs/latest/tracking.html#automatic-logging
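The logging calls described above could be combined roughly as in the sketch below. The helper names `rmse` and `log_training_run` and the metric name `"rmse"` are illustrative assumptions, not part of the tutorial code. `mlflow` is imported inside the function only so that the pure-Python helper stays usable without MLflow installed.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error computed in pure Python."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def log_training_run(alpha, l1_ratio, y_true, y_pred):
    """Log two parameters and one metric inside a single MLflow run."""
    import mlflow  # requires `pip install mlflow`

    # The with block closes the run even if an exception is raised inside it.
    with mlflow.start_run():
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse(y_true, y_pred))
```

Calling `log_training_run(0.5, 0.1, y_true, y_pred)` from a training script should produce one run with two parameters and one metric, stored wherever the tracking URI points (a local mlruns directory by default).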
MLflow Projects
- MLflow Projects is a set of conventions and a few tools
- they make it easier to run experiments
Project configuration
- The project configuration (specification) is stored in the MLproject file
- It contains:
  - a reference to the environment in which the experiment is to be run (details):
    - the name of a Docker image
    - or the path to a conda.yaml file defining the Conda execution environment
  - the parameters with which the experiment can be invoked
  - the commands used to invoke the experiment
%%writefile IUM_08/examples/sklearn_elasticnet_wine/MLproject
name: tutorial
conda_env: conda.yaml # path to the conda.yaml file defining the environment
#docker_env:
#  image: mlflow-docker-example-environment
entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"
  test:
    parameters:
      cutoff: {type: float, default: 0}
    command: "python test.py {cutoff}"
Overwriting IUM_08/examples/sklearn_elasticnet_wine/MLproject
Conda environment
- https://docs.conda.io
- Syntax of the conda.yaml files that define an environment: https://docs.conda.io/projects/conda/en/4.6.1/user-guide/tasks/manage-environments.html#create-env-file-manually
- YAML syntax: an accessible introduction, the official specification
%%writefile IUM_08/examples/sklearn_elasticnet_wine/conda.yaml
name: tutorial
channels:
  - defaults
dependencies:
  - python=3.6 # these dependencies will be installed with conda install
  - pip
  - pip: # and these with pip install
    - scikit-learn==0.23.2
    - mlflow>=1.0
Overwriting IUM_08/examples/sklearn_elasticnet_wine/conda.yaml
Docker environment
- instead of a Conda environment, we can also give the name of a Docker image in which the experiment is to be run.
- the image is looked up locally first, and then on DockerHub or in another Docker registry
- the image reference uses the same syntax as Docker commands, e.g. docker pull (link)
- Directories to mount inside the container and values of environment variables to set in the container can also be specified:
docker_env:
  image: mlflow-docker-example-environment
  volumes: ["/local/path:/container/mount/path"]
  environment: [["NEW_ENV_VAR", "new_var_value"], "VAR_TO_COPY_FROM_HOST_ENVIRONMENT"]
Parameters
Declaring parameters in the MLproject file allows them to be validated and given default values
Available types:
- String
- Float - any number (MLflow validates that the given value is a number)
- Path - accepts relative paths to local files (converted to absolute paths) or paths to remote files (e.g. s3://), which are then downloaded locally
- URI - similar to path, but for distributed file systems
- Syntax:
parameter_name: {type: data_type, default: value}  # Short syntax
parameter_name:     # Long syntax
  type: data_type
  default: value
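To make the role of defaults and validation more concrete, the sketch below imitates the idea in plain Python. It is not MLflow's implementation: `resolve_parameters` is a hypothetical helper, and only the float type is checked.

```python
def resolve_parameters(spec, user_values):
    """Apply defaults and check float-typed values, in the spirit of MLproject parameters."""
    resolved = {}
    for name, decl in spec.items():
        if name in user_values:
            value = user_values[name]
        elif "default" in decl:
            value = decl["default"]
        else:
            raise ValueError(f"missing value for parameter '{name}'")
        if decl.get("type") == "float":
            value = float(value)  # raises ValueError for non-numeric input
        resolved[name] = value
    return resolved


# Parameter declarations mirroring the `main` entry point of the MLproject file.
spec = {
    "alpha": {"type": "float", "default": 0.5},
    "l1_ratio": {"type": "float", "default": 0.1},
}
params = resolve_parameters(spec, {"alpha": "0.42"})
command = "python train.py {alpha} {l1_ratio}".format(**params)
print(command)  # python train.py 0.42 0.1
```

Here `alpha` comes from the caller, `l1_ratio` falls back to its default, and a value such as `"abc"` is rejected before any command is built.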
Running the project
- A project can be run with the mlflow run command (documentation)
- This prepares the environment and runs the experiment inside it
- by default, the command defined in the main entry point is run. To run a different entry point, use the -e option, e.g.: mlflow run sklearn_elasticnet_wine -e test
- Parameters can be passed to the command with the -P flag
!cd IUM_08/examples/; mlflow run sklearn_elasticnet_wine -P alpha=0.42
2021/05/16 17:59:10 INFO mlflow.projects.utils: === Created directory /tmp/tmprq4mdosv for downloading remote URIs passed to arguments of type 'path' ===
2021/05/16 17:59:10 INFO mlflow.projects.backend.local: === Running command 'source /home/tomek/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-5987e03d4dbaa5faa1a697bb113be9b9bdc39b29 1>&2 && python train.py 0.42 0.1' in run with ID '1860d321ea1545ff8866e4ba199d1712' ===
Elasticnet model (alpha=0.420000, l1_ratio=0.100000):
  RMSE: 0.7420620899060748
  MAE: 0.5722846717246247
  R2: 0.21978513651550236
2021/05/16 17:59:19 INFO mlflow.projects: === Run (ID '1860d321ea1545ff8866e4ba199d1712') succeeded ===
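When a project has to be launched from a script rather than typed by hand, the same command line can be assembled programmatically. The sketch below uses only the `-e` and `-P` options described above; `build_mlflow_run_command` is a hypothetical helper name.

```python
import shlex


def build_mlflow_run_command(project_uri, entry_point="main", parameters=None):
    """Assemble an `mlflow run` command line from its parts."""
    command = ["mlflow", "run", project_uri]
    if entry_point != "main":
        command += ["-e", entry_point]  # non-default entry point
    for name, value in (parameters or {}).items():
        command += ["-P", f"{name}={value}"]  # one -P flag per parameter
    return command


cmd = build_mlflow_run_command("sklearn_elasticnet_wine", parameters={"alpha": 0.42})
print(shlex.join(cmd))  # mlflow run sklearn_elasticnet_wine -P alpha=0.42
```

The resulting list can be passed to `subprocess.run(cmd, cwd="IUM_08/examples", check=True)`. MLflow also offers a Python entry point, `mlflow.projects.run`, which takes comparable arguments.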
Tasks [10 pts]
- Add logging of parameters and metrics with MLflow to your project (the mlflow.log_param and mlflow.log_metric calls)
- Add an MLproject file that defines the commands for training and testing, their invocation parameters, and the environment (Conda or Docker)
MLflow Models
MLflow Models is a convention for saving models that makes them easier to load and use later
Model types ("flavors") supported by MLflow:
- Python Function (python_function)
- PyTorch (pytorch)
- TensorFlow (tensorflow)
- Keras (keras)
- Scikit-learn (sklearn)
- spaCy (spacy)
- ONNX (onnx)
- R Function (crate)
- H2O (h2o)
- MLeap (mleap)
- Spark MLlib (spark)
- MXNet Gluon (gluon)
- XGBoost (xgboost)
- LightGBM (lightgbm)
- CatBoost (catboost)
- Fastai (fastai)
- Statsmodels (statsmodels)
Saving a model
An ML model can be saved in MLflow using one of two functions from the package that corresponds to the library in use:
- save_model() - saves the model to disk
- log_model() - saves the model together with other information (metrics, parameters). Depending on the "tracking_uri" setting, the destination is either a local folder under mlruns/ or a path on a remote MLflow server
mlflow.sklearn.save_model(lr, "my_model")
mlflow.keras.save_model(lr, "my_model")
Calling this function creates a "my_model" directory containing:
- an MLmodel file describing the ways in which the model can be loaded ("flavors") and the paths to the files associated with the model, such as:
  - conda.yaml - a description of the environment needed to load the model
  - model.pkl - a file with the serialized model
Only the MLmodel file is specific to MLflow; the rest depends on the particular flavor
ls IUM_08/examples/my_model
conda.yaml MLmodel model.pkl
! ls -l IUM_08/examples/mlruns/0/b395b55b47fc43de876b67f5a4a5dae9/artifacts/model
total 12
-rw-rw-r-- 1 tomek tomek 153 maj 17 10:38 conda.yaml
-rw-rw-r-- 1 tomek tomek 958 maj 17 10:38 MLmodel
-rw-rw-r-- 1 tomek tomek 641 maj 17 10:38 model.pkl
# %load IUM_08/examples/mlruns/0/b395b55b47fc43de876b67f5a4a5dae9/artifacts/model/MLmodel
artifact_path: model
flavors:
  python_function:
    env: conda.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    python_version: 3.9.1
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 0.24.2
run_id: b395b55b47fc43de876b67f5a4a5dae9
signature:
  inputs: '[{"name": "fixed acidity", "type": "double"}, {"name": "volatile acidity",
    "type": "double"}, {"name": "citric acid", "type": "double"}, {"name": "residual
    sugar", "type": "double"}, {"name": "chlorides", "type": "double"}, {"name": "free
    sulfur dioxide", "type": "double"}, {"name": "total sulfur dioxide", "type": "double"},
    {"name": "density", "type": "double"}, {"name": "pH", "type": "double"}, {"name":
    "sulphates", "type": "double"}, {"name": "alcohol", "type": "double"}]'
  outputs: '[{"type": "tensor", "tensor-spec": {"dtype": "float64", "shape": [-1]}}]'
utc_time_created: '2021-05-17 08:38:41.749670'
# %load IUM_08/examples/my_model/conda.yaml
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.9.1
  - pip
  - pip:
    - mlflow
    - scikit-learn==0.24.2
    - cloudpickle==1.6.0
name: mlflow-env
Additional fields in MLmodel
- utc_time_created - timestamp of when the model was created
- run_id - ID of the run that created the model, if the model was saved using MLflow Tracking.
- signature - a description of the input and output data in JSON format
- input_example - an example input accepted by the model. It can be provided through the input_example parameter of the log_model function
import mlflow
import pandas as pd
model = mlflow.sklearn.load_model("IUM_08/examples/mlruns/0/b395b55b47fc43de876b67f5a4a5dae9/artifacts/model")
csv_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
data = pd.read_csv(csv_url, sep=";")
model.predict(data.drop(["quality"], axis=1).head())
array([5.57688397, 5.50664777, 5.52550482, 5.50431125, 5.57688397])
Serving models
!cd IUM_08/examples/; mlflow models --help
Usage: mlflow models [OPTIONS] COMMAND [ARGS]...

  Deploy MLflow models locally.

  To deploy a model associated with a run on a tracking server, set the
  MLFLOW_TRACKING_URI environment variable to the URL of the desired server.

Options:
  --help  Show this message and exit.

Commands:
  build-docker  **EXPERIMENTAL**: Builds a Docker image whose default...
  predict       Generate predictions in json format using a saved MLflow...
  prepare-env   **EXPERIMENTAL**: Performs any preparation necessary to...
  serve         Serve a model saved with MLflow by launching a webserver on...
!cd IUM_08/examples/; mlflow models serve --help
Usage: mlflow models serve [OPTIONS]

  Serve a model saved with MLflow by launching a webserver on the specified
  host and port. The command supports models with the ``python_function`` or
  ``crate`` (R Function) flavor. For information about the input data formats
  accepted by the webserver, see the following documentation:
  https://www.mlflow.org/docs/latest/models.html#built-in-deployment-tools.

  You can make requests to ``POST /invocations`` in pandas split- or record-
  oriented formats.

  Example:

  .. code-block:: bash

      $ mlflow models serve -m runs:/my-run-id/model-path &

      $ curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d '{
          "columns": ["a", "b", "c"],
          "data": [[1, 2, 3], [4, 5, 6]]
      }'

Options:
  -m, --model-uri URI  URI to the model. A local path, a 'runs:/' URI, or a
                       remote storage URI (e.g., an 's3://' URI). For more
                       information about supported remote URIs for model
                       artifacts, see
                       https://mlflow.org/docs/latest/tracking.html#artifact-stores
                       [required]
  -p, --port INTEGER   The port to listen on (default: 5000).
  -h, --host HOST      The network address to listen on (default: 127.0.0.1).
                       Use 0.0.0.0 to bind to all addresses if you want to
                       access the tracking server from other machines.
  -w, --workers TEXT   Number of gunicorn worker processes to handle requests
                       (default: 4).
  --no-conda           If specified, will assume that MLmodel/MLproject is
                       running within a Conda environment with the necessary
                       dependencies for the current project instead of
                       attempting to create a new conda environment.
  --install-mlflow     If specified and there is a conda environment to be
                       activated mlflow will be installed into the environment
                       after it has been activated. The version of installed
                       mlflow will be the same as the one used to invoke this
                       command.
  --help               Show this message and exit.
import pandas as pd
csv_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
data = pd.read_csv(csv_url, sep=";").drop(["quality"], axis=1).head(1).to_json(orient='split')
print(data)
{"columns":["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"],"index":[0],"data":[[7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4]]}
!curl http://127.0.0.1:5003/invocations -H 'Content-Type: application/json' -d '{\
"columns":[\
"fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"],\
"index":[0],\
"data":[[7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4]]}'
[5.576883967129615]
$ cd IUM_08/examples/
$ mlflow models serve -m my_model -p 5003
2021/05/17 08:52:07 INFO mlflow.models.cli: Selected backend for flavor 'python_function'
2021/05/17 08:52:07 INFO mlflow.pyfunc.backend: === Running command 'source /home/tomek/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-503f0c7520a32f054a9d168bd099584a9439de9d 1>&2 && gunicorn --timeout=60 -b 127.0.0.1:5003 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app'
[2021-05-17 08:52:07 +0200] [291217] [INFO] Starting gunicorn 20.1.0
[2021-05-17 08:52:07 +0200] [291217] [INFO] Listening at: http://127.0.0.1:5003 (291217)
[2021-05-17 08:52:07 +0200] [291217] [INFO] Using worker: sync
[2021-05-17 08:52:07 +0200] [291221] [INFO] Booting worker with pid: 291221
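The curl call shown earlier can also be prepared from Python with the standard library alone. The sketch below builds the same split-oriented JSON body. `build_invocation_request` and its `wrapper_key` argument are hypothetical names. The unwrapped body matches the MLflow 1.x output shown in this section, while MLflow 2.x servers expect the same structure nested under a `dataframe_split` key.

```python
import json
import urllib.request

WINE_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def build_invocation_request(url, columns, rows, wrapper_key=None):
    """Create a POST request carrying data in pandas 'split' orientation."""
    payload = {"columns": columns, "data": rows}
    if wrapper_key is not None:
        payload = {wrapper_key: payload}  # e.g. "dataframe_split" for MLflow 2.x
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invocation_request(
    "http://127.0.0.1:5003/invocations",
    WINE_COLUMNS,
    [[7.4, 0.7, 0.0, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4]],
)
# Sending the request requires a running `mlflow models serve` process:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it keeps the payload easy to inspect when the server rejects the input format.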
MLflow Registry
- allows models to be saved to and loaded from a central registry
- Models can also be served directly from the registry:
#!/usr/bin/env sh
# Set environment variable for the tracking URL where the Model Registry resides
export MLFLOW_TRACKING_URI=http://localhost:5000
# Serve the production model from the model registry
mlflow models serve -m "models:/sk-learn-random-forest-reg-model/Production"
- For this to work, an MLflow server must be running
- It supports managing model versions and labelling them with stages, e.g. "Staging", "Production"
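As a rough illustration of working with the registry from Python, the sketch below registers a model that was logged in a run and loads it back through a `models:/` URI. The helper names and the `"model"` artifact path are assumptions, and the second function needs MLflow installed and a reachable server with a model registry.

```python
def registry_model_uri(name, version_or_stage):
    """Build a `models:/` URI addressing a registered model by version or stage."""
    return f"models:/{name}/{version_or_stage}"


def register_and_load(run_id, name):
    """Register the model logged under a run, then load it back from the registry."""
    import mlflow  # requires `pip install mlflow` and a running MLflow server

    # "model" is assumed to be the artifact path used when the model was logged.
    version = mlflow.register_model(f"runs:/{run_id}/model", name)
    return mlflow.pyfunc.load_model(registry_model_uri(name, version.version))


print(registry_model_uri("sk-learn-random-forest-reg-model", "Production"))
# models:/sk-learn-random-forest-reg-model/Production
```

The printed URI is the same one passed to `mlflow models serve -m` in the shell script above.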