{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Sacred\n",
"> Every experiment is sacred <br>\n",
"> Every experiment is great <br>\n",
"> If an experiment is wasted <br>\n",
"> God gets quite irate\n",
">\n",
" <cite>&mdash;https://github.com/IDSIA/sacred / [Monty Python's The Meaning of Life](https://en.wikipedia.org/wiki/Every_Sperm_Is_Sacred) </cite>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"- Running machine learning experiments (changing parameters, training, evaluating) is costly and time-consuming\n",
"- That is why it pays to run them in an organized way\n",
"- And in a way that lets us repeat / reproduce the results once we have obtained them"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"> Sacred is a tool to help you:\n",
"> - configure\n",
"> - organize\n",
"> - log \n",
"> - reproduce \n",
"> experiments. \n",
"> \n",
">It is designed to do all the tedious overhead work that you need to do around your actual experiment in order to:\n",
"> - keep track of all the parameters of your experiment\n",
"> - easily run your experiment for different settings\n",
"> - save configurations for individual runs in a database\n",
"> - reproduce your results\n",
" \n",
" <cite>&mdash;https://github.com/IDSIA/sacred</cite>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"- **ConfigScopes** A very convenient way of using the local variables in a function to define the parameters your experiment uses.\n",
"- **Config Injection** You can access all parameters of your configuration from every function. They are automatically injected by name.\n",
"- **Command-line interface** You get a powerful command-line interface for each experiment that you can use to change parameters and run different variants.\n",
"- **Observers** Sacred provides Observers that log all kinds of information about your experiment, its dependencies, the configuration you used, the machine it is run on, and of course the result. These can be saved to a MongoDB, for easy access later.\n",
"- **Automatic seeding** helps control the randomness in your experiments, so that the results remain reproducible."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Installation"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting sacred\n",
" Downloading sacred-0.8.2-py2.py3-none-any.whl (106 kB)\n",
"\u001b[K |████████████████████████████████| 106 kB 1.7 MB/s eta 0:00:01\n",
"\u001b[?25hRequirement already satisfied: packaging>=18.0 in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from sacred) (20.4)\n",
"Collecting jsonpickle<2.0,>=1.2\n",
" Downloading jsonpickle-1.5.2-py2.py3-none-any.whl (37 kB)\n",
"Requirement already satisfied: GitPython in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from sacred) (3.1.14)\n",
"Collecting munch<3.0,>=2.0.2\n",
" Downloading munch-2.5.0-py2.py3-none-any.whl (10 kB)\n",
"Collecting py-cpuinfo>=4.0\n",
" Downloading py-cpuinfo-8.0.0.tar.gz (99 kB)\n",
"\u001b[K |████████████████████████████████| 99 kB 2.7 MB/s eta 0:00:011\n",
"\u001b[?25hCollecting docopt<1.0,>=0.3\n",
" Downloading docopt-0.6.2.tar.gz (25 kB)\n",
"Requirement already satisfied: wrapt<2.0,>=1.0 in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from sacred) (1.11.2)\n",
"Requirement already satisfied: colorama>=0.4 in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from sacred) (0.4.4)\n",
"Requirement already satisfied: pyparsing>=2.0.2 in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from packaging>=18.0->sacred) (2.4.7)\n",
"Requirement already satisfied: six in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from packaging>=18.0->sacred) (1.15.0)\n",
"Requirement already satisfied: gitdb<5,>=4.0.1 in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from GitPython->sacred) (4.0.5)\n",
"Requirement already satisfied: smmap<4,>=3.0.1 in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (from gitdb<5,>=4.0.1->GitPython->sacred) (3.0.5)\n",
"Building wheels for collected packages: py-cpuinfo, docopt\n",
" Building wheel for py-cpuinfo (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for py-cpuinfo: filename=py_cpuinfo-8.0.0-py3-none-any.whl size=22245 sha256=556a8ea1e899c40b6266eab7562141327aecacfb2cdb6509279a85c91bf729b2\n",
" Stored in directory: /home/tomek/.cache/pip/wheels/57/cb/6d/bab2257f26c5be4a96ff65c3d2a7122c96529b73773ee37f36\n",
" Building wheel for docopt (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for docopt: filename=docopt-0.6.2-py2.py3-none-any.whl size=13704 sha256=a9cd4cc934499c413a39353a865132382f2f2b230f614d2a2a495b1ccc0b2dd4\n",
" Stored in directory: /home/tomek/.cache/pip/wheels/56/ea/58/ead137b087d9e326852a851351d1debf4ada529b6ac0ec4e8c\n",
"Successfully built py-cpuinfo docopt\n",
"Installing collected packages: jsonpickle, munch, py-cpuinfo, docopt, sacred\n",
"Successfully installed docopt-0.6.2 jsonpickle-1.5.2 munch-2.5.0 py-cpuinfo-8.0.0 sacred-0.8.2\n"
]
}
],
"source": [
"!pip install sacred"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## The main function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# %load sacred_hello.py\n",
"from sacred import Experiment\n",
"\n",
"ex = Experiment()\n",
"\n",
"@ex.automain\n",
"def my_main():\n",
"    print('Witaj świecie!')\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING - sacred_hello - No observers have been added to this run\r\n",
"INFO - sacred_hello - Running command 'my_main'\r\n",
"INFO - sacred_hello - Started\r\n",
"Witaj świecie!\r\n",
"INFO - sacred_hello - Completed after 0:00:00\r\n"
]
}
],
"source": [
"!python IUM_07/sacred_hello.py"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"##### What is happening in the code above?\n",
"1. We create an object of the Experiment class\n",
"2. We decorate the function \"my_main\" with the [automain](https://sacred.readthedocs.io/en/stable/apidoc.html#sacred.Experiment.automain) decorator\n",
" As a result:\n",
" - we get a CLI that lets us, among other things, control the log level and pass parameters\n",
" - we mark \"my_main\" as the main function, the one called when the experiment is run\n",
" - the function marked as main must be the last function defined in the file!\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"##### What the CLI gives us:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Usage:\r\n",
" sacred_hello.py [(with UPDATE...)] [options]\r\n",
" sacred_hello.py help [COMMAND]\r\n",
" sacred_hello.py (-h | --help)\r\n",
" sacred_hello.py COMMAND [(with UPDATE...)] [options]\r\n",
"\r\n",
"\r\n",
"\r\n",
"Options:\r\n",
" -b VALUE --beat-interval=VALUE Set the heart-beat interval for this run. Time\r\n",
" between two heartbeat events is measured in\r\n",
" seconds.\r\n",
" -C VALUE --capture=VALUE Control the way stdout and stderr are captured.\r\n",
" The argument value must be one of [no, sys, fd]\r\n",
" -c VALUE --comment=VALUE Add a comment to this run.\r\n",
" -d --debug Set this run to debug mode. Suppress warnings\r\n",
" about missing observers and don't filter the\r\n",
" stacktrace. Also enables usage with ipython\r\n",
" `--pdb`.\r\n",
" -e --enforce_clean Fail if any version control repository is\r\n",
" dirty.\r\n",
" -F VALUE --file_storage=VALUE Add a file-storage observer to the experiment.\r\n",
" The value of the arguement should be the base-\r\n",
" directory to write the runs to\r\n",
" -f --force Disable warnings about suspicious changes for\r\n",
" this run.\r\n",
" -h --help Print this help message and exit.\r\n",
" -l VALUE --loglevel=VALUE Set the LogLevel. Loglevel either as 0 - 50 or\r\n",
" as string: DEBUG(10), INFO(20), WARNING(30),\r\n",
" ERROR(40), CRITICAL(50)\r\n",
" -m VALUE --mongo_db=VALUE Add a MongoDB Observer to the experiment. The\r\n",
" argument value is the database specification.\r\n",
" Should be in the form: `[host:port:]db_name[.c\r\n",
" ollection[:id]][!priority]`\r\n",
" -n VALUE --name=VALUE Set the name for this run.\r\n",
" -D --pdb Automatically enter post-mortem debugging with\r\n",
" pdb on failure.\r\n",
" -p --print-config Always print the configuration first.\r\n",
" -P VALUE --priority=VALUE Sets the priority for a queued up experiment.\r\n",
" `--priority=NUMBER` The number represent the\r\n",
" priority for this run.\r\n",
" -q --queue Only queue this run, do not start it.\r\n",
" -S VALUE --s3=VALUE Add a S3 File observer to the experiment. The\r\n",
" argument value should be\r\n",
" `s3://<bucket>/path/to/exp`.\r\n",
" -s VALUE --sql=VALUE Add a SQL Observer to the experiment. The\r\n",
" typical form is:\r\n",
" dialect://username:password@host:port/database\r\n",
" -t VALUE --tiny_db=VALUE Add a TinyDB Observer to the experiment. The\r\n",
" argument is the path to be given to the\r\n",
" TinyDbObserver.\r\n",
" -u --unobserved Ignore all observers for this run.\r\n",
"\r\n",
"\r\n",
"Arguments:\r\n",
" COMMAND Name of command to run (see below for list of commands)\r\n",
" UPDATE Configuration assignments of the form foo.bar=17\r\n",
"\r\n",
"\r\n",
"Commands:\r\n",
" print_config Print the updated configuration and exit.\r\n",
" print_dependencies Print the detected source-files and dependencies.\r\n",
" save_config Store the updated configuration in a file.\r\n",
" print_named_configs Print the available named configs and exit.\r\n",
" my_main \r\n",
"\r\n"
]
}
],
"source": [
"!python IUM_07/sacred_hello.py -h"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Configurations\n",
" - Configurations let us parameterize experiment runs.\n",
" - They make passing parameters easier: variables from the configuration are injected into the functions being called\n",
" - They can be saved automatically (so we can track how the parameters changed and how that affected the results)\n",
" - A configuration can be created in one of 3 ways:\n",
"    - using config scopes\n",
"    - as a dictionary\n",
"    - by loading it from a file"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Configurations - config scopes\n",
"If we mark a function with the `config` decorator, it is executed before the experiment is run, and all of its local variables that can be serialized as JSON are added to the configuration. Their values are then injected into the functions that ask for them by parameter name."
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"from sacred import Experiment\n",
"\n",
"exint = Experiment(\"sacred_scopes\", interactive=True) # When running interactively (in the Python console or in Jupyter):\n",
"# - we must give the experiment a name (by default the name of the source file is used)\n",
"# - we must pass the parameter \"interactive=True\"\n",
"# - instead of \"automain\" we use the \"main\" decorator\n",
"\n",
"@exint.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"    message = \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"\n",
"@exint.main\n",
"def my_main(message):\n",
"    print(message)"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING - sacred_scopes - No observers have been added to this run\n",
"INFO - sacred_scopes - Running command 'my_main'\n",
"INFO - sacred_scopes - Started\n",
"INFO - sacred_scopes - Completed after 0:00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Witaj Świecie!\n"
]
},
{
"data": {
"text/plain": [
"<sacred.run.Run at 0x7f423da33160>"
]
},
"execution_count": 92,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"exint.run()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"##### We can inspect the values of the configuration variables:"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'recipient': 'Świecie', 'greeting': 'Witaj', 'message': 'Witaj Świecie!'}"
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_config()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Parameters can be inspected and modified from the CLI\n",
" - values given on the command line override those given in the code"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# %load IUM_07/sacred_scopes.py\n",
"from sacred import Experiment\n",
"\n",
"ex = Experiment()\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"    message = \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.automain\n",
"def my_main(message):\n",
"    print(message)"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING - sacred_scopes - No observers have been added to this run\r\n",
"INFO - sacred_scopes - Running command 'my_main'\r\n",
"INFO - sacred_scopes - Started\r\n",
"Witaj Przygodo!\r\n",
"INFO - sacred_scopes - Completed after 0:00:00\r\n"
]
}
],
"source": [
"!python IUM_07/sacred_scopes.py with 'recipient=Przygodo'"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO - sacred_scopes - Running command 'print_config'\r\n",
"INFO - sacred_scopes - Started\r\n",
"Configuration (\u001b[34mmodified\u001b[0m, \u001b[32madded\u001b[0m, \u001b[31mtypechanged\u001b[0m, \u001b[2mdoc\u001b[0m):\r\n",
" greeting = 'Witaj'\r\n",
" message = 'Witaj Świecie!'\r\n",
" recipient = 'Świecie'\r\n",
" seed = 29744255 \u001b[2m# the random seed for this experiment\u001b[0m\r\n",
"INFO - sacred_scopes - Completed after 0:00:00\r\n"
]
}
],
"source": [
"!python IUM_07/sacred_scopes.py print_config"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO - sacred_scopes - Running command 'print_config'\r\n",
"INFO - sacred_scopes - Started\r\n",
"Configuration (\u001b[34mmodified\u001b[0m, \u001b[32madded\u001b[0m, \u001b[31mtypechanged\u001b[0m, \u001b[2mdoc\u001b[0m):\r\n",
" greeting = 'Witaj'\r\n",
" message = 'Witaj Przygodo!'\r\n",
"\u001b[34m recipient = 'Przygodo'\u001b[0m\r\n",
" seed = 215765170 \u001b[2m# the random seed for this experiment\u001b[0m\r\n",
"INFO - sacred_scopes - Completed after 0:00:00\r\n"
]
}
],
"source": [
"!python IUM_07/sacred_scopes.py print_config with 'recipient=Przygodo'"
]
},
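{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Configurations - dictionary\n",
"The second way listed earlier, a configuration given as a dictionary, could look roughly like the sketch below. It assumes that `add_config` is passed a plain `dict`; the experiment name `sacred_dict` is made up for this example.\n",
"\n",
"```python\n",
"from sacred import Experiment\n",
"\n",
"# interactive=True because the sketch is meant to be run from a notebook\n",
"ex = Experiment(\"sacred_dict\", interactive=True)  # hypothetical name\n",
"\n",
"# Every key of the dictionary becomes a configuration entry\n",
"ex.add_config({\"recipient\": \"Świecie\", \"greeting\": \"Witaj\"})\n",
"\n",
"@ex.main\n",
"def my_main(recipient, greeting):\n",
"    # recipient and greeting are injected by name from the dictionary\n",
"    print(\"{0} {1}!\".format(greeting, recipient))\n",
"\n",
"r = ex.run()\n",
"```\n",
"\n",
"As with config scopes, `r.config` should then contain both entries together with the automatically added `seed`."
]
},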
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Loading a configuration from a file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# %load IUM_07/config.json\n",
"{\n",
" \"recipient\": \"samotności\",\n",
" \"greeting\": \"Żegnaj\"\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 119,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"from sacred import Experiment\n",
"\n",
"ex = Experiment(\"sacred_scopes\", interactive=True) # When running interactively (in the Python console or in Jupyter):\n",
"# - we must give the experiment a name (by default the name of the source file is used)\n",
"# - we must pass the parameter \"interactive=True\"\n",
"# - instead of \"automain\" we use the \"main\" decorator\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"ex.add_config(\"IUM_07/config.json\")\n",
"\n",
"\n",
"@ex.main\n",
"def my_main(recipient, greeting):\n",
"    print(\"{0} {1}!\".format(greeting, recipient))"
]
},
{
"cell_type": "code",
"execution_count": 120,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING - sacred_scopes - No observers have been added to this run\n",
"INFO - sacred_scopes - Running command 'my_main'\n",
"INFO - sacred_scopes - Started\n",
"INFO - sacred_scopes - Completed after 0:00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Żegnaj samotności!\n"
]
}
],
"source": [
"r = ex.run()"
]
},
{
"cell_type": "code",
"execution_count": 121,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'recipient': 'samotności', 'greeting': 'Żegnaj', 'seed': 529757761}"
]
},
"execution_count": 121,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"r.config"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### We can modify parts of the configuration right before a run"
]
},
{
"cell_type": "code",
"execution_count": 124,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING - sacred_scopes - No observers have been added to this run\n",
"INFO - sacred_scopes - Running command 'my_main'\n",
"INFO - sacred_scopes - Started\n",
"INFO - sacred_scopes - Completed after 0:00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Żegnaj nudo!\n"
]
}
],
"source": [
"r = ex.run(config_updates={\"recipient\":\"nudo\"})"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Dependency injection\n",
" - Besides the main function, configuration values are also injected into functions decorated with `@ex.capture`\n",
" - Such functions can also use special parameters, e.g.:\n",
"    - `_log` - gives access to the logger object (more: [logging](https://sacred.readthedocs.io/en/stable/logging.html))\n",
"    - `_run` - gives access to the object representing the current run of the experiment (example later)"
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING - sacred_scopes - No observers have been added to this run\n",
"INFO - sacred_scopes - Running command 'my_main'\n",
"INFO - sacred_scopes - Started\n",
"INFO - prepare_message - Enterred prepare_message\n",
"INFO - sacred_scopes - Completed after 0:00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Witaj Świecie!\n"
]
},
{
"data": {
"text/plain": [
"<sacred.run.Run at 0x7f423c40d820>"
]
},
"execution_count": 193,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sacred import Experiment\n",
"\n",
"ex = Experiment(\"sacred_scopes\", interactive=True)\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting, _log):\n",
"    _log.info(\"Enterred prepare_message\")\n",
"    return \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.main\n",
"def my_main():\n",
"    print(prepare_message()) ## We don't have to pass the values\n",
"\n",
"ex.run()"
]
},
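{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Injection only fills in the arguments that the caller leaves out. The sketch below illustrates the priority described in the Sacred documentation: explicitly passed arguments first, then configuration values, then the function's own defaults. The experiment name `capture_priority` and the `punctuation` parameter are invented for this illustration.\n",
"\n",
"```python\n",
"from sacred import Experiment\n",
"\n",
"ex = Experiment(\"capture_priority\", interactive=True)  # hypothetical name\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting, punctuation=\"!\"):\n",
"    # punctuation is not in the configuration, so its default is used\n",
"    return \"{0} {1}{2}\".format(greeting, recipient, punctuation)\n",
"\n",
"@ex.main\n",
"def my_main():\n",
"    # Nothing passed: recipient and greeting come from the configuration\n",
"    print(prepare_message())\n",
"    # An explicitly passed argument takes priority over the configuration\n",
"    print(prepare_message(recipient=\"Przygodo\"))\n",
"\n",
"ex.run()\n",
"```\n",
"\n",
"This makes it possible to keep sensible values in the configuration and still call a captured function with different arguments in the few places that need them."
]
},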
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Observing experiments\n",
"Sacred records a range of information about every experiment:\n",
" - the run time\n",
" - the configuration\n",
" - the text written to stdout/stderr\n",
" - errors, if any occurred\n",
" - basic information about the environment (machine) on which the experiment was run\n",
" - the source files used\n",
" - the dependencies used and their versions\n",
" - files opened with ex.open_resource\n",
" - files added with ex.add_artifact"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The observed information can be saved by one of the [observers](https://sacred.readthedocs.io/en/stable/observers.html):\n",
" - Mongo Observer - stores data in MongoDB\n",
" - File Storage Observer - stores data locally in files\n",
" - TinyDB Observer - uses a local database stored in a JSON file\n",
" - SQL Observer - stores information in an SQL database\n",
" - S3 Observer - uses AWS S3\n",
" - gcs_observer - uses Google Cloud Storage\n",
" - Queue Observer - a kind of local buffer wrapped around one of the above\n",
" - Slack Observer - used for notifications sent to Slack\n",
" - Telegram Observer - used for notifications sent to Telegram"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### File storage observer\n",
"- stores information about the experiment in local files\n",
"- it can be added like this: `ex.observers.append(FileStorageObserver('my_runs_directory'))`, where `my_runs_directory` is the path under which information about the runs will be written"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# %load IUM_07/file_observer.py\n",
"from sacred.observers import FileStorageObserver\n",
"from sacred import Experiment\n",
"\n",
"ex = Experiment(\"file_observer\")\n",
"\n",
"ex.observers.append(FileStorageObserver('my_runs'))\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting):\n",
"    return \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.automain\n",
"def my_main(recipient, greeting):\n",
"    print(prepare_message()) ## We don't have to pass the values"
]
},
{
"cell_type": "code",
"execution_count": 159,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO - file_observer - Running command 'my_main'\r\n",
"INFO - file_observer - Started run with ID \"2\"\r\n",
"Witaj Świecie!\r\n",
"INFO - file_observer - Completed after 0:00:00\r\n"
]
}
],
"source": [
"!python IUM_07/file_observer.py"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Let's see what information was saved"
]
},
{
"cell_type": "code",
"execution_count": 160,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 12\r\n",
"drwxrwxr-x 2 tomek tomek 4096 kwi 26 09:54 1\r\n",
"drwxrwxr-x 2 tomek tomek 4096 kwi 26 10:21 2\r\n",
"drwxrwxr-x 2 tomek tomek 4096 kwi 26 10:21 _sources\r\n"
]
}
],
"source": [
"!ls -l my_runs"
]
},
{
"cell_type": "code",
"execution_count": 164,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 16\r\n",
"-rw-rw-r-- 1 tomek tomek 76 kwi 26 10:21 config.json\r\n",
"-rw-rw-r-- 1 tomek tomek 159 kwi 26 10:21 cout.txt\r\n",
"-rw-rw-r-- 1 tomek tomek 2 kwi 26 10:21 metrics.json\r\n",
"-rw-rw-r-- 1 tomek tomek 1686 kwi 26 10:21 run.json\r\n"
]
}
],
"source": [
"!ls -l my_runs/2"
]
},
{
"cell_type": "code",
"execution_count": 162,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'greeting': 'Witaj', 'recipient': 'Świecie', 'seed': 178660254}"
]
},
"execution_count": 162,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# %load my_runs/2/config.json\n",
"{\n",
" \"greeting\": \"Witaj\",\n",
" \"recipient\": \"\\u015awiecie\",\n",
" \"seed\": 178660254\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 165,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO - file_observer - Running command 'my_main'\r\n",
"INFO - file_observer - Started run with ID \"2\"\r\n",
"Witaj Świecie!\r\n",
"INFO - file_observer - Completed after 0:00:00\r\n"
]
}
],
"source": [
"!cat my_runs/2/cout.txt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# %load my_runs/2/run.json\n",
"{\n",
" \"artifacts\": [],\n",
" \"command\": \"my_main\",\n",
" \"experiment\": {\n",
" \"base_dir\": \"/home/tomek/AITech/repo/aitech-ium-private/IUM_07\",\n",
" \"dependencies\": [\n",
" \"numpy==1.19.2\",\n",
" \"sacred==0.8.2\"\n",
" ],\n",
" \"mainfile\": \"file_observer.py\",\n",
" \"name\": \"file_observer\",\n",
" \"repositories\": [\n",
" {\n",
" \"commit\": \"9a2064faaf4d209233ab0e20ad522638bb99b6f4\",\n",
" \"dirty\": true,\n",
" \"url\": \"git@git.wmi.amu.edu.pl:tzietkiewicz/aitech-ium-private.git\"\n",
" }\n",
" ],\n",
" \"sources\": [\n",
" [\n",
" \"file_observer.py\",\n",
" \"_sources/file_observer_bb0a5c4720d1072b641d23da080696b6.py\"\n",
" ]\n",
" ]\n",
" },\n",
" \"heartbeat\": \"2021-04-26T08:21:35.718761\",\n",
" \"host\": {\n",
" \"ENV\": {},\n",
" \"cpu\": \"Intel(R) Core(TM) i5-4200H CPU @ 2.80GHz\",\n",
" \"hostname\": \"tomek-asus\",\n",
" \"os\": [\n",
" \"Linux\",\n",
" \"Linux-5.4.0-72-generic-x86_64-with-glibc2.10\"\n",
" ],\n",
" \"python_version\": \"3.8.5\"\n",
" },\n",
" \"meta\": {\n",
" \"command\": \"my_main\",\n",
" \"options\": {\n",
" \"--beat-interval\": null,\n",
" \"--capture\": null,\n",
" \"--comment\": null,\n",
" \"--debug\": false,\n",
" \"--enforce_clean\": false,\n",
" \"--file_storage\": null,\n",
" \"--force\": false,\n",
" \"--help\": false,\n",
" \"--loglevel\": null,\n",
" \"--mongo_db\": null,\n",
" \"--name\": null,\n",
" \"--pdb\": false,\n",
" \"--print-config\": false,\n",
" \"--priority\": null,\n",
" \"--queue\": false,\n",
" \"--s3\": null,\n",
" \"--sql\": null,\n",
" \"--tiny_db\": null,\n",
" \"--unobserved\": false,\n",
" \"COMMAND\": null,\n",
" \"UPDATE\": [],\n",
" \"help\": false,\n",
" \"with\": false\n",
" }\n",
" },\n",
" \"resources\": [],\n",
" \"result\": null,\n",
" \"start_time\": \"2021-04-26T08:21:35.714091\",\n",
" \"status\": \"COMPLETED\",\n",
" \"stop_time\": \"2021-04-26T08:21:35.717141\"\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 170,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 4\r\n",
"-rw-rw-r-- 1 tomek tomek 463 kwi 26 10:21 file_observer_bb0a5c4720d1072b641d23da080696b6.py\r\n"
]
}
],
"source": [
"! ls -l my_runs/_sources\n",
"## In run.json we can find the path to the saved source file: \"_sources/file_observer_bb0a5c4720d1072b641d23da080696b6.py\"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"## The sources have been saved\n",
"# %load my_runs/_sources/file_observer_bb0a5c4720d1072b641d23da080696b6.py\n",
"from sacred.observers import FileStorageObserver\n",
"from sacred import Experiment\n",
"\n",
"ex = Experiment(\"file_observer\")\n",
"\n",
"ex.observers.append(FileStorageObserver('my_runs'))\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting):\n",
"    return \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.automain\n",
"def my_main(recipient, greeting):\n",
"    print(prepare_message()) ## We don't have to pass the values"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Adding custom information\n"
]
},
{
"cell_type": "code",
"execution_count": 183,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO - file_observer - Running command 'my_main'\n",
"INFO - file_observer - Started run with ID \"6\"\n",
"INFO - file_observer - Completed after 0:00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Witaj Świecie!\n"
]
}
],
"source": [
"from sacred.observers import FileStorageObserver\n",
"from sacred import Experiment\n",
"from datetime import datetime\n",
"\n",
"ex = Experiment(\"file_observer\", interactive=True)\n",
"\n",
"ex.observers.append(FileStorageObserver('my_runs'))\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"### - We added the special parameter _run to prepare_message, the function \"captured\" by @ex.capture\n",
"### - It gives access to the run object while the experiment is running\n",
"### - Among other things, it lets us store extra information in the info dictionary\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting, _run):\n",
"    _run.info[\"prepare_message_ts\"] = str(datetime.now())\n",
"    return \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.main\n",
"def my_main(recipient, greeting):\n",
"    print(prepare_message()) ## We don't have to pass the values\n",
"\n",
"r = ex.run()"
]
},
{
"cell_type": "code",
"execution_count": 185,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"prepare_message_ts\": \"2021-04-26 10:39:59.268539\"\r\n",
"}"
]
}
],
"source": [
"cat my_runs/6/info.json"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Artifacts\n",
"\n",
"- Artifacts are used to save files, e.g. a trained model\n",
"- A file can be saved as an artifact with [ex.add_artifact()](https://sacred.readthedocs.io/en/stable/apidoc.html?highlight=artifact#sacred.Experiment.add_artifact):\n",
"```python\n",
"ex.add_artifact(\"model.pb\")\n",
"```"
]
},
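{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"A slightly fuller sketch of how `add_artifact` might be used together with the File Storage Observer. The experiment name `artifact_demo` and the file `model.txt` are placeholders, and the text file merely stands in for a real trained model.\n",
"\n",
"```python\n",
"from sacred import Experiment\n",
"from sacred.observers import FileStorageObserver\n",
"\n",
"ex = Experiment(\"artifact_demo\", interactive=True)  # hypothetical name\n",
"ex.observers.append(FileStorageObserver(\"my_runs\"))\n",
"\n",
"@ex.main\n",
"def my_main():\n",
"    # Stand-in for training: write a file that plays the role of a model\n",
"    with open(\"model.txt\", \"w\") as f:\n",
"        f.write(\"placeholder for a trained model\")\n",
"    # Register the file with the current run; it must be called during a run\n",
"    ex.add_artifact(\"model.txt\")\n",
"\n",
"r = ex.run()\n",
"```\n",
"\n",
"With a File Storage Observer attached, a copy of the file should appear in the directory of the run, next to `config.json` and `run.json`."
]
},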
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
" ## Mongo observer\n",
" - To use the Mongo observer, we need access to a MongoDB database.\n",
" - One can easily be set up with [docker-compose](https://docs.docker.com/compose/).\n",
" - It is enough to copy the [examples/docker](https://github.com/IDSIA/sacred/tree/master/examples/docker) directory from the Sacred repository and run `docker-compose up`; this gives us a running MongoDB and, additionally, [Omniboard](https://vivekratnavel.github.io/omniboard/#/). More information in the [documentation](https://sacred.readthedocs.io/en/stable/examples.html#docker-setup)\n",
" - Such a database has already been set up on the Jenkins server, so when working on Jenkins you can use the local database (`localhost:27017`)"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pymongo in /media/tomek/Linux_data/home/tomek/anaconda3/lib/python3.8/site-packages (3.11.3)\r\n"
]
}
],
"source": [
"!pip install pymongo"
]
},
{
"cell_type": "code",
"execution_count": 155,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO - sacred_scopes - Running command 'my_main'\n",
"INFO - sacred_scopes - Started run with ID \"2\"\n",
"INFO - sacred_scopes - Completed after 0:00:00\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Witaj Świecie!\n"
]
},
{
"data": {
"text/plain": [
"<sacred.run.Run at 0x7f423c3667c0>"
]
},
"execution_count": 155,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sacred.observers import MongoObserver\n",
"from sacred import Experiment\n",
"\n",
"ex = Experiment(\"sacred_scopes\", interactive=True)\n",
"# Credentials and database name as configured in the .env file when the database was started.\n",
"# For the instance on Jenkins the URL is: mongodb://mongo_user:mongo_password_IUM_2021@localhost:27017\n",
"ex.observers.append(MongoObserver(url='mongodb://mongo_user:mongo_password@localhost:27017',\n",
"                                  db_name='sacred'))\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting):\n",
"    return \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.main\n",
"def my_main(recipient, greeting):\n",
"    print(prepare_message())  # No need to pass the values: Sacred injects them from the config\n",
"\n",
"ex.run()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"- Information about the experiment can be viewed in Omniboard: http://127.0.0.1:9000/sacred\n",
"- The instance on Jenkins: http://tzietkiewicz.vm.wmi.amu.edu.pl:9000/sacred\n",
"\n",
"<img width=\"75%\" src=\"IUM_07/omniboard.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Metrics\n",
"\n",
"- During an experiment we can track [metrics](https://sacred.readthedocs.io/en/stable/collected_information.html#metrics-api), e.g. the current loss\n",
"- To do so:\n",
"    - add a `_run` parameter to a function decorated with `@ex.main` or `@ex.capture`\n",
"    - then call e.g. `_run.log_scalar()`"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO - sacred_scopes - Running command 'my_main'\n",
"INFO - sacred_scopes - Started run with ID \"9\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Witaj Świecie!\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO - sacred_scopes - Completed after 0:00:50\n"
]
},
{
"data": {
"text/plain": [
"<sacred.run.Run at 0x7f423c2de550>"
]
},
"execution_count": 192,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sacred.observers import MongoObserver\n",
"from sacred import Experiment\n",
"import random\n",
"import time\n",
"\n",
"ex = Experiment(\"sacred_scopes\", interactive=True)\n",
"# Credentials and database name as configured in the .env file when the database was started.\n",
"# For the instance on Jenkins the URL is: mongodb://mongo_user:mongo_password_IUM_2021@localhost:27017\n",
"ex.observers.append(MongoObserver(url='mongodb://mongo_user:mongo_password@localhost:27017',\n",
"                                  db_name='sacred'))\n",
"\n",
"@ex.config\n",
"def my_config():\n",
"    recipient = \"Świecie\"\n",
"    greeting = \"Witaj\"\n",
"\n",
"@ex.capture\n",
"def prepare_message(recipient, greeting):\n",
"    return \"{0} {1}!\".format(greeting, recipient)\n",
"\n",
"@ex.main\n",
"def my_main(recipient, greeting, _run):\n",
"    print(prepare_message())  # No need to pass the values: Sacred injects them from the config\n",
"    counter = 0\n",
"    while counter < 20:\n",
"        counter += 1\n",
"        value = counter\n",
"        ms_to_wait = random.randint(5, 5000)\n",
"        time.sleep(ms_to_wait / 1000)\n",
"        noise = 1.0 + 0.1 * (random.randint(0, 10) - 5)\n",
"        # Log training.loss in every second iteration, passing the step explicitly.\n",
"        # The resulting sequence of steps for training.loss is 2, 4, 6, ..., 20.\n",
"        if counter % 2 == 0:\n",
"            _run.log_scalar(\"training.loss\", value * 1.5 * noise, counter)\n",
"        # Implicit step counter (0, 1, 2, 3, ...)\n",
"        # incremented with each call for training.accuracy:\n",
"        _run.log_scalar(\"training.accuracy\", value * 2 * noise)\n",
"\n",
"ex.run()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Metric values can be followed live in Omniboard\n",
"<img src=\"IUM_07/metrics.png\" width=\"75%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Assignment [15 pts] (due 9 May 2021)\n",
"1. Wrap the invocation of your experiment with Sacred so that the following are saved [10 pts]:\n",
"    - the parameters the training was run with\n",
"    - the resulting model file (as an artifact)\n",
"    - the source code used to run the training\n",
"    - the results (e.g. the final loss or the evaluation results)\n",
"2. Use two observers [5 pts]:\n",
"    - MongoObserver, with the following URL: `mongodb://mongo_user:mongo_password_IUM_2021@localhost:27017` (you will be able to browse the results at http://tzietkiewicz.vm.wmi.amu.edu.pl:9000/sacred)\n",
"    - FileStorageObserver: archive the files it saves as artifacts of the Jenkins build\n"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": false,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": false,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}