diff --git a/wyk/12_Propagacja_wsteczna.ipynb b/wyk/12_Propagacja_wsteczna.ipynb
new file mode 100644
index 0000000..a603064
--- /dev/null
+++ b/wyk/12_Propagacja_wsteczna.ipynb
@@ -0,0 +1,1940 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "# 12. Sieci neuronowe – propagacja wsteczna"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "%matplotlib inline\n",
+ "\n",
+ "import numpy as np\n",
+ "import math"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## 12.1. Metoda propagacji wstecznej – wprowadzenie"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Architektura sieci neuronowych\n",
+ "\n",
+ "* Budowa warstwowa, najczęściej sieci jednokierunkowe i gęste.\n",
+ "* Liczbę i rozmiar warstw dobiera się do każdego problemu.\n",
+ "* Rozmiary sieci określane poprzez liczbę neuronów lub parametrów."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### _Feedforward_\n",
+ "\n",
+ "Mając daną $n$-warstwową sieć neuronową oraz jej parametry $\\Theta^{(1)}, \\ldots, \\Theta^{(L)} $ oraz $\\beta^{(1)}, \\ldots, \\beta^{(L)} $, obliczamy:\n",
+ "\n",
+ "$$a^{(l)} = g^{(l)}\\left( a^{(l-1)} \\Theta^{(l)} + \\beta^{(l)} \\right). $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Funkcje $g^{(l)}$ to **funkcje aktywacji**.
\n",
+ "Dla $i = 0$ przyjmujemy $a^{(0)} = x$ (wektor wierszowy cech) oraz $g^{(0)}(x) = x$ (identyczność)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* Parametry $\\Theta$ to wagi na połączeniach miedzy neuronami dwóch warstw.
\n",
+ "Rozmiar macierzy $\\Theta^{(l)}$, czyli macierzy wag na połączeniach warstw $a^{(l-1)}$ i $a^{(l)}$, to $\\dim(a^{(l-1)}) \\times \\dim(a^{(l)})$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Parametry $\\beta$ zastępują tutaj dodawanie kolumny z jedynkami do macierzy cech.
Macierz $\\beta^{(l)}$ ma rozmiar równy liczbie neuronów w odpowiedniej warstwie, czyli $1 \\times \\dim(a^{(l)})$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* **Klasyfikacja**: dla ostatniej warstwy $L$ (o rozmiarze równym liczbie klas) przyjmuje się $g^{(L)}(x) = \\mathop{\\mathrm{softmax}}(x)$.\n",
+ "* **Regresja**: pojedynczy neuron wyjściowy; funkcją aktywacji może wtedy być np. funkcja identycznościowa."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Jak uczyć sieci neuronowe?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* W poznanych do tej pory algorytmach (regresja liniowa, regresja logistyczna) do uczenia używaliśmy funkcji kosztu, jej gradientu oraz algorytmu gradientu prostego (GD/SGD)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* Dla sieci neuronowych potrzebowalibyśmy również znaleźć gradient funkcji kosztu."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Sprowadza się to do bardziej ogólnego problemu:
jak obliczyć gradient $\\nabla f(x)$ dla danej funkcji $f$ i wektora wejściowego $x$?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Pochodna funkcji\n",
+ "\n",
+ "* **Pochodna** mierzy, jak szybko zmienia się wartość funkcji względem zmiany jej argumentów:\n",
+ "\n",
+ "$$ \\frac{d f(x)}{d x} = \\lim_{h \\to 0} \\frac{ f(x + h) - f(x) }{ h } $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Pochodna cząstkowa i gradient\n",
+ "\n",
+ "* **Pochodna cząstkowa** mierzy, jak szybko zmienia się wartość funkcji względem zmiany jej *pojedynczego argumentu*."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* **Gradient** to wektor pochodnych cząstkowych:\n",
+ "\n",
+ "$$ \\nabla f = \\left( \\frac{\\partial f}{\\partial x_1}, \\ldots, \\frac{\\partial f}{\\partial x_n} \\right) $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "#### Gradient – przykłady"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "$$ f(x_1, x_2) = x_1 + x_2 \\qquad \\to \\qquad \\frac{\\partial f}{\\partial x_1} = 1, \\quad \\frac{\\partial f}{\\partial x_2} = 1, \\quad \\nabla f = (1, 1) $$ "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "$$ f(x_1, x_2) = x_1 \\cdot x_2 \\qquad \\to \\qquad \\frac{\\partial f}{\\partial x_1} = x_2, \\quad \\frac{\\partial f}{\\partial x_2} = x_1, \\quad \\nabla f = (x_2, x_1) $$ "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "$$ f(x_1, x_2) = \\max(x_1 + x_2) \\hskip{12em} \\\\\n",
+ "\\to \\qquad \\frac{\\partial f}{\\partial x_1} = \\mathbb{1}_{x \\geq y}, \\quad \\frac{\\partial f}{\\partial x_2} = \\mathbb{1}_{y \\geq x}, \\quad \\nabla f = (\\mathbb{1}_{x \\geq y}, \\mathbb{1}_{y \\geq x}) $$ "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Własności pochodnych cząstkowych\n",
+ "\n",
+ "Jezeli $f(x, y, z) = (x + y) \\, z$ oraz $x + y = q$, to:\n",
+ "$$f = q z,\n",
+ "\\quad \\frac{\\partial f}{\\partial q} = z,\n",
+ "\\quad \\frac{\\partial f}{\\partial z} = q,\n",
+ "\\quad \\frac{\\partial q}{\\partial x} = 1,\n",
+ "\\quad \\frac{\\partial q}{\\partial y} = 1 $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "#### Reguła łańcuchowa\n",
+ "\n",
+ "$$ \\frac{\\partial f}{\\partial x} = \\frac{\\partial f}{\\partial q} \\, \\frac{\\partial q}{\\partial x},\n",
+ "\\quad \\frac{\\partial f}{\\partial y} = \\frac{\\partial f}{\\partial q} \\, \\frac{\\partial q}{\\partial y} $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Propagacja wsteczna – prosty przykład"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# Dla ustalonego wejścia\n",
+ "x = -2; y = 5; z = -4"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "(3, -12)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Krok w przód\n",
+ "q = x + y\n",
+ "f = q * z\n",
+ "print(q, f)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[-4, -4, 3]\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Propagacja wsteczna dla f = q * z\n",
+ "# Oznaczmy symbolami `dfx`, `dfy`, `dfz`, `dfq` odpowiednio \n",
+ "# pochodne cząstkowe ∂f/∂x, ∂f/∂y, ∂f/∂z, ∂f/∂q\n",
+ "dfz = q\n",
+ "dfq = z\n",
+ "# Propagacja wsteczna dla q = x + y\n",
+ "dfx = 1 * dfq # z reguły łańcuchowej\n",
+ "dfy = 1 * dfq # z reguły łańcuchowej\n",
+ "print([dfx, dfy, dfz])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Właśnie tak wygląda obliczanie pochodnych metodą propagacji wstecznej!"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* Spróbujmy czegoś bardziej skomplikowanego:
metodą propagacji wstecznej obliczmy pochodną funkcji sigmoidalnej."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Propagacja wsteczna – funkcja sigmoidalna"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Funkcja sigmoidalna:\n",
+ "\n",
+ "$$f(\\theta,x) = \\frac{1}{1+e^{-(\\theta_0 x_0 + \\theta_1 x_1 + \\theta_2)}}$$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "$$\n",
+ "\\begin{array}{lcl}\n",
+ "f(x) = \\frac{1}{x} \\quad & \\rightarrow & \\quad \\frac{df}{dx} = -\\frac{1}{x^2} \\\\\n",
+ "f_c(x) = c + x \\quad & \\rightarrow & \\quad \\frac{df}{dx} = 1 \\\\\n",
+ "f(x) = e^x \\quad & \\rightarrow & \\quad \\frac{df}{dx} = e^x \\\\\n",
+ "f_a(x) = ax \\quad & \\rightarrow & \\quad \\frac{df}{dx} = a \\\\\n",
+ "\\end{array}\n",
+ "$$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[0.3932238664829637, -0.5898357997244456]\n",
+ "[-0.19661193324148185, -0.3932238664829637, 0.19661193324148185]\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Losowe wagi i dane\n",
+ "w = [2,-3,-3]\n",
+ "x = [-1, -2]\n",
+ "\n",
+ "# Krok w przód\n",
+ "dot = w[0]*x[0] + w[1]*x[1] + w[2]\n",
+ "f = 1.0 / (1 + math.exp(-dot)) # funkcja sigmoidalna\n",
+ "\n",
+ "# Krok w tył\n",
+ "ddot = (1 - f) * f # pochodna funkcji sigmoidalnej\n",
+ "dx = [w[0] * ddot, w[1] * ddot]\n",
+ "dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]\n",
+ "\n",
+ "print(dx)\n",
+ "print(dw)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Obliczanie gradientów – podsumowanie"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Gradient $f$ dla $x$ mówi, jak zmieni się całe wyrażenie przy zmianie wartości $x$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* Gradienty łączymy, korzystając z **reguły łańcuchowej**."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* W kroku \"wstecz\" gradienty informują, które części grafu powinny być zwiększone lub zmniejszone (i z jaką siłą), aby zwiększyć wartość na wyjściu."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* W kontekście implementacji chcemy dzielić funkcję $f$ na części, dla których można łatwo obliczyć gradienty."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## 12.2. Uczenie wielowarstwowych sieci neuronowych metodą propagacji wstecznej"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Mając algorytm SGD oraz gradienty wszystkich wag, moglibyśmy trenować każdą sieć."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Niech $\\Theta = (\\Theta^{(1)},\\Theta^{(2)},\\Theta^{(3)},\\beta^{(1)},\\beta^{(2)},\\beta^{(3)})$\n",
+ "* Funkcja sieci neuronowej z grafiki:\n",
+ "$$\\small h_\\Theta(x) = \\tanh(\\tanh(\\tanh(x\\Theta^{(1)}+\\beta^{(1)})\\Theta^{(2)} + \\beta^{(2)})\\Theta^{(3)} + \\beta^{(3)})$$\n",
+ "* Funkcja kosztu dla regresji:\n",
+ "$$J(\\Theta) = \\dfrac{1}{2m} \\sum_{i=1}^{m} (h_\\Theta(x^{(i)})- y^{(i)})^2 $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Jak obliczymy gradienty?\n",
+ "\n",
+ "$$\\nabla_{\\Theta^{(l)}} J(\\Theta) = ? \\quad \\nabla_{\\beta^{(l)}} J(\\Theta) = ?$$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### W kierunku propagacji wstecznej\n",
+ "\n",
+ "* Pewna (niewielka) zmiana wagi $\\Delta z^l_j$ dla $j$-ego neuronu w warstwie $l$ pociąga za sobą (niewielką) zmianę kosztu: \n",
+ "\n",
+ "$$\\frac{\\partial J(\\Theta)}{\\partial z^{l}_j} \\Delta z^{l}_j$$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Jeżeli $\\frac{\\partial J(\\Theta)}{\\partial z^{l}_j}$ jest duża, $\\Delta z^l_j$ ze znakiem przeciwnym zredukuje koszt.\n",
+ "* Jeżeli $\\frac{\\partial J(\\Theta)}{\\partial z^l_j}$ jest bliska zeru, koszt nie będzie mocno poprawiony."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Definiujemy błąd $\\delta^l_j$ neuronu $j$ w warstwie $l$: \n",
+ "\n",
+ "$$\\delta^l_j := \\dfrac{\\partial J(\\Theta)}{\\partial z^l_j}$$ \n",
+ "$$\\delta^l := \\nabla_{z^l} J(\\Theta) \\quad \\textrm{ (zapis wektorowy)} $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Podstawowe równania propagacji wstecznej\n",
+ "\n",
+ "$$\n",
+ "\\begin{array}{rcll}\n",
+ "\\delta^L & = & \\nabla_{a^L}J(\\Theta) \\odot { \\left( g^{L} \\right) }^{\\prime} \\left( z^L \\right) & (BP1) \\\\[2mm]\n",
+ "\\delta^{l} & = & \\left( \\left( \\Theta^{l+1} \\right) \\! ^\\top \\, \\delta^{l+1} \\right) \\odot {{ \\left( g^{l} \\right) }^{\\prime}} \\left( z^{l} \\right) & (BP2)\\\\[2mm]\n",
+ "\\nabla_{\\beta^l} J(\\Theta) & = & \\delta^l & (BP3)\\\\[2mm]\n",
+ "\\nabla_{\\Theta^l} J(\\Theta) & = & a^{l-1} \\odot \\delta^l & (BP4)\\\\\n",
+ "\\end{array}\n",
+ "$$\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "#### (BP1)\n",
+ "$$ \\delta^L_j \\; = \\; \\frac{ \\partial J }{ \\partial a^L_j } \\, g' \\!\\! \\left( z^L_j \\right) $$\n",
+ "$$ \\delta^L \\; = \\; \\nabla_{a^L}J(\\Theta) \\odot { \\left( g^{L} \\right) }^{\\prime} \\left( z^L \\right) $$\n",
+ "Błąd w ostatniej warstwie jest iloczynem szybkości zmiany kosztu względem $j$-tego wyjścia i szybkości zmiany funkcji aktywacji w punkcie $z^L_j$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "#### (BP2)\n",
+ "$$ \\delta^{l} \\; = \\; \\left( \\left( \\Theta^{l+1} \\right) \\! ^\\top \\, \\delta^{l+1} \\right) \\odot {{ \\left( g^{l} \\right) }^{\\prime}} \\left( z^{l} \\right) $$\n",
+ "Aby obliczyć błąd w $l$-tej warstwie, należy przemnożyć błąd z następnej ($(l+1)$-szej) warstwy przez transponowany wektor wag, a uzyskaną macierz pomnożyć po współrzędnych przez szybkość zmiany funkcji aktywacji w punkcie $z^l$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "#### (BP3)\n",
+ "$$ \\nabla_{\\beta^l} J(\\Theta) \\; = \\; \\delta^l $$\n",
+ "Błąd w $l$-tej warstwie jest równy wartości gradientu funkcji kosztu."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "#### (BP4)\n",
+ "$$ \\nabla_{\\Theta^l} J(\\Theta) \\; = \\; a^{l-1} \\odot \\delta^l $$\n",
+ "Gradient funkcji kosztu względem wag $l$-tej warstwy można obliczyć jako iloczyn po współrzędnych $a^{l-1}$ przez $\\delta^l$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Algorytm propagacji wstecznej"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Dla pojedynczego przykładu $(x,y)$:\n",
+ "1. **Wejście**: Ustaw aktywacje w warstwie cech $a^{(0)}=x$ \n",
+ "2. **Feedforward:** dla $l=1,\\dots,L$ oblicz \n",
+ "$z^{(l)} = a^{(l-1)} \\Theta^{(l)} + \\beta^{(l)}$ oraz $a^{(l)}=g^{(l)} \\!\\! \\left( z^{(l)} \\right)$\n",
+ "3. **Błąd wyjścia $\\delta^{(L)}$:** oblicz wektor $$\\delta^{(L)}= \\nabla_{a^{(L)}}J(\\Theta) \\odot {g^{\\prime}}^{(L)} \\!\\! \\left( z^{(L)} \\right) $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "4. **Propagacja wsteczna błędu:** dla $l = L-1,L-2,\\dots,1$ oblicz $$\\delta^{(l)} = \\delta^{(l+1)}(\\Theta^{(l+1)})^T \\odot {g^{\\prime}}^{(l)} \\!\\! \\left( z^{(l)} \\right) $$\n",
+ "5. **Gradienty:** \n",
+ " * $\\dfrac{\\partial}{\\partial \\Theta_{ij}^{(l)}} J(\\Theta) = a_i^{(l-1)}\\delta_j^{(l)} \\textrm{ oraz } \\dfrac{\\partial}{\\partial \\beta_{j}^{(l)}} J(\\Theta) = \\delta_j^{(l)}$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "W naszym przykładzie:\n",
+ "\n",
+ "$$\\small J(\\Theta) = \\frac{1}{2} \\left( a^{(L)} - y \\right) ^2 $$\n",
+ "$$\\small \\dfrac{\\partial}{\\partial a^{(L)}} J(\\Theta) = a^{(L)} - y$$\n",
+ "\n",
+ "$$\\small \\tanh^{\\prime}(x) = 1 - \\tanh^2(x)$$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Algorytm SGD z propagacją wsteczną\n",
+ "\n",
+ "Pojedyncza iteracja:\n",
+ "1. Dla parametrów $\\Theta = (\\Theta^{(1)},\\ldots,\\Theta^{(L)})$ utwórz pomocnicze macierze zerowe $\\Delta = (\\Delta^{(1)},\\ldots,\\Delta^{(L)})$ o takich samych wymiarach (dla uproszczenia opuszczono wagi $\\beta$)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "2. Dla $m$ przykładów we wsadzie (*batch*), $i = 1,\\ldots,m$:\n",
+ " * Wykonaj algortym propagacji wstecznej dla przykładu $(x^{(i)}, y^{(i)})$ i przechowaj gradienty $\\nabla_{\\Theta}J^{(i)}(\\Theta)$ dla tego przykładu;\n",
+ " * $\\Delta := \\Delta + \\dfrac{1}{m}\\nabla_{\\Theta}J^{(i)}(\\Theta)$\n",
+ "3. Wykonaj aktualizację wag: $\\Theta := \\Theta - \\alpha \\Delta$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Propagacja wsteczna – podsumowanie\n",
+ "\n",
+ "* Algorytm pierwszy raz wprowadzony w latach 70. XX w.\n",
+ "* W 1986 David Rumelhart, Geoffrey Hinton i Ronald Williams pokazali, że jest znacznie szybszy od wcześniejszych metod.\n",
+ "* Obecnie najpopularniejszy algorytm uczenia sieci neuronowych."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## 12.3. Przykłady implementacji wielowarstwowych sieci neuronowych"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "source": [
+ "### Uwaga!\n",
+ "\n",
+ "Poniższe przykłady wykorzystują interfejs [Keras](https://keras.io), który jest częścią biblioteki [TensorFlow](https://www.tensorflow.org).\n",
+ "\n",
+ "Aby uruchomić TensorFlow w środowisku Jupyter, należy wykonać następujące czynności:\n",
+ "\n",
+ "#### Przed pierwszym uruchomieniem (wystarczy wykonać tylko raz)\n",
+ "\n",
+ "Instalacja biblioteki TensorFlow w środowisku Anaconda:\n",
+ "\n",
+ "1. Uruchom *Anaconda Navigator*\n",
+ "1. Wybierz kafelek *CMD.exe Prompt*\n",
+ "1. Kliknij przycisk *Launch*\n",
+ "1. Pojawi się konsola. Wpisz następujące polecenia, każde zatwierdzając wciśnięciem klawisza Enter:\n",
+ "```\n",
+ "conda create -n tf tensorflow\n",
+ "conda activate tf\n",
+ "conda install pandas matplotlib\n",
+ "jupyter notebook\n",
+ "```\n",
+ "\n",
+ "#### Przed każdym uruchomieniem\n",
+ "\n",
+ "Jeżeli chcemy korzystać z biblioteki TensorFlow, to środowisko Jupyter Notebook należy uruchomić w następujący sposób:\n",
+ "\n",
+ "1. Uruchom *Anaconda Navigator*\n",
+ "1. Wybierz kafelek *CMD.exe Prompt*\n",
+ "1. Kliknij przycisk *Launch*\n",
+ "1. Pojawi się konsola. Wpisz następujące polecenia, każde zatwierdzając wciśnięciem klawisza Enter:\n",
+ "```\n",
+ "conda activate tf\n",
+ "jupyter notebook\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Przykład: MNIST\n",
+ "\n",
+ "_Modified National Institute of Standards and Technology database_"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "* Zbiór cyfr zapisanych pismem odręcznym\n",
+ "* 60 000 przykładów uczących, 10 000 przykładów testowych\n",
+ "* Rozdzielczość każdego przykładu: 28 × 28 = 784 piksele"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from tensorflow import keras\n",
+ "from tensorflow.keras.datasets import mnist\n",
+ "from tensorflow.keras.layers import Dense, Dropout\n",
+ "\n",
+ "# załaduj dane i podziel je na zbiory uczący i testowy\n",
+ "(x_train, y_train), (x_test, y_test) = mnist.load_data()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from matplotlib import pyplot as plt\n",
+ "\n",
+ "def draw_examples(examples, captions=None):\n",
+ " plt.figure(figsize=(16, 4))\n",
+ " m = len(examples)\n",
+ " for i, example in enumerate(examples):\n",
+ " plt.subplot(100 + m * 10 + i + 1)\n",
+ " plt.imshow(example, cmap=plt.get_cmap('gray'))\n",
+ " plt.show()\n",
+ " if captions is not None:\n",
+ " print(6 * ' ' + (10 * ' ').join(str(captions[i]) for i in range(m)))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAA54AAACOCAYAAABZsdfhAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAcc0lEQVR4nO3deZBU5bnH8ecVAVFERJAQEYYIQYhsAgpeC0gAV2SRgBL2GKHEBVJCgUoMxiCIStUAEkUujCAlWGHVSJCwSFRCgQTvBVkGjMDgBAYF2Qxc9Nw/6O15menpnjnn9Onu76eKmvOb093nnebh9Lycfvo1juMIAAAAAABeuSTVAwAAAAAAZDYmngAAAAAATzHxBAAAAAB4ioknAAAAAMBTTDwBAAAAAJ5i4gkAAAAA8FS5Jp7GmLuMMbuNMXuNMePcGhTSD7WAMGoBItQBoqgFiFAHiKIWspcp6zqexpgKIrJHRLqKSIGIbBaRfo7jfO7e8JAOqAWEUQsQoQ4QRS1AhDpAFLWQ3S4tx31vEZG9juN8ISJijFkoIj1EpMTCMcaUbZaLVDrqOE6tUm5DLWQBx3FMAjdLqhaog7TEOQFh1AJEJKHXB+ogO3BOQFixtVCet9peJyIHY3JB6HvILPsTuA21gDBqIfNxTkAYtYBEUQfZgXMCwoqthfJc8Szuf7cu+h8JY8wwERlWjuMg+KgFhJVaC9RBVuCcgDBqASLUAaKohSxWnolngYhcH5PrishX9o0cx5klIrNEuFSewagFhJVaC9RBVuCcgDBqASLUAaKohSxWnrfabhaRRsaYBsaYSiLyoIiscGdYSDPUAsKoBYhQB4iiFiBCHSCKWshiZb7i6TjOeWPMYyKySkQqiMgcx3F2uDYypA1qAWHUAkSoA0RRCxChDhBFLWS3Mi+nUqaDcak8HX3qOE4btx+UWkg/CX6qbVKog7TEOQFh1AJEhNcHRHBOQFixtVCet9oCAAAAAFAqJp4AAAAAAE8x8QQAAAAAeIqJJwAAAADAU0w8AQAAAACeYuIJAAAAAPAUE08AAAAAgKeYeAIAAAAAPMXEEwAAAADgqUtTPQAgk7Vu3Vrlxx57TOVBgwapPG/ePJWnT5+u8tatW10cHQAAALySm5ur8hNPPBHZ3r59u9rXrVs3lffv3+/dwFKEK54AAAAAAE8x8QQAAAAAeIq32iaoQoUKKl911VUJ39d+e+Xll1+ucuPGjVV+9NFHVX755ZdV7tevn8r/+c9/VJ48eXJk+7nnnkt4nCi/li1bqrx69WqVq1WrprLjOCoPHDhQ5e7du6t8zTXXlHOEyASdO3dWecGCBSp37NhR5d27d3s+Jnhj/PjxKtvn9Esu0f9/3KlTJ5U//PBDT8YFwB1XXnmlylWrVlX53nvvVblWrVoqT506VeWzZ8+6ODokKycnR+UBAwao/MMPP0S2mzRpovbdeOONKvNWWwAAAAAAksTEEwAAAADgKSaeAAAAAABPZU2PZ7169VSuVKmSyrfddpvKt99+u8rVq1dXuXfv3q6NraCgQOVp06ap3KtXL5VPnjyp8meffaYyPT3+uuWWWyLbixcvVvvsXmC7p9P+uzx37pzKdk9nu3btVLaXV7Hvn+k6dOgQ2bafq6VLl/o9HN+0bdtW5c2bN6doJHDbkCFDVB47dqzKsf1BxbHPMQBSL7bvz/433b59e5VvuummpB67Tp06Kscu1wH/FRUVqbxhwwaV7c/uyDZc8QQAAAAAeIqJJwAAAADAU0w8AQAAAACeytgeT3s9xbVr16qczDqcbrN7dOx12k6dOqWyvUZfYWGhyseOHVOZNfvcZa+7evPNN6v81ltvRbbtXovS5OfnqzxlyhSVFy5cqPLHH3+ssl07kyZNSur46S52zcJGjRqpfZnU42mv1digQQOV69evr7IxxvMxwRv23+Vll12WopGgLG699VaVY9fws9fX/dnPfhb3sUaPHq3yV199pbL9WRSxr0UiIps2bYo/WLjGXn9x1KhRKvfv3z+yXaVKFbXPPl8fPHhQZfuzIOy1H/v27avyzJkzVd61a1cJo4YXTp8+rXImrsVZHlzxBAAAAAB4ioknAAAAAMBTTDwBAAAAAJ7K2B7PAwcOqPz111+r7GaPp91Hcfz4cZV//vOfq2yvtTh//nzXxgL3vf766yr369fPtce2+0WrVq2qsr0ma2xPo4hI8+bNXRtLOho0aFBke+PGjSkcibfs3uGHH35YZbu3i56e9NGlSxeVH3/88bi3t/9uu3XrpvLhw4fdGRgS8sADD6icm5urcs2aNSPbdi/f+vXrVa5Vq5bKL730Utxj249n3//BBx+Me38kzv6d8cUXX1TZroMrr7wy4ce2P+vhzjvvVLlixYoq2+eA2BorLsNf1atXV7lFixapGUhAccUTAAAAAOApJp4AAAAAAE8x8QQAAAAAeCpjezy/+eYblceMGaOy3Rfzz3/+U+Vp06bFffxt27ZFtrt27ar22Wv42Gt1jRw5Mu5jI7Vat26t8r333qtyvDUS7Z7Md999V+WXX35ZZXtdNrsO7TVaf/GLXyQ8lmxgr2+ZqWbPnh13v90jhOCy116cO3euyqV9/oDd98cacd669FL9a1KbNm1UfuONN1S2133esGFDZPv5559X+z766COVK1eurPI777yj8h133BF3rFu2bIm7H2XXq1cvlX/zm9+U+bH27dunsv07pL2OZ8OGDct8LPjPPgfUq1cv4fu2bdtWZbufNxPO99nxWxsAAAAAIGVKnXgaY+YYY44YY7bHfK+GMWa1MSY/9PVqb4eJIKAWEEYtQIQ6QBS1gDBqASLUAYqXyBXPPBG5y/reOBFZ4zhOIxFZE8rIfHlCLeCCPKEWQB0gKk+oBVyQJ9QCqAMUo9QeT8dxNhhjcqxv9xCRTqHtN0VkvYiMdXNgblu2bJnKa9euVfnkyZMq2+vuPPTQQyrH9urZPZ22HTt2qDxs2LC4tw+qTKkFW8uWLVVevXq1ytWqVVPZcRyVV65cGdm21/js2LGjyuPHj1fZ7t0rKipS+bPPPlP5hx9+UNnuP7XXBd26dat4IVW1YK9bWrt2bTcfPrBK6/uza9YvmXpO8NLgwYNV/vGPfxz39vZaj/PmzXN7SK7I1FoYMGCAyqX1W9v/FmPXdzxx4kTc+9prQZbW01lQUKDym2++Gff2fsnEWujTp09St//yyy9V3rx5c2R77Fj9Y9s9nbYmTZokdeygyMQ6SIT92R15eXkqT5gwocT72vuOHz+u8owZM8oxsmAoa49nbcdxCkVEQl+vdW9ISDPUAsKoBYhQB4iiFhBGLUCEOsh6nn+qrTFmmIik5yU+uIpagAh1gChqAWHUAkSoA0RRC5mprFc8Dxtj6oiIhL4eKemGjuPMchynjeM4bUq6DdIatYCwhGqBOsh4nBMQRi0gjNcHiHBOyHplveK5QkQGi8jk0Nflro3IJ6X1Wnz77bdx9z/88MOR7UWLFql9dh9ehku7WvjpT3+qsr3Gq91Pd/ToUZULCwtVju2rOXXqlNr3l7/8JW4urypVqqj85JNPqty/f39Xj1cKz2vhnnvuUdn++TOF3bvaoEGDuLc/dOiQl8NJVtqdE7xUs2ZNlX/961+rbL9e2D09f/zjHz0Zl0/SrhbstTaffvpple0e/5kzZ6ps9/GX9rtGrGeeeSbh24qIPPHEEyrbnxEQMGlXC7Fif+cTufizOj744AOV9+7dq/KRIyXOr0qVYZ9lkNZ1UBb2OSVej2c2SGQ5lbdFZKOINDbGFBhjHpILBdPVGJMvIl1DGRmOWkAYtQAR6gBR1ALCqAWIUAcoXiKfatuvhF2dXR4LAo5aQBi1ABHqAFHUAsKoBYhQByheWXs8AQAAAABIiOefapuu7Pdgt27dWuXY9Rm7dOmi9tnv9UdqVa5cWeXYNVhFLu4btNd0HTRokMpbtmxROUh9hvXq1Uv1EDzVuHHjEvfZ6+WmM7tG7R6fPXv2qGzXLFIrJycnsr148eKk7jt9+nSV161b58aQUIJnn31WZbun89y5cyqvWrVKZXtNxu+++67EY1122WUq2+t02udvY4zKdr/v8uUZ3x4XGPbajH726bVv3963Y8F7l1wSveaXZZ8JIyJc8QQAAAAAeIyJJwAAAADAU0w8AQAAAACeosezBKdPn1bZXsNp69atke033nhD7bN7cuyewFdffVVle10wuKtVq1Yq2z2dth49eqj84Ycfuj4muG/z5s2pHkKJqlWrpvJdd92l8oABA1S2e79s9rpg9tqPSK3Yv9/mzZvHve2aNWtUzs3N9WRMuKB69eoqjxgxQmX79dju6ezZs2dSx2vYsGFke8GCBWqf/dkRtj//+c8qT5kyJaljIzhi11y94oorkrpvs2bN4u7/5JNPVN64cWNSjw9/xfZ1ZuPv/1zxBAAAAAB4ioknAAAAAMBTvNU2Qfv27VN5yJAhke25c+eqfQMHDoyb7bdZzJs3T+XCwsKyDhPFmDp1qsr2R9Tbb6UN8ltrYz+GWyQ7P4q7JDVq1CjX/Vu0aKGyXSf2skl169ZVuVKlSpHt/v37q33235u95MKmTZtUPnv2rMqXXqpP1Z9++qkgOOy3X06ePLnE23700UcqDx48WOVvv/3WtXHhYrH/TkVEatasGff2sW+RFBG59tprVR46dKjK3bt3V/mmm26KbFetWlXts99mZ+e33npLZbsFCKlz+eWXq9y0aVOVf//736scr8Un2dd1e2kXuwa///77uPcHUokrngAAAAAATzHxBAAAAAB4ioknAAAAAMBT9HiW0dKlSyPb+fn5ap/dU9i5c2eVX3jhBZXr16+v8sSJE1U+dOhQmceZjbp166Zyy5YtVbb7aFasWOH1kFxj937YP8u2bdt8HI3/7N7I2J//tddeU/uefvrppB7bXvbC7vE8f/68ymfOnFH5888/j2zPmTNH7bOXVLL7iA8fPqxyQUGBylWqVFF5165dgtTJyclRefHixQnf94svvlDZ/ruHt86dO6dyUVGRyrVq1VL5X//6l8rJLn8Q24934sQJta9OnToqHz16VOV33303qWPBPRUrVlTZXpbN/jdv/13ar1WxdWAvd2Ivr2X3j9rsnv/7779fZXtJJrvmgVTiiicAAAAAwFNMPAEAAAAAnmLiCQAAAADwFD2eLti+fbvKffv2Vfm+++5T2V73c/jw4So3atRI5a5du5Z3iFnF7oez1207cuSIyosWLfJ8TImqXLmyyhMmTIh7+7Vr16r81FNPuT2kQBkxYoTK+/fvj2zfdttt5XrsAwcOqLxs2TKVd+7cqfI//vGPch0v1rBhw1S2+8zsvkCk1tixY1VOZj3deGt8wnvHjx9X2V6D9b333lPZXh/YXtN7+fLlKufl5an8zTffRLYXLlyo9tl9gfZ++Mf+PcHuu1yyZEnc+z/33HMq26/NH3/8cWTbrin7trFrvxbHfn2YNGmSyqW9ltnrRMNfseu2lvba0aFDB5VnzJjhyZj8xBVPAAAAAICnmHgCAAAAADzFxBMAAAAA4Cl6PD1g95DMnz9f5dmzZ6tsr8lkv6e7U6dOKq9fv75c48t2dn9DYWFhikZycU/n+PHjVR4zZozK9vqOr7zyisqnTp1ycXTB9+KLL6Z6CK6w1/q1JbNOJNxnrwV8xx13JHxfuwdw9+7dbgwJLtm0aZPKdv9cecW+nnfs2FHts/u76OX2j71Op92jab/22lauXKny9OnTVbZ/D4ytq/fff1/ta9asmcr2uptTpkxR2e4B7dGjh8oLFixQ+W9/+5vK9uvmsWPHpCSZvjZ4KsT+uy9tXWB7jdamTZuqHLt+eLrgiicAAAAAwFNMPAEAAAAAnmLiCQAAAADwFD2eLmjevLnKv/zlL1Vu27atynZPp81+z/aGDRvKMTrYVqxYkbJj271idh/JAw88oLLdH9a7d29PxoVgW7p0aaqHkNU++OADla+++uq4t49d43XIkCFeDAlpInZdabun0+7vYh1P71SoUEHl559/XuXRo0erfPr0aZXHjRunsv13Zfd0tmnTRuXY9RdbtWql9uXn56v8yCOPqLxu3TqVq1WrprK9hnX//v1V7t69u8qrV6+Wkhw8eFDlBg0alHhblM1rr70W2R4+fHhS97XX/B41apQbQ/IVVzwBAAAAAJ5i4gkAAAAA8BQTTwAAAACAp+jxTFDjxo1VfuyxxyLb9jo7P/rRj5J67O+//15le11Juy8E8Rlj4uaePXuqPHLkSM/G8tvf/lbl3/3udypfddVVKtvrbw0aNMibgQFI2DXXXKNyaefkmTNnRrazbW1daKtWrUr1ECAX98bZPZ1nzpxR2e69s/u827Vrp/LQoUNVvvvuu1WO7fX9wx/+oPbNnTtXZbvP0nbixAmV//rXv8bN/fr1U/lXv/pViY9t/84C9+3atSvVQ0gprngCAAAAADxV6sTTGHO9MWadMWanMWaHMWZk6Ps1jDGrjTH5oa/xP+YPaY9agAh1gChqAWHUAkSoA0RRCyhOIlc8z4vIk47jNBGRdiLyqDGmqYiME5E1juM0EpE1oYzMRi1AhDpAFLWAMGoBItQBoqgFXMTY60iVegdjlovIjNCfTo7jFBpj6ojIesdxGpdy3+QO5iO7L9N+T3xsT6eISE5OTpmPtWXLFpUnTpyocirXmSzGp47jtCluR1BroU+fPiq//fbbKts9ta+//rrKc+bMUfnrr79W2e7tGDhwYGS7RYsWal/dunVVPnDggMqx6/2JiOTm5sbdn0qO45jivh/UOkgnixYtUrlv374qDx48WOV58+Z5PqY40u6ckCy758pei7O0Hs+f/OQnke39+/e7Nq4AyvhaKK8777wzsv3++++rffbvX3Xq1FG5qKjIu4G5LOivD/ZnZ9SqVUvls2fPqmz34V1xxRUqN2zYMKnjT5gwIbI9adIktc/+nSTNcU5Iwp49e1S+4YYb4t7+kkv09UK7Dvft2+fOwNxRbC0k1eNpjMkRkVYisklEajuOUygiEvp6rQuDRJqgFiBCHSCKWkAYtQAR6gBR1ALCEv5UW2NMVRFZLCKjHMc5YX9SaJz7DRORYaXeEGmDWoAIdYAoagFh1AJEqANEUQuIldAVT2NMRblQNAscx1kS+vbh0CVyCX09Utx9HceZ5ThOm5IuvSO9UAsQoQ4QRS0gjFqACHWAKGoBtlKveJoL/zXx3yKy03GcqTG7VojIYBGZHPq63JMRuqR27doqN23aVOUZM2aofOONN5b5WJs2bVL5pZdeUnn5cv1Upcs6nZlSCxUqVFB5xIgRKvfu3Vtle82sRo0aJXysTz75ROV169ap/Oyzzyb8WEGRKXUQZHbvl93XERSZUgstW7ZUuUuXLirb5+hz586p/Oqrr6p8+PBh9waXJjKlFtwW2++bDYJaB//+979Vtns8K1eurLL9eQ02u193w4YNKi9btkzlL7/8MrKdYT2dJQpqLQTJjh07VC7tfJEu84V4Enmr7X+JyEAR+V9jzLbQ956WCwXzjjHmIRE5ICJ9ir87Mgi1ABHqAFHUAsKoBYhQB4iiFnCRUieejuN8JCIlvSG7s7vDQZBRCxChDhBFLSCMWoAIdYAoagHFCeb7twAAAAAAGSPhT7UNuho1aqhsr81o9/CUt+8itnfvlVdeUftWrVql8nfffVeuYyE5GzduVHnz5s0qt23bNu797TVd7f5gW+w6nwsXLlT7Ro4cGfe+QCLat2+vcl5eXmoGkqGqV6+usn0OsB06dEjl0aNHuz0kZIi///3vkW27VzsT+rXSRYcOHVTu2bOnyjfffLPKR47oz7ux1/c+duyYynbfN5CIWbNmqXzfffelaCT+4YonAAAAAMBTTDwBAAAAAJ5i4gkAAAAA8FRa9Xjeeuutke0xY8aofbfccovK1113XbmOdebMGZWnTZum8gsvvBDZPn36dLmOBXcVFBSofP/996s8fPhwlcePH5/U4+fm5qr8pz/9KbK9d+/epB4LKM6F5c8ApLvt27dHtvPz89U++7MmbrjhBpWLioq8G1iWOXnypMrz58+PmwE/fP755yrv3LlT5SZNmvg5HF9wxRMAAAAA4CkmngAAAAAAT6XVW2179epV7HYi7MvZ7733nsrnz59X2V4i5fjx40kdD8FRWFio8oQJE+JmwG8rV65UuU+fPikaSXbatWuXyrHLZYmI3H777X4OBxkqtkVHRGT27NkqT5w4UeXHH39cZfv3GADpbf/+/So3a9YsRSPxD1c8AQAAAACeYuIJAAAAAPAUE08AAAAAgKeM4zj+HcwY/w4Gt3zqOE4btx+UWkg/juO4vsYHdZCWOCcgjFpIQrVq1VR+5513VO7SpYvKS5YsUXno0KEqB2kpN14fEMI5AWHF1gJXPAEAAAAAnmLiCQAAAADwFBNPAAAAAICn0modTwAAgHR04sQJlfv27auyvY7nI488orK95jTregJIN1zxBAAAAAB4ioknAAAAAMBTTDwBAAAAAJ5iHU+UhjWZICKs04YIzgkIoxYgIrw+IIJzAsJYxxMAAAAA4D8mngAAAAAATzHxBAAAAAB4yu91PI+KyH4RqRnaDiLGptX36HGDXgtBHZcIdeA3xqZRC8FELfgnqOMSoQ78xtg0aiGYAlMLvn64UOSgxmzxovnYDYzNX0H9mYI6LpFgj62sgvwzMTZ/BflnYmz+CurPFNRxiQR7bGUV5J+JsfkryD8TY0sMb7UFAAAAAHiKiScAAAAAwFOpmnjOStFxE8HY/BXUnymo4xIJ9tjKKsg/E2PzV5B/Jsbmr6D+TEEdl0iwx1ZWQf6ZGJu/gvwzMbYEpKTHEwAAAACQPXirLQAAAADAU75OPI0xdxljdhtj9hpjxvl57BLGM8cYc8QYsz3mezWMMauNMfmhr1enYFzXG2PWGWN2GmN2GGNGBmVsbglSLQS1DkLjoBb8HUsga4E68H0sgayD0DioBX/HQi2kELWQ0LioA3/HEsg6CI0j8LXg28TTGFNBRF4VkbtFpKmI9DPGNPXr+CXIE5G7rO+NE5E1juM0EpE1oey38yLypOM4TUSknYg8GnqugjC2cgtgLeRJMOtAhFrwW54EsxaoA3/lSTDrQIRa8FueUAspQS0kjDrwV54Esw5E0qEWHMfx5Y+ItBeRVTH5KRF5yq/jxxlXjohsj8m7RaROaLuOiOwOwBiXi0jXII4tU2ohHeqAWqAWqAPqgFqgFqgFaoE6oA7StRb8fKvtdSJyMCYXhL4XNLUdxykUEQl9vTaVgzHG5IhIKxHZJAEbWzmkQy0E7rmmFlImUM81dZAygXuuqYWUCdxzTS2kTKCea+ogZQL3XAe1FvyceJpivsdH6sZhjKkqIotFZJTjOCdSPR4XUQtJohYgQh0gilpAGLUAEeoAUUGuBT8nngUicn1MrisiX/l4/EQdNsbUEREJfT2SikEYYyrKhaJZ4DjOkiCNzQXpUAuBea6phZQLxHNNHaRcYJ5raiHlAvNcUwspF4jnmjpIucA810GvBT8nnptFpJExpoExppKIPCgiK3w8fqJWiMjg0PZgufD+aF8ZY4yI/LeI7HQcZ2qQxuaSdKiFQDzX1EIgpPy5pg4CIRDPNbUQCIF4rqmFQEj5c00dBEIgnuu0qAWfm1zvEZE9IrJPRJ5JVWNrzHjeFpFCEfk/ufA/Kg+JyDVy4ROf8kNfa6RgXLfLhbcR/I+IbAv9uScIY8vEWghqHVAL1AJ1QB1QC9QCtUAtUAfUQabUggkNFAAAAAAAT/j5VlsAAAAAQBZi4gkAAAAA8BQTTwAAAACAp5h4AgAAAAA8xcQTAAAAAOApJp4AAAAAAE8x8QQAAAAAeIqJJwAAAADAU/8PSXdIbxrRR2wAAAAASUVORK5CYII=",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {
+ "needs_background": "light"
+ },
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " 5 0 4 1 9 2 1\n"
+ ]
+ }
+ ],
+ "source": [
+ "draw_examples(x_train[:7], captions=y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "60000 przykładów uczących\n",
+ "10000 przykładów testowych\n"
+ ]
+ }
+ ],
+ "source": [
+ "num_classes = 10\n",
+ "\n",
+ "x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
+ "x_test = x_test.reshape(10000, 784)\n",
+ "x_train = x_train.astype('float32')\n",
+ "x_test = x_test.astype('float32')\n",
+ "x_train /= 255\n",
+ "x_test /= 255\n",
+ "print('{} przykładów uczących'.format(x_train.shape[0]))\n",
+ "print('{} przykładów testowych'.format(x_test.shape[0]))\n",
+ "\n",
+ "# przekonwertuj wektory klas na binarne macierze klas\n",
+ "y_train = keras.utils.to_categorical(y_train, num_classes)\n",
+ "y_test = keras.utils.to_categorical(y_test, num_classes)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "scrolled": true,
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Model: \"sequential\"\n",
+ "_________________________________________________________________\n",
+ "Layer (type) Output Shape Param # \n",
+ "=================================================================\n",
+ "dense (Dense) (None, 512) 401920 \n",
+ "_________________________________________________________________\n",
+ "dense_1 (Dense) (None, 512) 262656 \n",
+ "_________________________________________________________________\n",
+ "dense_2 (Dense) (None, 10) 5130 \n",
+ "=================================================================\n",
+ "Total params: 669,706\n",
+ "Trainable params: 669,706\n",
+ "Non-trainable params: 0\n",
+ "_________________________________________________________________\n"
+ ]
+ }
+ ],
+ "source": [
+ "model = keras.Sequential()\n",
+ "model.add(Dense(512, activation='relu', input_shape=(784,)))\n",
+ "# model.add(Dropout(0.2))\n",
+ "model.add(Dense(512, activation='relu'))\n",
+ "# model.add(Dropout(0.2))\n",
+ "model.add(Dense(num_classes, activation='softmax'))\n",
+ "model.summary()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "(60000, 784) (60000, 10)\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(x_train.shape, y_train.shape)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Epoch 1/5\n",
+ "469/469 [==============================] - 14s 31ms/step - loss: 0.2219 - accuracy: 0.9312 - val_loss: 0.1181 - val_accuracy: 0.9620\n",
+ "Epoch 2/5\n",
+ "469/469 [==============================] - 14s 30ms/step - loss: 0.0831 - accuracy: 0.9746 - val_loss: 0.0928 - val_accuracy: 0.9726\n",
+ "Epoch 3/5\n",
+ "469/469 [==============================] - 15s 31ms/step - loss: 0.0538 - accuracy: 0.9835 - val_loss: 0.0892 - val_accuracy: 0.9762\n",
+ "Epoch 4/5\n",
+ "321/469 [===================>..........] - ETA: 4s - loss: 0.0389 - accuracy: 0.9881 ETA: - E -"
+ ]
+ }
+ ],
+ "source": [
+ "model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.RMSprop(), metrics=['accuracy'])\n",
+ "\n",
+ "model.fit(x_train, y_train, batch_size=128, epochs=5, verbose=1,\n",
+ " validation_data=(x_test, y_test))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 60,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Test loss: 0.08859136700630188\n",
+ "Test accuracy: 0.9771999716758728\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = model.evaluate(x_test, y_test, verbose=0)\n",
+ "\n",
+ "print('Test loss: {}'.format(score[0]))\n",
+ "print('Test accuracy: {}'.format(score[1]))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Warstwa _dropout_ to metoda regularyzacji, służy zapobieganiu nadmiernemu dopasowaniu sieci. Polega na tym, że część węzłów sieci jest usuwana w sposób losowy."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 61,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Model: \"sequential_22\"\n",
+ "_________________________________________________________________\n",
+ "Layer (type) Output Shape Param # \n",
+ "=================================================================\n",
+ "dense_62 (Dense) (None, 512) 401920 \n",
+ "_________________________________________________________________\n",
+ "dense_63 (Dense) (None, 512) 262656 \n",
+ "_________________________________________________________________\n",
+ "dense_64 (Dense) (None, 10) 5130 \n",
+ "=================================================================\n",
+ "Total params: 669,706\n",
+ "Trainable params: 669,706\n",
+ "Non-trainable params: 0\n",
+ "_________________________________________________________________\n",
+ "Epoch 1/5\n",
+ "469/469 [==============================] - 10s 20ms/step - loss: 0.2203 - accuracy: 0.9317 - val_loss: 0.0936 - val_accuracy: 0.9697\n",
+ "Epoch 2/5\n",
+ "469/469 [==============================] - 10s 21ms/step - loss: 0.0816 - accuracy: 0.9746 - val_loss: 0.0747 - val_accuracy: 0.9779\n",
+ "Epoch 3/5\n",
+ "469/469 [==============================] - 10s 20ms/step - loss: 0.0544 - accuracy: 0.9827 - val_loss: 0.0674 - val_accuracy: 0.9798\n",
+ "Epoch 4/5\n",
+ "469/469 [==============================] - 10s 22ms/step - loss: 0.0384 - accuracy: 0.9879 - val_loss: 0.0746 - val_accuracy: 0.9806\n",
+ "Epoch 5/5\n",
+ "469/469 [==============================] - 10s 22ms/step - loss: 0.0298 - accuracy: 0.9901 - val_loss: 0.0736 - val_accuracy: 0.9801\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 61,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Bez warstw Dropout\n",
+ "\n",
+ "num_classes = 10\n",
+ "\n",
+ "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
+ "\n",
+ "x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
+ "x_test = x_test.reshape(10000, 784)\n",
+ "x_train = x_train.astype('float32')\n",
+ "x_test = x_test.astype('float32')\n",
+ "x_train /= 255\n",
+ "x_test /= 255\n",
+ "\n",
+ "y_train = keras.utils.to_categorical(y_train, num_classes)\n",
+ "y_test = keras.utils.to_categorical(y_test, num_classes)\n",
+ "\n",
+ "model_no_dropout = keras.Sequential()\n",
+ "model_no_dropout.add(Dense(512, activation='relu', input_shape=(784,)))\n",
+ "model_no_dropout.add(Dense(512, activation='relu'))\n",
+ "model_no_dropout.add(Dense(num_classes, activation='softmax'))\n",
+ "model_no_dropout.summary()\n",
+ "\n",
+ "model_no_dropout.compile(loss='categorical_crossentropy',\n",
+ " optimizer=keras.optimizers.RMSprop(),\n",
+ " metrics=['accuracy'])\n",
+ "\n",
+ "model_no_dropout.fit(x_train, y_train,\n",
+ " batch_size=128,\n",
+ " epochs=5,\n",
+ " verbose=1,\n",
+ " validation_data=(x_test, y_test))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 62,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Test loss (no dropout): 0.07358124107122421\n",
+ "Test accuracy (no dropout): 0.9800999760627747\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Bez warstw Dropout\n",
+ "\n",
+ "score = model_no_dropout.evaluate(x_test, y_test, verbose=0)\n",
+ "\n",
+ "print('Test loss (no dropout): {}'.format(score[0]))\n",
+ "print('Test accuracy (no dropout): {}'.format(score[1]))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 63,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Model: \"sequential_23\"\n",
+ "_________________________________________________________________\n",
+ "Layer (type) Output Shape Param # \n",
+ "=================================================================\n",
+ "dense_65 (Dense) (None, 2500) 1962500 \n",
+ "_________________________________________________________________\n",
+ "dense_66 (Dense) (None, 2000) 5002000 \n",
+ "_________________________________________________________________\n",
+ "dense_67 (Dense) (None, 1500) 3001500 \n",
+ "_________________________________________________________________\n",
+ "dense_68 (Dense) (None, 1000) 1501000 \n",
+ "_________________________________________________________________\n",
+ "dense_69 (Dense) (None, 500) 500500 \n",
+ "_________________________________________________________________\n",
+ "dense_70 (Dense) (None, 10) 5010 \n",
+ "=================================================================\n",
+ "Total params: 11,972,510\n",
+ "Trainable params: 11,972,510\n",
+ "Non-trainable params: 0\n",
+ "_________________________________________________________________\n",
+ "Epoch 1/10\n",
+ "469/469 [==============================] - 129s 275ms/step - loss: 0.9587 - accuracy: 0.7005 - val_loss: 0.5066 - val_accuracy: 0.8566\n",
+ "Epoch 2/10\n",
+ "469/469 [==============================] - 130s 276ms/step - loss: 0.2666 - accuracy: 0.9234 - val_loss: 0.3376 - val_accuracy: 0.9024\n",
+ "Epoch 3/10\n",
+ "469/469 [==============================] - 130s 277ms/step - loss: 0.1811 - accuracy: 0.9477 - val_loss: 0.1678 - val_accuracy: 0.9520\n",
+ "Epoch 4/10\n",
+ "469/469 [==============================] - 134s 287ms/step - loss: 0.1402 - accuracy: 0.9588 - val_loss: 0.1553 - val_accuracy: 0.9576\n",
+ "Epoch 5/10\n",
+ "469/469 [==============================] - 130s 278ms/step - loss: 0.1153 - accuracy: 0.9662 - val_loss: 0.1399 - val_accuracy: 0.9599\n",
+ "Epoch 6/10\n",
+ "469/469 [==============================] - 130s 277ms/step - loss: 0.0956 - accuracy: 0.9711 - val_loss: 0.1389 - val_accuracy: 0.9612\n",
+ "Epoch 7/10\n",
+ "469/469 [==============================] - 131s 280ms/step - loss: 0.0803 - accuracy: 0.9761 - val_loss: 0.1008 - val_accuracy: 0.9724\n",
+ "Epoch 8/10\n",
+ "469/469 [==============================] - 134s 286ms/step - loss: 0.0685 - accuracy: 0.9797 - val_loss: 0.1137 - val_accuracy: 0.9679\n",
+ "Epoch 9/10\n",
+ "469/469 [==============================] - 130s 278ms/step - loss: 0.0602 - accuracy: 0.9819 - val_loss: 0.1064 - val_accuracy: 0.9700\n",
+ "Epoch 10/10\n",
+ "469/469 [==============================] - 129s 274ms/step - loss: 0.0520 - accuracy: 0.9843 - val_loss: 0.1095 - val_accuracy: 0.9698\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 63,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Więcej warstw, inna funkcja aktywacji\n",
+ "\n",
+ "num_classes = 10\n",
+ "\n",
+ "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
+ "\n",
+ "x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
+ "x_test = x_test.reshape(10000, 784)\n",
+ "x_train = x_train.astype('float32')\n",
+ "x_test = x_test.astype('float32')\n",
+ "x_train /= 255\n",
+ "x_test /= 255\n",
+ "\n",
+ "y_train = keras.utils.to_categorical(y_train, num_classes)\n",
+ "y_test = keras.utils.to_categorical(y_test, num_classes)\n",
+ "\n",
+ "model3 = Sequential()\n",
+ "model3.add(Dense(2500, activation='tanh', input_shape=(784,)))\n",
+ "model3.add(Dense(2000, activation='tanh'))\n",
+ "model3.add(Dense(1500, activation='tanh'))\n",
+ "model3.add(Dense(1000, activation='tanh'))\n",
+ "model3.add(Dense(500, activation='tanh'))\n",
+ "model3.add(Dense(num_classes, activation='softmax'))\n",
+ "model3.summary()\n",
+ "\n",
+ "model3.compile(loss='categorical_crossentropy',\n",
+ " optimizer=keras.optimizers.RMSprop(),\n",
+ " metrics=['accuracy'])\n",
+ "\n",
+ "model3.fit(x_train, y_train,\n",
+ " batch_size=128,\n",
+ " epochs=10,\n",
+ " verbose=1,\n",
+ " validation_data=(x_test, y_test))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 64,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Test loss: 0.10945799201726913\n",
+ "Test accuracy: 0.9697999954223633\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Więcej warstw, inna funkcja aktywacji\n",
+ "\n",
+ "score = model3.evaluate(x_test, y_test, verbose=0)\n",
+ "\n",
+ "print('Test loss: {}'.format(score[0]))\n",
+ "print('Test accuracy: {}'.format(score[1]))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### Przykład: 4-pikselowy aparat fotograficzny\n",
+ "\n",
+ "https://www.youtube.com/watch?v=ILsA4nyG7I0"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 65,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "def generate_example(description):\n",
+ " variant = random.choice([1, -1])\n",
+ " if description == 's': # solid\n",
+ " return (np.array([[ 1.0, 1.0], [ 1.0, 1.0]]) if variant == 1 else\n",
+ " np.array([[-1.0, -1.0], [-1.0, -1.0]]))\n",
+ " elif description == 'v': # vertical\n",
+ " return (np.array([[ 1.0, -1.0], [ 1.0, -1.0]]) if variant == 1 else\n",
+ " np.array([[-1.0, 1.0], [-1.0, 1.0]]))\n",
+ " elif description == 'd': # diagonal\n",
+ " return (np.array([[ 1.0, -1.0], [-1.0, 1.0]]) if variant == 1 else\n",
+ " np.array([[-1.0, 1.0], [ 1.0, -1.0]]))\n",
+ " elif description == 'h': # horizontal\n",
+ " return (np.array([[ 1.0, 1.0], [-1.0, -1.0]]) if variant == 1 else\n",
+ " np.array([[-1.0, -1.0], [ 1.0, 1.0]]))\n",
+ " else:\n",
+ " return np.array([[random.uniform(-1, 1), random.uniform(-1, 1)],\n",
+ " [random.uniform(-1, 1), random.uniform(-1, 1)]])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 67,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import random\n",
+ "\n",
+ "num_classes = 4\n",
+ "\n",
+ "trainset_size = 4000\n",
+ "testset_size = 1000\n",
+ "\n",
+ "y4_train = np.array([random.choice(['s', 'v', 'd', 'h']) for i in range(trainset_size)])\n",
+ "x4_train = np.array([generate_example(desc) for desc in y4_train])\n",
+ "\n",
+ "y4_test = np.array([random.choice(['s', 'v', 'd', 'h']) for i in range(testset_size)])\n",
+ "x4_test = np.array([generate_example(desc) for desc in y4_test])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 68,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAA6oAAACQCAYAAAABdZZIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAPGUlEQVR4nO3dX4ild33H8c+3uwYhVKTN+ie7WZOLZSUVWswkVXqTXliSYJNeaIm9UKSwKOSyFwsF7WUveiUVQy5C4o22eGGXNq1NhRALymZXVJK2SRdRs0nARkvaVTFm+fVizrMzpLOzc/Y858xvnvN6wZA5c56c5zkn7/09893zzKRaawEAAIBe/Np+HwAAAABsZ1AFAACgKwZVAAAAumJQBQAAoCsGVQAAALpiUAUAAKArhxf5l6vqN5L8TZJbk/wgyR+31v57h+1+kOR/k1xO8kZrbWOR/dIfLTDQAokO2KIFBlog0QF7t+g7qqeTfL21diLJ12e3r+b3W2u/I7LJ0gIDLZDogC1aYKAFEh2wR4sOqg8keXz2+eNJ/mjBx+Pg0gIDLZDogC1aYKAFEh2wR4sOqu9srb2SJLN/vuMq27Uk/1xV56vq1IL7pE9aYKAFEh2wRQsMtECiA/bomj+jWlX/kuRdO9z153Ps5/daay9X1TuSPFlV/9Fae/oq+zuVZIjxjjn2wT6rqtZaq2tstqcW1qGDO+44uE/rhRdeyK9+9av/9/WjR4/m0KFDqar/aq0ducbDzN3CjTfeeMd73/veMZ4CI7hWBxsbG+38+fOvXqMF54c1sKzzw1TXhPPnz+/3ISzNss4PmeiacFC/V1jl+UEHB9tuHVRr7bofuKqeT3J3a+2Vqnp3kqdaayev8e/8RZJLrbW/2sPjX//BsS9aazV2C1PtYJE/ez07efJkXnjhhe+11n577BY2NjbauXPnRjxaluXkyZN56qmncvPNN59P8odxflh7yzg/THVNqLrWTH+gLeX8MNU1YYrfKyzz/KCDg6eqzl/tZ5AXvfT3TJJPzD7/RJK/22HnN1bVrw+fJ/mDJM8uuF/6poU1dv/99yfJb85uamFN3X///Xn88eFHkHTAFVrA+WHNOT+wV4sOqn+Z5ENV9Z9JPjS7naq6uaqemG3zziT/WlXfTXI2yT+01v5pwf3SKS1w+vTpJHmbFtbb6dOn8+STTybJ+6ID4vzAFc4Pa875gb1a6NLfZZvq2/dTtoefQZrbVDvo+c/eona7jGMRU73Mb8qW1cJU14UpW8b5YaprwsQv/bUmzMH3CnM/5iRfsHXtYNF3VAEAAGBUBlUAAAC6YlAFAACgKwZVAAAAumJQBQAAoCsGVQAAALpiUAUAAKArBlUAAAC6YlAFAACgKwZVAAAAumJQBQAAoCsGVQAAALpiUAUAAKArBlUAAAC6YlAFAACgKwZVAAAAumJQBQAAoCsGVQAAALoyyqBaVfdU1fNVdaGqTu9wf1XV52b3f6+q3j/GfumPFhhogZm36YDEmsAV1gQGWmBXCw+qVXUoyeeT3Jvk9iQfq6rb37TZvUlOzD5OJfnCovulW1pgoIU1d/ny5SQ5Hh2wyZpAYk0gzg/szRjvqN6V5EJr7futtdeTfDnJA2/a5oEkX2ybvpXk7VX17hH2TX+0QJLcGC2svbNnzybJL3XAjDWBxJpAnB/YmzEG1aNJXtx2++Lsa/NuwzRogSS5IVpYey+99FKSvL7tSzpYb9YEEmsCcX5gbw6P8Bi1w9fadWyzuWHVqWy+vc80XFcLOpikhVs4fvz4Eg6LZWltx2Xe+YGBNYHEmrCWxjw/6GC6xnhH9WKSW7bdPpbk5evYJknSWnuktbbRWtsY4dhYvVFa0MGB93qW0MKRI0dGP1CW59ixY8nmu+tXvhTnh3VmTSCxJpBxzw86mK4xBtVnkpyoqtuq6oYkDyY586ZtziT5+Oy3d30gyWuttVdG2Df90QJJ8rNoYe3deeedSfJWHTBjTSCxJhDnB/Zm4Ut/W2tvVNVDSb6W5FCSR1trz1XVp2b3P5zkiST3JbmQ5OdJPrnofumWFhhoYc0dPnw4SX4UHbDJmkBiTSDOD+xNXeUa8S5UVb8Hx45aazv9PMFCptpBz3/2FlVV55dxCc7GxkY7d+7c2A/LEi2rhamuC1O2jPPDVNeEqtFfqp5YE+bge4W5H3OSL9i6djDGpb8AAAAwGoMqAAAAXTGoAgAA0BWDKgAAAF0xqAIAANAVgyoAAABdMagCAADQFYMqAAAAXTGoAgAA0BWDKgAAAF0xqAIAANAVgyoAAABdMagCAADQFYMqAAAAXTGoAgAA0BWDKgAAAF0xqAIAANAVgyoAAABdMagCAADQlVEG1aq6p6qer6oLVXV6h/vvrqrXquo7s4/PjLFf+qMFBlpg5m06ILEmcIU1gYEW2NXhRR+gqg4l+XySDyW5mOSZqjrTWvu3N236jdbahxfdH93TAgMtrLnLly8nyfEkt0cHWBPYZE3A+YE9GeMd1buSXGitfb+19nqSLyd5YITH5WDSAklyY7Sw9s6ePZskv9QBM9YEEmsCcX5gbxZ+RzXJ0SQvbrt9Mcnv7rDdB6vqu0leTvJnrbXndnqwqjqV5NQIx8X+GKWF7R0cP348P/zhD5dxrPuqqvb7EJbphiyhhdntkQ+VJXt92+fOD+vNmrBHrbX9PoSl+MpXvpKPfvSj1gTy0ksvJSOdH3QwXWMMqjudId68wn47yXtaa5eq6r4kX01yYqcHa609kuSRJKmqaa7U6+W6WtjewcbGhg6mYeEWrAmT4PzAwJqwZq4ygFsT1tCYLehgusa49Pdiklu23T6Wzb/1uKK19j+ttUuzz59I8paqummEfdMfLZBs/i2pFkg2310f6GC9WRPW3LFjxxJrAtECezPGoPpMkhNVdVtV3ZDkwSRntm9QVe+q2bU5VXXXbL8/GWHf9EcLJMnPogU2vVUHzFgT1tydd96ZWBOIFtibhS/9ba29UVUPJflakkNJHm2tPVdVn5rd/3CSjyT5dFW9keQXSR5sU/0BDLTAQAskyY+iAzZZE9bc4cOHE2sC0QJ7Uz3/93ad+cHTWhv9t1psbGy0c+fOjf2w+26KvwBkm/OttY2xH9SacCBpgSTLOT9MtYOevzdbVFVZE+aghbkfc5Iv2Lp2MMalvwAAADAagyoAAABdMagCAADQFYMqAAAAXTGoAgAA0BWDKgAAAF0xqAIAANAVgyoAAABdMagCAADQFYMqAAAAXTGoAgAA0BWDKgAAAF0xqAIAANAVgyoAAABdMagCAADQFYMqAAAAXTGoAgAA0BWDKgAAAF0ZZVCtqker6sdV9exV7q+q+lxVXaiq71XV+8fYL/3RATO3aoEZLZDE+YErrAkMtMCuxnpH9bEk9+xy/71JTsw+TiX5wkj7pT86IElejRbYpAUGOiCxJrBFC+xqlEG1tfZ0kp/usskDSb7YNn0rydur6t1j7Jvu6IAkuRQtsEkLDHRAYk1gixbY1ap+RvVokhe33b44+xrrRQcMtMBACyQ6YIsWGGhhzR1e0X5qh6+1HTesOpXNt/eZnuvq4Pjx48s8JvaHNYGBFkh0wBYtMNhTCzqYrlW9o3oxyS3bbh9L8vJOG7bWHmmtbbTWNlZyZKzSdXVw5MiRlRwcK2VNYKAFEh2wRQsM9tSCDqZrVYPqmSQfn/32rg8kea219sqK9k0/dMBACwy0QKIDtmiBgRbW3CiX/lbVl5LcneSmqrqY5LNJ3pIkrbWHkzyR5L4kF5L8PMknx9gvXfpmdEByW7TAJi0w0AGJNYEtWmBX1dqOl/13oar6PTh21Frb6ecJFrKxsdHOnTs39sPuu6rRX6qenF/GJTjWhANJCyRZzvlhqh30/L3ZoqrKmjAHLcz9mJN8wda1g1Vd+gsAAAB7YlAFAACgKwZVAAAAumJQBQAAoCsGVQAAALpiUAUAAKArBlUAAAC6YlAFAACgKwZVAAAAumJQBQAAoCsGVQAAALpiUAUAAKArBlUAAAC6YlAFAACgKwZVAAAAumJQBQAAoCsGVQAAALpiUAUAAKArowyqVfVoVf24qp69yv13V9VrVfWd2cdnxtgv/dEBM7dqgRktkMT5gSusCQy0wK4Oj/Q4jyX56yRf3GWbb7TWPjzS/ujXPdEByatJ/iRaQAtscX4gsSawRQvsapR3VFtrTyf56RiPxYGnA5LkUrTAJi0w0AGJNYEtWmBXq/wZ1Q9W1Xer6h+r6rdWuF/6ogMGWmCgBRIdsEULDLSwxsa69Pdavp3kPa21S1V1X5KvJjmx04ZVdSrJqdnNS0meX8kRJjdl8xKEqVnl83rPNe6/7g6qSgeLm0QLsSaMQQvzmWoLOpjPyl6vqlrFbrbTwny0MI5RWtjHDpIVvV7r2kG11kbZQ1XdmuTvW2vv28O2P0iy0Vrr5sRfVedaaxv7fRxjW/Xz0kG/tDAfLYy6v1ujhe7oYD5T7SDRwry0MOr+bo0WutPL81rJpb9V9a6a/VVAVd012+9PVrFv+qEDBlpgoAUSHbBFCwy0wCiX/lbVl5LcneSmqrqY5LNJ3pIkrbWHk3wkyaer6o0kv0jyYBvrrVy6oQMGWmCgBRIdsEULDLTAtYx26e9BV1WnWmuP7PdxjG2qz2tZpvx6Tfm5LcOUX68pP7dlmOrrNdXntSxTfr2m/NyWYcqv15Sf2zJM9fXq5XkZVAEAAOjKKv/3NAAAAHBNaz+oVtU9VfV8VV2oqtP7fTxjqapHq+rHVfXsfh/LQaEFBlNsQQfzm2IHiRauhxYYTLEFHcxvih0k/bWw1oNqVR1K8vkk9ya5PcnHqur2/T2q0TyW5J79PoiDQgsMJtzCY9HBnk24g0QLc9ECgwm38Fh0sGcT7iDprIW1HlST3JXkQmvt+62115N8OckD+3xMo2itPZ3kp/t9HAeIFhhMsgUdzG2SHSRauA5aYDDJFnQwt0l2kPTXwroPqkeTvLjt9sXZ11g/WmCgBRIdsEULDLRAooOVWfdBtXb4ml+DvJ60wEALJDpgixYYaIFEByuz7oPqxSS3bLt9LMnL+3Qs7C8tMNACiQ7YogUGWiDRwcqs+6D6TJITVXVbVd2Q5MEkZ/b5mNgfWmCgBRIdsEULDLRAooOVWetBtbX2RpKHknwtyb8n+dvW2nP7e1TjqKovJflmkpNVdbGq/nS/j6lnWmAw1RZ0MJ+pdpBoYV5aYDDVFnQwn6l2kPTXQrXmkmoAAAD6sdbvqAIAANAfgyoAAABdMagCAADQFYMqAAAAXTGoAgAA0BWDKgAAAF0xqAIAANAVgyoAAABd+T89XtOTyy31ugAAAABJRU5ErkJggg==",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {
+ "needs_background": "light"
+ },
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " s d h s d v v\n"
+ ]
+ }
+ ],
+ "source": [
+ "draw_examples(x4_train[:7], captions=y4_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 69,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "x4_train = x4_train.reshape(trainset_size, 4)\n",
+ "x4_test = x4_test.reshape(testset_size, 4)\n",
+ "x4_train = x4_train.astype('float32')\n",
+ "x4_test = x4_test.astype('float32')\n",
+ "\n",
+ "y4_train = np.array([{'s': 0, 'v': 1, 'd': 2, 'h': 3}[desc] for desc in y4_train])\n",
+ "y4_test = np.array([{'s': 0, 'v': 1, 'd': 2, 'h': 3}[desc] for desc in y4_test])\n",
+ "\n",
+ "y4_train = keras.utils.to_categorical(y4_train, num_classes)\n",
+ "y4_test = keras.utils.to_categorical(y4_test, num_classes)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 70,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Model: \"sequential_24\"\n",
+ "_________________________________________________________________\n",
+ "Layer (type) Output Shape Param # \n",
+ "=================================================================\n",
+ "dense_71 (Dense) (None, 4) 20 \n",
+ "_________________________________________________________________\n",
+ "dense_72 (Dense) (None, 4) 20 \n",
+ "_________________________________________________________________\n",
+ "dense_73 (Dense) (None, 8) 40 \n",
+ "_________________________________________________________________\n",
+ "dense_74 (Dense) (None, 4) 36 \n",
+ "=================================================================\n",
+ "Total params: 116\n",
+ "Trainable params: 116\n",
+ "Non-trainable params: 0\n",
+ "_________________________________________________________________\n"
+ ]
+ }
+ ],
+ "source": [
+ "model4 = keras.Sequential()\n",
+ "model4.add(Dense(4, activation='tanh', input_shape=(4,)))\n",
+ "model4.add(Dense(4, activation='tanh'))\n",
+ "model4.add(Dense(8, activation='relu'))\n",
+ "model4.add(Dense(num_classes, activation='softmax'))\n",
+ "model4.summary()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 71,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "model4.layers[0].set_weights(\n",
+ " [np.array([[ 1.0, 0.0, 1.0, 0.0],\n",
+ " [ 0.0, 1.0, 0.0, 1.0],\n",
+ " [ 1.0, 0.0, -1.0, 0.0],\n",
+ " [ 0.0, 1.0, 0.0, -1.0]],\n",
+ " dtype=np.float32), np.array([0., 0., 0., 0.], dtype=np.float32)])\n",
+ "model4.layers[1].set_weights(\n",
+ " [np.array([[ 1.0, -1.0, 0.0, 0.0],\n",
+ " [ 1.0, 1.0, 0.0, 0.0],\n",
+ " [ 0.0, 0.0, 1.0, -1.0],\n",
+ " [ 0.0, 0.0, -1.0, -1.0]],\n",
+ " dtype=np.float32), np.array([0., 0., 0., 0.], dtype=np.float32)])\n",
+ "model4.layers[2].set_weights(\n",
+ " [np.array([[ 1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],\n",
+ " [ 0.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0],\n",
+ " [ 0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0],\n",
+ " [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, -1.0]],\n",
+ " dtype=np.float32), np.array([0., 0., 0., 0., 0., 0., 0., 0.], dtype=np.float32)])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 73,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "model4.layers[3].set_weights(\n",
+ " [np.array([[ 1.0, 0.0, 0.0, 0.0],\n",
+ " [ 1.0, 0.0, 0.0, 0.0],\n",
+ " [ 0.0, 1.0, 0.0, 0.0],\n",
+ " [ 0.0, 1.0, 0.0, 0.0],\n",
+ " [ 0.0, 0.0, 1.0, 0.0],\n",
+ " [ 0.0, 0.0, 1.0, 0.0],\n",
+ " [ 0.0, 0.0, 0.0, 1.0],\n",
+ " [ 0.0, 0.0, 0.0, 1.0]],\n",
+ " dtype=np.float32), np.array([0., 0., 0., 0.], dtype=np.float32)])\n",
+ "\n",
+ "model4.compile(loss='categorical_crossentropy',\n",
+ " optimizer=keras.optimizers.Adagrad(),\n",
+ " metrics=['accuracy'])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 74,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[array([[ 1., 0., 1., 0.],\n",
+ " [ 0., 1., 0., 1.],\n",
+ " [ 1., 0., -1., 0.],\n",
+ " [ 0., 1., 0., -1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n",
+ "[array([[ 1., -1., 0., 0.],\n",
+ " [ 1., 1., 0., 0.],\n",
+ " [ 0., 0., 1., -1.],\n",
+ " [ 0., 0., -1., -1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n",
+ "[array([[ 1., -1., 0., 0., 0., 0., 0., 0.],\n",
+ " [ 0., 0., 1., -1., 0., 0., 0., 0.],\n",
+ " [ 0., 0., 0., 0., 1., -1., 0., 0.],\n",
+ " [ 0., 0., 0., 0., 0., 0., 1., -1.]], dtype=float32), array([0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]\n",
+ "[array([[1., 0., 0., 0.],\n",
+ " [1., 0., 0., 0.],\n",
+ " [0., 1., 0., 0.],\n",
+ " [0., 1., 0., 0.],\n",
+ " [0., 0., 1., 0.],\n",
+ " [0., 0., 1., 0.],\n",
+ " [0., 0., 0., 1.],\n",
+ " [0., 0., 0., 1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n"
+ ]
+ }
+ ],
+ "source": [
+ "for layer in model4.layers:\n",
+ " print(layer.get_weights())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 75,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "array([[0.17831734, 0.17831734, 0.17831734, 0.465048 ]], dtype=float32)"
+ ]
+ },
+ "execution_count": 75,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model4.predict([np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4)])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 76,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Test loss: 0.7656148672103882\n",
+ "Test accuracy: 1.0\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = model4.evaluate(x4_test, y4_test, verbose=0)\n",
+ "\n",
+ "print('Test loss: {}'.format(score[0]))\n",
+ "print('Test accuracy: {}'.format(score[1]))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 77,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Model: \"sequential_25\"\n",
+ "_________________________________________________________________\n",
+ "Layer (type) Output Shape Param # \n",
+ "=================================================================\n",
+ "dense_75 (Dense) (None, 4) 20 \n",
+ "_________________________________________________________________\n",
+ "dense_76 (Dense) (None, 4) 20 \n",
+ "_________________________________________________________________\n",
+ "dense_77 (Dense) (None, 8) 40 \n",
+ "_________________________________________________________________\n",
+ "dense_78 (Dense) (None, 4) 36 \n",
+ "=================================================================\n",
+ "Total params: 116\n",
+ "Trainable params: 116\n",
+ "Non-trainable params: 0\n",
+ "_________________________________________________________________\n"
+ ]
+ }
+ ],
+ "source": [
+ "model5 = Sequential()\n",
+ "model5.add(Dense(4, activation='tanh', input_shape=(4,)))\n",
+ "model5.add(Dense(4, activation='tanh'))\n",
+ "model5.add(Dense(8, activation='relu'))\n",
+ "model5.add(Dense(num_classes, activation='softmax'))\n",
+ "model5.compile(loss='categorical_crossentropy',\n",
+ " optimizer=keras.optimizers.RMSprop(),\n",
+ " metrics=['accuracy'])\n",
+ "model5.summary()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 78,
+ "metadata": {
+ "scrolled": true,
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Epoch 1/8\n",
+ "125/125 [==============================] - 0s 3ms/step - loss: 1.3126 - accuracy: 0.3840 - val_loss: 1.1926 - val_accuracy: 0.6110\n",
+ "Epoch 2/8\n",
+ "125/125 [==============================] - 0s 2ms/step - loss: 1.0978 - accuracy: 0.5980 - val_loss: 1.0085 - val_accuracy: 0.6150\n",
+ "Epoch 3/8\n",
+ "125/125 [==============================] - 0s 2ms/step - loss: 0.9243 - accuracy: 0.7035 - val_loss: 0.8416 - val_accuracy: 0.7380\n",
+ "Epoch 4/8\n",
+ "125/125 [==============================] - 0s 2ms/step - loss: 0.7522 - accuracy: 0.8740 - val_loss: 0.6738 - val_accuracy: 1.0000\n",
+ "Epoch 5/8\n",
+ "125/125 [==============================] - 0s 2ms/step - loss: 0.5811 - accuracy: 1.0000 - val_loss: 0.5030 - val_accuracy: 1.0000\n",
+ "Epoch 6/8\n",
+ "125/125 [==============================] - 0s 2ms/step - loss: 0.4134 - accuracy: 1.0000 - val_loss: 0.3428 - val_accuracy: 1.0000\n",
+ "Epoch 7/8\n",
+ "125/125 [==============================] - 0s 2ms/step - loss: 0.2713 - accuracy: 1.0000 - val_loss: 0.2161 - val_accuracy: 1.0000\n",
+ "Epoch 8/8\n",
+ "125/125 [==============================] - 0s 1ms/step - loss: 0.1621 - accuracy: 1.0000 - val_loss: 0.1225 - val_accuracy: 1.0000\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 78,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model5.fit(x4_train, y4_train, epochs=8, validation_data=(x4_test, y4_test))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 79,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "array([[3.2040708e-02, 1.0065207e-03, 4.9596769e-04, 9.6645677e-01]],\n",
+ " dtype=float32)"
+ ]
+ },
+ "execution_count": 79,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model5.predict([np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4)])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 80,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Test loss: 0.1224619448184967\n",
+ "Test accuracy: 1.0\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = model5.evaluate(x4_test, y4_test, verbose=0)\n",
+ "\n",
+ "print('Test loss: {}'.format(score[0]))\n",
+ "print('Test accuracy: {}'.format(score[1]))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 81,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import contextlib\n",
+ "\n",
+ "@contextlib.contextmanager\n",
+ "def printoptions(*args, **kwargs):\n",
+ " original = np.get_printoptions()\n",
+ " np.set_printoptions(*args, **kwargs)\n",
+ " try:\n",
+ " yield\n",
+ " finally: \n",
+ " np.set_printoptions(**original)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 82,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[array([[ 0.7, 0.2, -0.7, 0.7],\n",
+ " [-0.5, 0.9, 0.6, 0.6],\n",
+ " [ 1.1, 0.2, 0.1, 0.2],\n",
+ " [ 0.7, 0.1, 0.3, -0.7]], dtype=float32), array([ 0. , 0.1, -0.1, -0.2], dtype=float32)]\n",
+ "[array([[ 0.7, 0.5, -1.1, -1.2],\n",
+ " [ 0.7, 0.9, -0.6, 0.3],\n",
+ " [ 0.1, 1.4, -0.6, 0.8],\n",
+ " [ 1.5, 0.1, -0.1, 0.9]], dtype=float32), array([-0.4, 0.2, -0. , 0.2], dtype=float32)]\n",
+ "[array([[-1. , 1. , -0.7, -0.3, 0.2, 1.3, -0.7, 0.9],\n",
+ " [-0.9, 0.5, 0.8, -1.3, -1.2, 1.3, 0.4, -1. ],\n",
+ " [ 0.9, 0.2, 0.3, 0.4, 1.3, -0.9, -0.1, -0.2],\n",
+ " [-0.4, 0.5, 1.1, -0.6, 1.1, 0.1, -1.5, -1. ]], dtype=float32), array([-0.1, 0.1, 0.1, 0.1, 0.2, -0. , 0.1, 0.2], dtype=float32)]\n",
+ "[array([[ 0.7, -0.5, 0.8, -0.5],\n",
+ " [-0.3, -1.6, -0.2, 0.1],\n",
+ " [-1.5, 0.9, 0.1, -0.5],\n",
+ " [ 0.6, 0.7, 1. , -1.4],\n",
+ " [ 0.7, -1.2, -1.6, 1.2],\n",
+ " [ 1. , -1.2, 0.3, -1.5],\n",
+ " [-0.2, 0. , 0.6, 1.3],\n",
+ " [-0.8, 0.2, -0.6, -1. ]], dtype=float32), array([-0.6, 0.5, -0.3, 0.4], dtype=float32)]\n"
+ ]
+ }
+ ],
+ "source": [
+ "with printoptions(precision=1, suppress=True):\n",
+ " for layer in model5.layers:\n",
+ " print(layer.get_weights())"
+ ]
+ }
+ ],
+ "metadata": {
+ "author": "Paweł Skórzewski",
+ "celltoolbar": "Slideshow",
+ "email": "pawel.skorzewski@amu.edu.pl",
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "lang": "pl",
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6"
+ },
+ "livereveal": {
+ "start_slideshow_at": "selected",
+ "theme": "white"
+ },
+ "subtitle": "10.Sieci neuronowe – propagacja wsteczna[wykład]",
+ "title": "Uczenie maszynowe",
+ "vscode": {
+ "interpreter": {
+ "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
+ }
+ },
+ "year": "2021"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/wyk/13_CNN.ipynb b/wyk/13_CNN.ipynb
new file mode 100644
index 0000000..547e7d4
--- /dev/null
+++ b/wyk/13_CNN.ipynb
@@ -0,0 +1,615 @@
+{
+ "cells": [
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# 13. Splotowe sieci neuronowe"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "Konwolucyjne sieci neuronowe wykorzystuje się do:\n",
+ "\n",
+ "* rozpoznawania obrazu\n",
+ "* analizy wideo\n",
+ "* innych zagadnień o podobnej strukturze"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Innymi słowy, CNN przydają się, gdy mamy bardzo dużo danych wejściowych, w których istotne jest ich sąsiedztwo."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### Warstwy konwolucyjne"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Dla uproszczenia przyjmijmy, że mamy dane w postaci jendowymiarowej – np. chcemy stwierdzić, czy na danym nagraniu obecny jest głos człowieka."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Nasze nagranie możemy reprezentować jako ciąg $n$ próbek dźwiękowych:\n",
+ "$$(x_0, x_1, \\ldots, x_n)$$\n",
+ "(możemy traktować je jak jednowymiarowe „piksele”)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Najprostsza metoda – „zwykła” jednowarstwowa sieć neuronowa (każdy z każdym):"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Najprostsza metoda – „zwykła” jednowarstwowa sieć neuronowa (każdy z każdym) nie poradzi sobie zbyt dobrze w tym przypadku:\n",
+ "\n",
+ "* dużo danych wejściowych\n",
+ "* nie wykrywa własności „lokalnych” wejścia"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Chcielibyśmy wykrywać pewne lokalne „wzory” w danych wejściowych.\n",
+ "\n",
+ "W tym celu tworzymy mniejszą sieć neuronową (mniej neuronów wejściowych) i _kopiujemy_ ją tak, żeby każda jej kopia działała na pewnym fragmencie wejścia (fragmenty mogą nachodzić na siebie)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Warstwę sieci A nazywamy **warstwą konwolucyjną** (konwolucja = splot).\n",
+ "\n",
+ "Warstw konwolucyjnych może być więcej niż jedna."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Tak definiujemy formalnie funckję splotu dla 2 wymiarów:\n",
+ "\n",
+ "$$\n",
+ "\\left[\\begin{array}{ccc}\n",
+ "a & b & c\\\\\n",
+ "d & e & f\\\\\n",
+ "g & h & i\\\\\n",
+ "\\end{array}\\right]\n",
+ "*\n",
+ "\\left[\\begin{array}{ccc}\n",
+ "1 & 2 & 3\\\\\n",
+ "4 & 5 & 6\\\\\n",
+ "7 & 8 & 9\\\\\n",
+ "\\end{array}\\right] \n",
+ "=\\\\\n",
+ "(1 \\cdot a)+(2 \\cdot b)+(3 \\cdot c)+(4 \\cdot d)+(5 \\cdot e)\\\\+(6 \\cdot f)+(7 \\cdot g)+(8 \\cdot h)+(9 \\cdot i)\n",
+ "$$\n",
+ "\n",
+ "Więcej: https://en.wikipedia.org/wiki/Kernel_(image_processing)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Jednostka warstwy konwolucyjnej może się składać z jednej lub kilku warstw neuronów."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Jeden neuron może odpowiadać np. za wykrywanie pionowych krawędzi, drugi poziomych, a jeszcze inny np. krzyżujących się linii."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### _Pooling_"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "Obrazy składają się na ogół z milionów pikseli. Oznacza to, że nawet po zastosowaniu kilku warstw konwolucyjnych mielibyśmy sporo parametrów do wytrenowania.\n",
+ "\n",
+ "Żeby zredukować liczbę parametrów, a dzięki temu uprościć obliczenia, stosuje się warstwy ***pooling***.\n",
+ "\n",
+ "*Pooling* to rodzaj próbkowania. Najpopularniejszą jego odmianą jest *max-pooling*, czyli wybieranie najwyższej wartości spośród kilku sąsiadujących pikseli (rys. 12.1)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "\n",
+ "\n",
+ "Rys. 12.1. - źródło: [Aphex34](https://commons.wikimedia.org/wiki/File:Max_pooling.png), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Warstwy _pooling_ i konwolucyjne można przeplatać ze sobą (rys. 12.2)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ "\n",
+ "\n",
+ "Rys. 12.2. - źródło: [Aphex34](https://commons.wikimedia.org/wiki/File:Typical_cnn.png), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "_Pooling_ – idea: nie jest istotne, w którym *dokładnie* miejscu na obrazku dana cecha (krawędź, oko, itp.) się znajduje, wystarczy przybliżona lokalizacja."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Do sieci konwolucujnych możemy dokładać też warstwy ReLU."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "https://www.youtube.com/watch?v=FmpDIaiMIeA"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "source": [
+ "Zobacz też: https://colah.github.io/posts/2014-07-Conv-Nets-Modular/"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### Przykład: MNIST"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "%matplotlib inline\n",
+ "\n",
+ "import math\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "import random\n",
+ "\n",
+ "from IPython.display import YouTubeVideo"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import keras\n",
+ "from keras.datasets import mnist\n",
+ "\n",
+ "from keras.models import Sequential\n",
+ "from keras.layers import Dense, Dropout, Flatten\n",
+ "from keras.layers import Conv2D, MaxPooling2D\n",
+ "\n",
+ "# załaduj dane i podziel je na zbiory uczący i testowy\n",
+ "(x_train, y_train), (x_test, y_test) = mnist.load_data()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "def draw_examples(examples, captions=None):\n",
+ " plt.figure(figsize=(16, 4))\n",
+ " m = len(examples)\n",
+ " for i, example in enumerate(examples):\n",
+ " plt.subplot(100 + m * 10 + i + 1)\n",
+ " plt.imshow(example, cmap=plt.get_cmap('gray'))\n",
+ " plt.show()\n",
+ " if captions is not None:\n",
+ " print(6 * ' ' + (10 * ' ').join(str(captions[i]) for i in range(m)))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAA6IAAACPCAYAAADgImbyAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAHEtJREFUeJzt3XmQVNXZx/HniIAYREQIISKCggiRTUDB1wITwBVZJKKEPUYoUYSUUKASgzEIolLFIlEkMIKUaIVVI0EElKiEAgnmZXXAyJYRUEE2Iy963z/o5T5Hpqd7uvvc2z3fT9UU9ze3u+/pnme659D93GM8zxMAAAAAAFw5J+gBAAAAAADKFiaiAAAAAACnmIgCAAAAAJxiIgoAAAAAcIqJKAAAAADAKSaiAAAAAACnmIgCAAAAAJxiIgoAAAAAcCqtiagx5hZjzA5jzE5jzOhMDQq5h1pAFLUAEeoAcdQCRKgDxFELiPE8r1RfIlJORHaJyOUiUkFEPhaRxiVcx+Mr574OZboWQnCf+MpCHVALZeYr488J1ELOfvH6wFdW6oBayNkvaoGvpGvB87y03hG9VkR2ep73qed5p0Rkvoh0TeP2EE67k7gMtZD/kqkDEWqhLOA5AVHUAkSoA8RRC4hK6u/GdCail4jIXl/eF/meYowZZIzZYIzZkMaxEG4l1gJ1UGZQCxDh9QFxPCdAhOcExFELiDk32wfwPG+GiMwQETHGeNk+HsKJOkAUtYAoagEi1AHiqAVEUQtlQzrviO4XkUt9uXbkeyh7qAVEUQsQoQ4QRy1AhDpAHLWAmHQmoutFpIExpp4xpoKI3CMiSzMzLOQYagFR1AJEqAPEUQsQoQ4QRy0gptQfzfU877Qx5kERWS5nzoA1y/O8LRkbGXIGtYAoagEi1AHiqAWIUAeIoxbgZyKnRXZzMD7jnYs+8jyvVSZvkDrISRmvAxFqIUdRC4ji9QEiPCcgjlpAVFK1kM5HcwEAAAAASBkTUQAAAACAU0xEAQAAAABOMREFAAAAADjFRBQAAAAA4BQTUQAAAACAU0xEAQAAAABOMREFAAAAADh1btADAPJVy5YtVX7wwQdV7tevn8pz5sxReerUqSpv3Lgxg6MDAABANk2ePFnlhx56KLa9efNmta9z584q7969O3sDCwneEQUAAAAAOMVEFAAAAADgFB/NTVK5cuVUvvDCC5O+rv2RzPPPP1/lhg0bqvzAAw+o/Oyzz6rcq1cvlf/73/+qPGHChNj2E088kfQ4kZ7mzZurvGLFCpWrVKmisud5Kvft21flLl26qHzxxRenO0TkiQ4dOqg8b948ldu3b6/yjh07sj4mZMeYMWNUtp/TzzlH/3/yjTfeqPJ7772XlXEByIwLLrhA5cqVK6t8++23q1yjRg2VJ02apPK3336bwdEhVXXr1lW5T58+Kn///fex7UaNGql9V111lcp8NBcAAAAAgAxjIgoAAAAAcIqJKAAAAADAqTLTI1qnTh2VK1SooPL111+v8g033KBy1apVVe7Ro0fGxrZv3z6Vp0yZonL37t1VPnbsmMoff/yxyvQEuXPttdfGthcsWKD22X3Edk+o/XM8deqUynZPaJs2bVS2l3Oxr18WtGvXLrZtP16LFi1yPRxnWrdurfL69esDGgkybcCAASqPGjVKZX9/0dnYzzMAgufvG7R/p9u2bavy1VdfndJt16pVS2X/8iBw79ChQyqvWbNGZfv8H2Ud74gCAAAAAJxiIgoAAAAAcIqJKAAAAADAqbztEbXXdFy1apXKqawDmml2j4+9Ttzx48dVttcILCoqUvnw4cMqs2Zg5thrvl5zzTUqv/LKK7Ftu0+jJIWFhSpPnDhR5fnz56v8wQcfqGzXzfjx41M6fj7wr5nYoEEDtS+fekTttSLr1aun8mWXXaayMSbrY0J22D/L8847L6CRIFXXXXedyv71A+21fX/2s58lvK0RI0ao/J///Edl+zwW/tciEZF169YlHiwyyl7/cfjw4Sr37t07tl2pUiW1z36+3rt3r8r2+STstSd79uyp8vTp01Xevn17ccNGFpw4cULlsrAWaDp4RxQAAAAA4BQTUQAAAACAU0xEAQAAAABO5W2P6J49e1T+8ssvVc5kj6jdi3HkyBGVf/7zn6tsr/c4d+7cjI0FmfXiiy+q3KtXr4zdtt1vWrlyZZXt9WD9/ZAiIk2bNs3YWHJVv379Yttr164NcCTZZfcf33fffSrb/WH0BOWOjh07qjx06NCEl7d/tp07d1b5wIEDmRkYSnT33XerPHnyZJWrV68e27b7AN99912Va9SoofIzzzyT8Nj27dnXv+eeexJeH6mx/2Z8+umnVbZr4YILLkj6tu3zRdx8880qly9fXmX7OcBfZ2fLcKtq1aoqN2vWLKCR5AbeEQUAAAAAOMVEFAAAAADgFBNRAAAAAIBTedsj+tVXX6k8cuRIle2+mn/+858qT5kyJeHtb9q0KbbdqVMntc9eQ8heL2zYsGEJbxvBadmypcq33367yonWZ7R7Ot944w2Vn332WZXtdeHsGrTXh/3FL36R9FjKCnt9zXw1c+bMhPvtHiOEl73+4+zZs1Uu6fwFdu8ga9Rlz7nn6j+RWrVqpfJLL72ksr3u9Jo1a2LbTz75pNr3/vvvq1yxYkWVX3/9dZVvuummhGPdsGFDwv1IT/fu3VX+zW9+U+rb2rVrl8r235D2OqL169cv9bHgnv08UKdOnaSv27p1a5XtfuB8fL4vG3/FAQAAAABCg4koAAAAAMCpEieixphZxpiDxpjNvu9VM8asMMYURv69KLvDRBhQC4iiFiBCHSCOWkAUtQAR6gDJSaZHtEBEponIHN/3RovISs/zJhhjRkfyqMwPL3MWL16s8qpVq1Q+duyYyva6P/fee6/K/n4/uyfUtmXLFpUHDRqUeLDhVSB5UAt+zZs3V3nFihUqV6lSRWXP81RetmxZbNteY7R9+/YqjxkzRmW77+/QoUMqf/zxxyp///33Ktv9q/a6pBs3bpQsKpAAasFeO7VmzZqZvPnQKqlv0K5bhwokz54Tsq1///4q//SnP014eXu9yTlz5pz9gsErkDyrhT59+qhcUq+2/XvoX1vy6NGjCa9rr0NZUk/ovn37VH755ZcTXt6xAsmzWrjrrrtSuvxnn32m8vr162Pbo0bpu233hNoaNWqU0rFDpEDyrA6SYZ//o6CgQOWxY8cWe11735EjR1SeNm1aOkMLpRLfEfU8b42IfGV9u6uIRJ/1XhaRbhkeF0KIWkAUtQAR6gBx1AKiqAWIUAdITmnPmlvT87yiyPbnIlLs2xLGmEEikrNvAaJESdUCdVAmUAsQ4fUBcTwnIIpagAivD7CkvXyL53meMcZLsH+GiMwQEUl0OeS+RLVAHZQt1AJEeH1AHM8JiKIWIMLrA84o7UT0gDGmlud5RcaYWiJyMJODcqGkfo2vv/464f777rsvtv3aa6+pfXYvX57LqVq48sorVbbXl7V78b744guVi4qKVPb35Rw/flzt++tf/5owp6tSpUoqP/zwwyr37t07o8dLQtZr4bbbblPZfgzyhd37Wq9evYSX379/fzaHk6qcek7IturVq6v861//WmX79cLuCfrjH/+YnYG5kVO1YK/1+eijj6psnyNg+vTpKtvnASjp7wy/xx57LOnLiog89NBDKtvnGAihnKoFm/9vPpEfnuvj7bffVnnnzp0qHzxY+rubZ+dCyOk6KA37eSVRj2hZVNrlW5aKSPSMC/1FZElmhoMcRC0gilqACHWAOGoBUdQCRKgDWJJZvuVVEVkrIg2NMfuMMfeKyAQR6WSMKRSRjpGMPEctIIpagAh1gDhqAVHUAkSoAySnxI/mep7Xq5hdHTI8FoQctYAoagEi1AHiqAVEUQsQoQ6QnLRPVpSv7M9wt2zZUmX/GpEdO3ZU++xeAQSnYsWKKvvXfxX5Yc+hvZ5sv379VN6wYYPKYepRrFOnTtBDyLqGDRsWu89erzeX2XVq9wh98sknKtt1i2DVrVs3tr1gwYKUrjt16lSVV69enYkh4Swef/xxle2e0FOnTqm8fPlyle31IL/55ptij3XeeeepbK8Taj9/G2NUtnuFlyzhE40u2WtDuuzza9u2rbNjIfvOOSf+YdQydk6ZsyptjygAAAAAAKXCRBQAAAAA4BQTUQAAAACAU/SIFuPEiRMq22tIbdy4Mbb90ksvqX12T4/dV/j888+rbK9Nhsxp0aKFynZPqK1r164qv/feexkfE7Jj/fr1QQ+hWFWqVFH5lltuUblPnz4q2/1jNntdMnvtSQTL//Nt2rRpwsuuXLlS5cmTJ2dlTBCpWrWqykOGDFHZfi22e0K7deuW0vHq168f2543b57aZ593wvaXv/xF5YkTJ6Z0bISLf93XH/3oRyldt0mTJgn3f/jhhyqvXbs2pduHW/6+UP7+5x1RAAAAAIBjTEQBAAAAAE7x0dwk7dq1S+UBAwbEtmfPnq329e3bN2G2P5YxZ84clYuKiko7TFgmTZqksn1KfPujt2H+KK7/lN8inPbbVq1atbSu36xZM5XtWrGXaapdu7bKFSpUiG337t1b7bN/dvYyD+vWrVP522+/Vfncc/VT9UcffSQID/sjmxMmFL9G+/vvv69y//79Vf76668zNzAo/t9REZHq1asnvLz/45QiIj/+8Y9VHjhwoMpdunRR+eqrr45tV65cWe2zP5Jn51deeUVlu10IwTr//PNVbty4scq///3vVU7UFpTqa7u9lIxdh999913C6wNhwjuiAAAAAACnmIgCAAAAAJxiIgoAAAAAcIoe0VJatGhRbLuwsFDts/sSO3TooPJTTz2l8mWXXabyuHHjVN6/f3+px1nWdO7cWeXmzZurbPfhLF26NOtjyhS7b8S+L5s2bXI5nEDYvZX+x+CFF15Q+x599NGUbtteZsPuET19+rTKJ0+eVHnr1q2x7VmzZql99hJOdi/ygQMHVN63b5/KlSpVUnn79u2C4NStW1flBQsWJH3dTz/9VGX7Z4/sOXXqlMqHDh1SuUaNGir/+9//VjnVpRb8vXxHjx5V+2rVqqXyF198ofIbb7yR0rGQWeXLl1fZXgrO/p23f572a5W/FuzlVezlvOz+U5t9zoA777xTZXsJKLvugTDhHVEAAAAAgFNMRAEAAAAATjERBQAAAAA4RY9oBmzevFnlnj17qnzHHXeobK87OnjwYJUbNGigcqdOndIdYplh99LZ68YdPHhQ5ddeey3rY0pWxYoVVR47dmzCy69atUrlRx55JNNDCp0hQ4aovHv37tj29ddfn9Zt79mzR+XFixervG3bNpX/8Y9/pHU8v0GDBqls96rZfYUI1qhRo1ROZU3fRGuMIruOHDmisr3+65tvvqmyvTaxvZ74kiVLVC4oKFD5q6++im3Pnz9f7bN7Cu39cMv+W8Hu21y4cGHC6z/xxBMq26/PH3zwQWzbriv7sv71Z8/Gfn0YP368yiW9ltnrVMMt/7qxJb12tGvXTuVp06ZlZUxB4h1RAAAAAIBTTEQBAAAAAE4xEQUAAAAAOEWPaBbYfShz585VeebMmSrba0LZnwm/8cYbVX733XfTG2AZZvdGFBUVBTSSH/aEjhkzRuWRI0eqbK8t+dxzz6l8/PjxDI4uNzz99NNBDyEj7LWGbamsU4nMs9cjvummm5K+rt1HuGPHjoyMCelbt26dynbvXbr8r+Xt27dX++zeMPrA3bLXCbV7PO3XX9uyZctUnjp1qsr234H+2nrrrbfUviZNmqhsr/s5ceJEle0e0q5du6o8b948ld955x2V7dfNw4cPS3HKwvrkrvl/90tam9heI7Zx48Yq+9cvz1W8IwoAAAAAcIqJKAAAAADAKSaiAAAAAACn6BHNgKZNm6r8y1/+UuXWrVurbPeE2uzPfK9ZsyaN0cFv6dKlgR3b7jOze1Duvvtule3esh49emRnYAi9RYsWBT2EMu3tt99W+aKLLkp4ef8aswMGDMjGkJAD/Ota2z2hdm8Y64hmV7ly5VR+8sknVR4xYoTKJ06cUHn06NEq2z8vuye0VatWKvvXf2zRooXaV1hYqPL999+v8urVq1WuUqWKyvYa2r1791a5S5cuKq9YsUKKs3fvXpXr1atX7GVROi+88EJse/DgwSld115zfPjw4RkZU5B4RxQAAAAA4BQTUQAAAACAU0xEAQAAAABO0SOapIYNG6r84IMPxrbtdX5+8pOfpHTb3333ncr22pZ2bwmKZ4xJmLt166bysGHDsjaW3/72tyr/7ne/U/nCCy9U2V77q1+/ftkZGICUXHzxxSqX9Jw8ffr02HZZXN8XZyxfvjzoISDC7q2ze0JPnjypst27Z/eJt2nTRuWBAweqfOutt6rs7xf+wx/+oPbNnj1bZbtP03b06FGV//a3vyXMvXr1UvlXv/pVsbdt/92CzNu+fXvQQwgV3hEFAAAAADhV4kTUGHOpMWa1MWarMWaLMWZY5PvVjDErjDGFkX8Tn0YQOY9agAh1gDhqAVHUAkSoA8RRC0hGMu+InhaRhz3PaywibUTkAWNMYxEZLSIrPc9rICIrIxn5jVqACHWAOGoBUdQCRKgDxFELKFGJPaKe5xWJSFFk+5gxZpuIXCIiXUXkxsjFXhaRd0VkVFZG6YDd12l/pt7fEyoiUrdu3VIfa8OGDSqPGzdO5SDXukwkF2rBXpvNzvbPecqUKSrPmjVL5S+//FJluy+kb9++se1mzZqpfbVr11Z5z549Ktv9Q/6+sjDLhTrIdXZv85VXXqmyf53KIOVrLdg9W+eck1oXy4cffpjJ4eSEfK2FdNx8881BD8G5sNbB448/nnC/vc6ovc732LFjVa5fv35Kx/dff/z48WqffZ6QTHv11VcT5mwJay0EberUqbHtoUOHqn1XXHFFwuva5zXx35aIyK5du9IcnXspvboaY+qKSAsRWSciNSNFJiLyuYjUzOjIEGrUAkSoA8RRC4iiFiBCHSCOWkBxkj5rrjGmsogsEJHhnucd9f+Pved5njHGK+Z6g0Rk0Nn2ITeVphaog/zDcwKiqAVE8foAEZ4TEEctIJGk3hE1xpSXM0U0z/O8hZFvHzDG1IrsryUiB892Xc/zZnie18rzvFaZGDCCVdpaoA7yC88JiKIWEMXrA0R4TkActYCSlPiOqDnzXxd/FpFtnudN8u1aKiL9RWRC5N8lWRlhhtSsqd/5b9y4scrTpk1T+aqrrir1sdatW6fyM888o/KSJfqhypV1QvOhFuw+kCFDhqjco0cPle31uho0aJD0sew+sdWrV6tcUs9KWOVDHYSd3ducao+iK/lSC82bN1e5Y8eOKtvP0adOnVL5+eefV/nAgQMZHF1uyJdayKTLL7886CE4F9Y6+Pzzz1WuUaOGyhUrVlTZPueD7a233lJ5zZo1Ki9evFjlzz77LLad7Z7QsAhrLYTJli1bVC7pOSNX5gupSOajuf8jIn1F5H+NMZsi33tUzhTQ68aYe0Vkt4j0zM4QESLUAkSoA8RRC4iiFiBCHSCOWkCJkjlr7vsiYorZ3SGzw0GYUQsQoQ4QRy0gilqACHWAOGoByQjn570AAAAAAHkr6bPmhl21atVUfvHFF1W2e4DS7d3w9/8999xzap+9PuQ333yT1rGQvLVr16q8fv16lVu3bp3w+vY6o3Zvsc2/zuj8+fPVPnu9J6C02rZtq3JBQUEwA8lTVatWVdl+HrDt379f5REjRmR8TMh9f//732Pbdp93PvZ6hVm7du1U7tatm8rXXHONygcP6vPn2GuMHz58WGW7bxxIxowZM1S+4447AhpJcHhHFAAAAADgFBNRAAAAAIBTTEQBAAAAAE7lVI/oddddF9seOXKk2nfttdeqfMkll6R1rJMnT6o8ZcoUlZ966qnY9okTJ9I6FjJn3759Kt95550qDx48WOUxY8akdPuTJ09W+U9/+lNse+fOnSndFlCcM8uvAchlmzdvjm0XFhaqffZ5Kq644gqVDx06lL2BlUHHjh1Tee7cuQkz4MLWrVtV3rZtm8qNGjVyOZxA8I4oAAAAAMApJqIAAAAAAKdy6qO53bt3P+t2Muy3v998802VT58+rbK9JMuRI0dSOh7CoaioSOWxY8cmzEAQli1bpvJdd90V0EjKpu3bt6vsX55LROSGG25wORzkIX87j4jIzJkzVR43bpzKQ4cOVdn+GwZA7tu9e7fKTZo0CWgkweEdUQAAAACAU0xEAQAAAABOMREFAAAAADhlPM9zdzBj3B0MmfKR53mtMnmD1EFOyngdiFALOYpaQBSvD0mqUqWKyq+//rrKHTt2VHnhwoUqDxw4UOWQLRvHcwKiqAVEJVULvCMKAAAAAHCKiSgAAAAAwCkmogAAAAAAp3JqHVEAAIBcc/ToUZV79uypsr2O6P3336+yveY164oCyAe8IwoAAAAAcIqJKAAAAADAKSaiAAAAAACn6BEFAABwyO4ZHTp0aMIMAPmId0QBAAAAAE4xEQUAAAAAOMVEFAAAAADglOse0S9EZLeIVI9shxFj0y7Lwm1SB+nJlzoQoRbSRS24xdg0Xh/CJ1/qQIRaSBe14FZYxxbUuJKqBeN5XrYH8sODGrPB87xWzg+cBMbmTpjvD2NzK8z3ibG5Feb7xNjcCfP9YWxuhfk+MTa3wnyfwjq2sI4rio/mAgAAAACcYiIKAAAAAHAqqInojICOmwzG5k6Y7w9jcyvM94mxuRXm+8TY3Anz/WFsboX5PjE2t8J8n8I6trCOS0QC6hEFAAAAAJRdfDQXAAAAAOAUE1EAAAAAgFNOJ6LGmFuMMTuMMTuNMaNdHvssY5lljDlojNns+141Y8wKY0xh5N+LAhrbpcaY1caYrcaYLcaYYWEaXyZQC0mPjVpwO5ZQ1gJ14HwsoayDyDioBbdjoRYCRC0kNS7qwO1YQlkHkXHkXC04m4gaY8qJyPMicquINBaRXsaYxq6OfxYFInKL9b3RIrLS87wGIrIykoNwWkQe9jyvsYi0EZEHIo9VWMaXFmohJdSCWwUSzlqgDtwqkHDWgQi14FqBUAuBoBaSRh24VSDhrAORXKwFz/OcfIlIWxFZ7suPiMgjro5fzJjqishmX94hIrUi27VEZEeQ4/ONa4mIdArr+KgFaoFaoA6oA2qBWgj8saMWqAXqgDrIqVpw+dHcS0Rkry/vi3wvTGp6nlcU2f5cRGoGORgREWNMXRFpISLrJITjKyVqoRSohcCE6rGmDgITuseaWghM6B5raiEwoXqsqYPAhO6xzpVa4GRFxfDO/LdBoGvbGGMqi8gCERnued5R/74wjK+sCMNjTS2EQ9CPNXUQDmF4rKmFcAjDY00thEPQjzV1EA5heKxzqRZcTkT3i8ilvlw78r0wOWCMqSUiEvn3YFADMcaUlzNFNM/zvIVhG1+aqIUUUAuBC8VjTR0ELjSPNbUQuNA81tRC4ELxWFMHgQvNY51rteByIrpeRBoYY+oZYyqIyD0istTh8ZOxVET6R7b7y5nPVjtnjDEi8mcR2eZ53iTfrlCMLwOohSRRC6EQ+GNNHYRCKB5raiEUQvFYUwuhEPhjTR2EQige65ysBcdNs7eJyCcisktEHguyOVZEXhWRIhH5PznzefN7ReRiOXM2qUIReUdEqgU0thvkzNvm/xKRTZGv28IyPmqBWqAWqAPqgOcEaoFaoBaCf6ypA+ogl2vBRAYOAAAAAIATnKwIAAAAAOAUE1EAAAAAgFNMRAEAAAAATjERBQAAAAA4xUQUAAAAAOAUE1EAAAAAgFNMRAEAAAAATv0/VRGGEPckXi4AAAAASUVORK5CYII=",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " 5 0 4 1 9 2 1\n"
+ ]
+ }
+ ],
+ "source": [
+ "draw_examples(x_train[:7], captions=y_train)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "batch_size = 128\n",
+ "num_classes = 10\n",
+ "epochs = 12\n",
+ "\n",
+ "# input image dimensions\n",
+ "img_rows, img_cols = 28, 28"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "if keras.backend.image_data_format() == 'channels_first':\n",
+ " x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)\n",
+ " x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)\n",
+ " input_shape = (1, img_rows, img_cols)\n",
+ "else:\n",
+ " x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n",
+ " x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n",
+ " input_shape = (img_rows, img_cols, 1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "x_train shape: (60000, 28, 28, 1)\n",
+ "60000 train samples\n",
+ "10000 test samples\n"
+ ]
+ }
+ ],
+ "source": [
+ "x_train = x_train.astype('float32')\n",
+ "x_test = x_test.astype('float32')\n",
+ "x_train /= 255\n",
+ "x_test /= 255\n",
+ "print('x_train shape: {}'.format(x_train.shape))\n",
+ "print('{} train samples'.format(x_train.shape[0]))\n",
+ "print('{} test samples'.format(x_test.shape[0]))\n",
+ "\n",
+ "# convert class vectors to binary class matrices\n",
+ "y_train = keras.utils.to_categorical(y_train, num_classes)\n",
+ "y_test = keras.utils.to_categorical(y_test, num_classes)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "model = Sequential()\n",
+ "model.add(Conv2D(32, kernel_size=(3, 3),\n",
+ " activation='relu',\n",
+ " input_shape=input_shape))\n",
+ "model.add(Conv2D(64, (3, 3), activation='relu'))\n",
+ "model.add(MaxPooling2D(pool_size=(2, 2)))\n",
+ "model.add(Dropout(0.25))\n",
+ "model.add(Flatten())\n",
+ "model.add(Dense(128, activation='relu'))\n",
+ "model.add(Dropout(0.5))\n",
+ "model.add(Dense(num_classes, activation='softmax'))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "model.compile(loss=keras.losses.categorical_crossentropy,\n",
+ " optimizer=keras.optimizers.Adadelta(),\n",
+ " metrics=['accuracy'])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Train on 60000 samples, validate on 10000 samples\n",
+ "Epoch 1/12\n",
+ "60000/60000 [==============================] - 333s - loss: 0.3256 - acc: 0.9037 - val_loss: 0.0721 - val_acc: 0.9780\n",
+ "Epoch 2/12\n",
+ "60000/60000 [==============================] - 342s - loss: 0.1088 - acc: 0.9683 - val_loss: 0.0501 - val_acc: 0.9835\n",
+ "Epoch 3/12\n",
+ "60000/60000 [==============================] - 366s - loss: 0.0837 - acc: 0.9748 - val_loss: 0.0429 - val_acc: 0.9860\n",
+ "Epoch 4/12\n",
+ "60000/60000 [==============================] - 311s - loss: 0.0694 - acc: 0.9788 - val_loss: 0.0380 - val_acc: 0.9878\n",
+ "Epoch 5/12\n",
+ "60000/60000 [==============================] - 325s - loss: 0.0626 - acc: 0.9815 - val_loss: 0.0334 - val_acc: 0.9886\n",
+ "Epoch 6/12\n",
+ "60000/60000 [==============================] - 262s - loss: 0.0552 - acc: 0.9835 - val_loss: 0.0331 - val_acc: 0.9890\n",
+ "Epoch 7/12\n",
+ "60000/60000 [==============================] - 218s - loss: 0.0494 - acc: 0.9852 - val_loss: 0.0291 - val_acc: 0.9903\n",
+ "Epoch 8/12\n",
+ "60000/60000 [==============================] - 218s - loss: 0.0461 - acc: 0.9859 - val_loss: 0.0294 - val_acc: 0.9902\n",
+ "Epoch 9/12\n",
+ "60000/60000 [==============================] - 219s - loss: 0.0423 - acc: 0.9869 - val_loss: 0.0287 - val_acc: 0.9907\n",
+ "Epoch 10/12\n",
+ "60000/60000 [==============================] - 218s - loss: 0.0418 - acc: 0.9875 - val_loss: 0.0299 - val_acc: 0.9906\n",
+ "Epoch 11/12\n",
+ "60000/60000 [==============================] - 218s - loss: 0.0388 - acc: 0.9879 - val_loss: 0.0304 - val_acc: 0.9905\n",
+ "Epoch 12/12\n",
+ "60000/60000 [==============================] - 218s - loss: 0.0366 - acc: 0.9889 - val_loss: 0.0275 - val_acc: 0.9910\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 32,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model.fit(x_train, y_train,\n",
+ " batch_size=batch_size,\n",
+ " epochs=epochs,\n",
+ " verbose=1,\n",
+ " validation_data=(x_test, y_test))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 33,
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "('Test loss:', 0.027530849870144449)\n",
+ "('Test accuracy:', 0.99099999999999999)\n"
+ ]
+ }
+ ],
+ "source": [
+ "score = model.evaluate(x_test, y_test, verbose=0)\n",
+ "print('Test loss:', score[0])\n",
+ "print('Test accuracy:', score[1])"
+ ]
+ }
+ ],
+ "metadata": {
+ "author": "Paweł Skórzewski",
+ "celltoolbar": "Slideshow",
+ "email": "pawel.skorzewski@amu.edu.pl",
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "lang": "pl",
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]"
+ },
+ "livereveal": {
+ "start_slideshow_at": "selected",
+ "theme": "white"
+ },
+ "subtitle": "12.Splotowe sieci neuronowe[wykład]",
+ "title": "Uczenie maszynowe",
+ "vscode": {
+ "interpreter": {
+ "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
+ }
+ },
+ "year": "2021"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/wyk/14_RNN.ipynb b/wyk/14_RNN.ipynb
new file mode 100644
index 0000000..af3fab9
--- /dev/null
+++ b/wyk/14_RNN.ipynb
@@ -0,0 +1,516 @@
+{
+ "cells": [
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# 14. Rekurencyjne sieci neuronowe"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 14.1. Rekurencyjne sieci neuronowe"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## RNN – _Recurrent Neural Network_\n",
+ "\n",
+ "## LSTM – _Long Short Term Memory_"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "https://www.youtube.com/watch?v=WCUNPb-5EYI"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Rekurencyjna sieć neuronowa – schemat\n",
+ "\n",
+ "Rys. 11.1.\n",
+ "\n",
+ "\n",
+ "\n",
+ "Rys. 11.1 - źródło: [fdeloche](https://commons.wikimedia.org/wiki/File:Recurrent_neural_network_unfold.svg), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### LSTM – schemat\n",
+ "\n",
+ "Rys. 11.2.\n",
+ "\n",
+ "\n",
+ "\n",
+ "Rys. 11.2 - źródło: [fdeloche](https://commons.wikimedia.org/wiki/File:Long_Short-Term_Memory.svg), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Rekurencyjne sieci neuronowe znajduja zastosowanie w przetwarzaniu sekwencji, np. szeregów czasowych i tekstów.\n",
+ "* LSTM są rozwinięciem RNN, umożliwiają „zapamiętywanie” i „zapominanie”."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Co potrafią generować rekurencyjne sieci neuronowe?\n",
+ "\n",
+ "http://karpathy.github.io/2015/05/21/rnn-effectiveness/"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Przewidywanie ciągów czasowych za pomocą LSTM – przykład\n",
+ "\n",
+ "https://machinelearningmastery.com/time-series-forecasting-long-short-term-memory-network-python/"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## GRU – _Gated Recurrent Unit_\n",
+ "\n",
+ "* Rodzaj rekurencyjnej sieci neuronowej wprwadzony w 2014 roku\n",
+ "* Ma prostszą budowę niż LSTM (2 bramki zamiast 3).\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### GRU – schemat\n",
+ "\n",
+ "Rys. 11.3\n",
+ "\n",
+ "\n",
+ "\n",
+ "Rys. 11.3 - źródło: [fdeloche](https://commons.wikimedia.org/wiki/File:Gated_Recurrent_Unit.svg), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### GRU vs LSTM\n",
+ "* LSTM – 3 bramki: wejścia (*input*), wyjścia (*output*) i zapomnienia (*forget*); GRU – 2 bramki: resetu (*reset*) i aktualizacji (*update*). Bramka resetu pełni podwójną funkcję: zastępuje bramki wyjścia i zapomnienia.\n",
+ "* GRU i LSTM mają podobną skuteczność, ale GRU dzięki prostszej budowie bywa bardziej wydajna.\n",
+ "* LSTM sprawdza się lepiej w przetwarzaniu tekstu, ponieważ lepiej zapamiętuje zależności długosystansowe."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## 14.2. Autoencoder"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Uczenie nienadzorowane\n",
+ "* Dane: zbiór nieanotowanych przykładów uczących $\\{ x^{(1)}, x^{(2)}, x^{(3)}, \\ldots \\}$, $x^{(i)} \\in \\mathbb{R}^{n}$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Autoencoder (encoder-decoder)\n",
+ "\n",
+ "Sieć neuronowa taka, że:\n",
+ "* warstwa wejściowa ma $n$ neuronów\n",
+ "* warstwa wyjściowa ma $n$ neuronów\n",
+ "* warstwa środkowa ma $k < n$ neuronów\n",
+ "* $y^{(i)} = x^{(i)}$ dla każdego $i$\n",
+ "\n",
+ "(rys. 13.1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "\n",
+ "\n",
+ "Rys. 13.1 - źródło: [Michela Massi](https://commons.wikimedia.org/wiki/File:Autoencoder_schema.png), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Co otrzymujemy dzięki takiej sieci?\n",
+ "\n",
+ "* $y^{(i)} = x^{(i)} \\; \\Longrightarrow \\;$ Autoencoder próbuje nauczyć się funkcji $h(x) \\approx x$, czyli funkcji identycznościowej.\n",
+ "* Warstwy środkowe mają mniej neuronów niż warstwy zewnętrzne, więc żeby to osiągnąć, sieć musi znaleźć bardziej kompaktową (tu: $k$-wymiarową) reprezentację informacji zawartej w wektorach $x_{(i)}$.\n",
+ "* Otrzymujemy metodę kompresji danych."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Innymi słowy:\n",
+ "* Ograniczenia nałożone na reprezentację danych w warstwie ukrytej pozwala na „odkrycie” pewnej **struktury** w danych.\n",
+ "* _Decoder_ musi odtworzyć do pierwotnej postaci reprezentację danych skompresowaną przez _encoder_.\n",
+ "\n",
+ "(rys. 13.2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "\n",
+ "\n",
+ "Rys. 13.2 - źródło: [Chervinskii](https://commons.wikimedia.org/wiki/File:Autoencoder_structure.png), [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0), Wikimedia Commons"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Całkowita liczba warstw w sieci autoencodera może być większa niż 3.\n",
+ "* Jako funkcji kosztu na ogół używa się błędu średniokwadratowego (*mean squared error*, MSE) lub entropii krzyżowej (*binary crossentropy*).\n",
+ "* Autoencoder może wykryć ciekawe struktury w danych nawet jeżeli $k \\geq n$, jeżeli na sieć nałoży się inne ograniczenia.\n",
+ "* W wyniku działania autoencodera uzyskujemy na ogół kompresję **stratną**."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Autoencoder a PCA\n",
+ "\n",
+ "Widzimy, że autoencoder można wykorzystać do redukcji liczby wymiarów. Podobną rolę pełni poznany na jednym z poprzednich wykładów algorytm PCA (analiza głównych składowych, *principal component analysis*).\n",
+ "\n",
+ "Faktycznie, jeżeli zastosujemy autoencoder z liniowymi funkcjami aktywacji i pojedynczą sigmoidalną warstwą ukrytą, to na podstawie uzyskanych wag można odtworzyć główne składowe używając rozkładu według wartości osobliwych (*singular value decomposition*, SVD)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Autoencoder – zastosowania\n",
+ "\n",
+ "Autoencoder sprawdza się gorzej niż inne algorytmy kompresji, więc nie stosuje się go raczej jako metody kompresji danych, ale ma inne zastosowania:\n",
+ "* odszumianie danych (jeżeli na wejściu zamiast „czystych” danych użyjemy danych zaszumionych, to otrzymamy sieć, która może usuwać szum z danych)\n",
+ "* redukcja wymiarowości\n",
+ "* VAE (*variational autoencoders*) – http://kvfrans.com/variational-autoencoders-explained/"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## 14.3. Word embeddings"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "_Word embeddings_ – sposoby reprezentacji słów jako wektorów liczbowych"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Znaczenie wyrazu jest reprezentowane przez sąsiednie wyrazy:\n",
+ "\n",
+ "“A word is characterized by the company it keeps.” (John R. Firth, 1957)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Pomysł pojawił sie jeszcze w latach 60. XX w.\n",
+ "* _Word embeddings_ można uzyskiwać na różne sposoby, ale dopiero w ostatnim dziesięcioleciu stało się opłacalne użycie w tym celu sieci neuronowych."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Przykład – 2 zdania: \n",
+ "* \"have a good day\"\n",
+ "* \"have a great day\"\n",
+ "\n",
+ "Słownik:\n",
+ "* {\"a\", \"day\", \"good\", \"great\", \"have\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Aby wykorzystać metody uczenia maszynowego do analizy danych tekstowych, musimy je jakoś reprezentować jako liczby.\n",
+ "* Najprostsza metoda to wektory jednostkowe:\n",
+ " * \"a\" = $(1, 0, 0, 0, 0)$\n",
+ " * \"day\" = $(0, 1, 0, 0, 0)$\n",
+ " * \"good\" = $(0, 0, 1, 0, 0)$\n",
+ " * \"great\" = $(0, 0, 0, 1, 0)$\n",
+ " * \"have\" = $(0, 0, 0, 0, 1)$\n",
+ "* Taka metoda nie uwzględnia jednak podobieństw i różnic między znaczeniami wyrazów."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Metody uzyskiwania *word embeddings*:\n",
+ "* Common Bag of Words (CBOW)\n",
+ "* Skip Gram\n",
+ "\n",
+ "Obie opierają się na odpowiednim użyciu autoencodera."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Skip Gram a CBOW\n",
+ "\n",
+ "* Skip Gram lepiej reprezentuje rzadkie wyrazy i lepiej działa, jeżeli mamy mało danych.\n",
+ "* CBOW jest szybszy i lepiej reprezentuje częste wyrazy."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "### Popularne modele _word embeddings_\n",
+ "* Word2Vec (Google)\n",
+ "* GloVe (Stanford)\n",
+ "* FastText (Facebook)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "notes"
+ }
+ },
+ "source": [
+ "Więcej o word embeddings: https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## 14.4. Tłumaczenie neuronowe\n",
+ "\n",
+ "_Neural Machine Translation_ (NMT)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Neuronowe tłumaczenie maszynowe również opiera się na modelu *encoder-decoder*:\n",
+ "* *Encoder* koduje z języka źródłowego na abstrakcyjną reprezentację.\n",
+ "* *Decoder* odkodowuje z abstrakcyjnej reprezentacji na język docelowy."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In *Advances in neural information processing systems* (pp. 3104-3112)."
+ ]
+ }
+ ],
+ "metadata": {
+ "author": "Paweł Skórzewski",
+ "celltoolbar": "Slideshow",
+ "email": "pawel.skorzewski@amu.edu.pl",
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "lang": "pl",
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]"
+ },
+ "livereveal": {
+ "start_slideshow_at": "selected",
+ "theme": "white"
+ },
+ "subtitle": "11.Rekurencyjne sieci neuronowe[wykład]",
+ "title": "Uczenie maszynowe",
+ "vscode": {
+ "interpreter": {
+ "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
+ }
+ },
+ "year": "2021"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/wyk/15_Uczenie_przez_wzmacnianie.ipynb b/wyk/15_Uczenie_przez_wzmacnianie.ipynb
new file mode 100644
index 0000000..28735f6
--- /dev/null
+++ b/wyk/15_Uczenie_przez_wzmacnianie.ipynb
@@ -0,0 +1,303 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# 15. Uczenie przez wzmacnianie i systemy dialogowe"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "# 15.1. Uczenie przez wzmacnianie"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### Paradygmat uczenia przez wzmacnianie"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Paradygmat uczenia przez wzmacnianie naśladuje sposób, w jaki uczą się dzieci.\n",
+ "* Interakcja ze środowiskiem."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* W chwili $t$ agent w stanie $S_t$ podejmuje akcję $A_t$, następnie obserwuje zmianę w środowisku w stanie $S_{t+1}$ i otrzymuje nagrodę $R_{t+1}$ (rys. 13.2)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "fragment"
+ }
+ },
+ "source": [
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "* Celem jest znalezienie takiej taktyki wyboru kolejnej akcji, aby zmaksymalizować wartość końcowej nagrody. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Zastosowanie uczenia przez wzmacnianie:\n",
+ "* strategie gier\n",
+ "* systemy dialogowe\n",
+ "* sterowanie"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### Uczenie przez wzmacnianie jako proces decyzyjny Markowa"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Paradygmat uczenia przez wzmacnianie można formalnie opisać jako proces decyzyjny Markowa:\n",
+ "$$ (S, A, T, R) $$\n",
+ "gdzie:\n",
+ "* $S$ – skończony zbiór stanów\n",
+ "* $A$ – skończony zbiór akcji\n",
+ "* $T \\colon A \\times S \\to S$ – funkcja przejścia która opisuje, jak zmienia się środowisko pod wpływem wybranych akcji\n",
+ "* $R \\colon A \\times S \\to \\mathbb{R}$ – funkcja nagrody"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Albo, jeśli przyjmiemy, że środowisko zmienia się w sposób niedeterministyczny:\n",
+ "$$ (S, A, P, R) $$\n",
+ "gdzie:\n",
+ "* $S$ – skończony zbiór stanów\n",
+ "* $A$ – skończony zbiór akcji\n",
+ "* $P \\colon A \\times S \\times S \\to [0, 1]$ – prawdopodobieństwo przejścia\n",
+ "* $R \\colon A \\times S \\times S \\to \\mathbb{R}$ – funkcja nagrody"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Na przykład, prawdopodobieństwo, że akcja $a$ spowoduje przejście ze stanu $s$ do $s'$:\n",
+ "$$ P_a(s, s') \\; = \\; \\mathbf{P}( \\, s_{t+1} = s' \\, | \\, s_t = s, a_t = a \\,) $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "### Strategia\n",
+ "\n",
+ "* Strategią (*policy*) nazywamy odwzorowanie $\\pi \\colon S \\to A$, które bieżącemu stanowi przyporządkuje kolejną akcję do wykonania.\n",
+ "* Algorytm uczenia przez wzmacnianie będzie starał się zoptymalizować strategię tak, żeby na koniec otrzymać jak najwyższą nagrodę.\n",
+ "* W chwili $t$, ostateczna końcowa nagroda jest zdefiniowana jako:\n",
+ "$$ R_t := r_{t+1} + \\gamma \\, r_{t+2} + \\gamma^2 \\, r_{t+3} + \\ldots = \\sum_{k=0}^T \\gamma^k \\, r_{t+k+1} \\; , $$\n",
+ "gdzie $0 < \\gamma < 1$ jest czynnikiem, który określa, jak bardzo bieżemy pod uwagę nagrody, które otrzymamy w odległej przyszłości."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "Algorytm szuka optymalnej strategii metodą prób i błędów – podejmując akcje i obserwując ich wpływ na środowisko. W podejmowaniu decyzji pomoże mu oszacowanie wartości następujących funkcji:\n",
+ "* Funkcja wartości ($V$) odzwierciedla, jak atrakcyjne w dalekiej perspektywie jest przejście do danego stanu:\n",
+ "$$ V_{\\pi}(s) = \\mathbf{E}_{\\pi}(R \\, | \\, s_t = s) $$\n",
+ "* Funkcja $Q$ odzwierciedla, jak atrakcyjne w dalekiej perspektywie jest przejście do danego stanu przez podjęcie danej akcji:\n",
+ "$$ Q_{\\pi}(s, a) = \\mathbf{E}_{\\pi}(R \\, | \\, s_t = s, a_t = a) $$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "## Algorytmy uczenia przez wzmacnianie\n",
+ "* Programowanie dynamiczne (DP):\n",
+ " * *bootstrapping* – aktualizacja oczacowań dla danego stanu na podstawie oszacowań dla możliwych stanów następnych\n",
+ "* Metody Monte Carlo (MC)\n",
+ "* Uczenie oparte na różnicach czasowych (*temporal difference learning*, TD):\n",
+ " * *on-policy* – aktualizacja bieżącej strategii:\n",
+ " * SARSA (*state–action–reward–state–action*)\n",
+ " * *off-policy* – eksploracja strategii innych niż bieżąca:\n",
+ " * *Q-Learning*\n",
+ " * *Actor–Critic*"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Przykłady\n",
+ "\n",
+ "* Odwrócone wahadło (*cart and pole*): https://www.youtube.com/watch?v=46wjA6dqxOM\n",
+ "* Symulacja autonomicznego samochodu: https://www.youtube.com/watch?v=G-GpY7bevuw"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "slide"
+ }
+ },
+ "source": [
+ "# 15.2. Systemy dialogowe"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "## Rodzaje systemów dialogowych\n",
+ "* Chatboty\n",
+ "* Systemy zorientowane na zadania (*task-oriented systems*, *goal-oriented systems*):\n",
+ " * szukanie informacji\n",
+ " * wypełnianie formularzy\n",
+ " * rozwiązywanie problemów\n",
+ " * systemy edukacyjne i tutorialowe\n",
+ " * inteligentni asystenci"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "slideshow": {
+ "slide_type": "subslide"
+ }
+ },
+ "source": [
+ "## Architektura systemu dialogowego\n",
+ "\n",
+ "(rys. 13.3)\n",
+ "\n",
+ ""
+ ]
+ }
+ ],
+ "metadata": {
+ "author": "Paweł Skórzewski",
+ "celltoolbar": "Slideshow",
+ "email": "pawel.skorzewski@amu.edu.pl",
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "lang": "pl",
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]"
+ },
+ "livereveal": {
+ "start_slideshow_at": "selected",
+ "theme": "white"
+ },
+ "subtitle": "15.Uczenie przez wzmacnianie i systemy dialogowe[wykład]",
+ "title": "Uczenie maszynowe",
+ "vscode": {
+ "interpreter": {
+ "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
+ }
+ },
+ "year": "2021"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/wyk/exp1.png b/wyk/exp1.png
new file mode 100644
index 0000000..8d8c146
Binary files /dev/null and b/wyk/exp1.png differ
diff --git a/wyk/exp2.png b/wyk/exp2.png
new file mode 100644
index 0000000..551480a
Binary files /dev/null and b/wyk/exp2.png differ
diff --git a/wyk/exp3.png b/wyk/exp3.png
new file mode 100644
index 0000000..b1e771b
Binary files /dev/null and b/wyk/exp3.png differ
diff --git a/wyk/nn3.png b/wyk/nn3.png
new file mode 100644
index 0000000..e2da2f9
Binary files /dev/null and b/wyk/nn3.png differ