uczenie-maszynowe/wyk/12_Propagacja_wsteczna.ipynb
2023-01-26 11:38:50 +01:00


{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# 12. Neural networks: backpropagation"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.1. Backpropagation: introduction"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"nn1.png\" alt=\"Fig. 12.1. A multi-layer neural network\" style=\"height: 100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Neural network architecture\n",
"\n",
"* Layered structure; usually feedforward, densely connected networks.\n",
"* The number and size of the layers are chosen separately for each problem.\n",
"* Network size is usually stated as the number of neurons or the number of parameters."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### _Feedforward_\n",
"\n",
"Given an $L$-layer neural network and its parameters $\\Theta^{(1)}, \\ldots, \\Theta^{(L)} $ and $\\beta^{(1)}, \\ldots, \\beta^{(L)} $, we compute, for $l = 1, \\ldots, L$:\n",
"\n",
"$$a^{(l)} = g^{(l)}\\left( a^{(l-1)} \\Theta^{(l)} + \\beta^{(l)} \\right). $$"
]
},
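{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The forward pass can be sketched in a few lines of NumPy. This is a minimal illustration: the layer sizes, the random weights and the choice of `tanh` activations below are assumptions made for the example, not taken from the figure:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def feedforward(x, thetas, betas, activations):\n",
"    \"\"\"Compute a^(L) for a row vector of features x.\"\"\"\n",
"    a = x\n",
"    for theta, beta, g in zip(thetas, betas, activations):\n",
"        a = g(a @ theta + beta)  # a^(l) = g^(l)(a^(l-1) Theta^(l) + beta^(l))\n",
"    return a\n",
"\n",
"\n",
"rng = np.random.default_rng(42)\n",
"thetas = [rng.normal(size=(3, 4)), rng.normal(size=(4, 1))]\n",
"betas = [np.zeros((1, 4)), np.zeros((1, 1))]\n",
"x = np.array([[1.0, 2.0, 3.0]])\n",
"print(feedforward(x, thetas, betas, [np.tanh, np.tanh]).shape)  # (1, 1)"
]
},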
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"nn2.png\" alt=\"Fig. 12.2. A multi-layer neural network: feedforward\" style=\"height:100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The functions $g^{(l)}$ are the **activation functions**.<br/>\n",
"For $l = 0$ we take $a^{(0)} = x$ (the row vector of features) and $g^{(0)}(x) = x$ (the identity)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* The parameters $\\Theta$ are the weights on the connections between the neurons of two consecutive layers.<br/>\n",
"The matrix $\\Theta^{(l)}$, holding the weights on the connections between layers $a^{(l-1)}$ and $a^{(l)}$, has size $\\dim(a^{(l-1)}) \\times \\dim(a^{(l)})$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The parameters $\\beta$ (biases) replace the column of ones appended to the feature matrix.<br/>The matrix $\\beta^{(l)}$ has one entry per neuron of the corresponding layer, i.e. size $1 \\times \\dim(a^{(l)})$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* **Classification**: for the last layer $L$ (whose size equals the number of classes) one takes $g^{(L)}(x) = \\mathop{\\mathrm{softmax}}(x)$.\n",
"* **Regression**: a single output neuron; its activation function can then be e.g. the identity."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### How do we train neural networks?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* In the algorithms seen so far (linear regression, logistic regression) we trained using a cost function, its gradient and the gradient descent algorithm (GD/SGD)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* For neural networks we likewise need the gradient of the cost function."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* This reduces to a more general problem:<br/>how do we compute the gradient $\\nabla f(x)$ for a given function $f$ and input vector $x$?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### The derivative\n",
"\n",
"* The **derivative** measures how fast the value of a function changes with respect to a change in its argument:\n",
"\n",
"$$ \\frac{d f(x)}{d x} = \\lim_{h \\to 0} \\frac{ f(x + h) - f(x) }{ h } $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Partial derivatives and the gradient\n",
"\n",
"* A **partial derivative** measures how fast the value of a function changes with respect to a change in a *single one of its arguments*."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* The **gradient** is the vector of partial derivatives:\n",
"\n",
"$$ \\nabla f = \\left( \\frac{\\partial f}{\\partial x_1}, \\ldots, \\frac{\\partial f}{\\partial x_n} \\right) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Gradient: examples"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ f(x_1, x_2) = x_1 + x_2 \\qquad \\to \\qquad \\frac{\\partial f}{\\partial x_1} = 1, \\quad \\frac{\\partial f}{\\partial x_2} = 1, \\quad \\nabla f = (1, 1) $$ "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ f(x_1, x_2) = x_1 \\cdot x_2 \\qquad \\to \\qquad \\frac{\\partial f}{\\partial x_1} = x_2, \\quad \\frac{\\partial f}{\\partial x_2} = x_1, \\quad \\nabla f = (x_2, x_1) $$ "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"$$ f(x_1, x_2) = \\max(x_1, x_2) \\hskip{12em} \\\\\n",
"\\to \\qquad \\frac{\\partial f}{\\partial x_1} = \\mathbb{1}_{x_1 \\geq x_2}, \\quad \\frac{\\partial f}{\\partial x_2} = \\mathbb{1}_{x_2 \\geq x_1}, \\quad \\nabla f = (\\mathbb{1}_{x_1 \\geq x_2}, \\mathbb{1}_{x_2 \\geq x_1}) $$ "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Properties of partial derivatives\n",
"\n",
"If $f(x, y, z) = (x + y) \\, z$ and $q = x + y$, then:\n",
"$$f = q z,\n",
"\\quad \\frac{\\partial f}{\\partial q} = z,\n",
"\\quad \\frac{\\partial f}{\\partial z} = q,\n",
"\\quad \\frac{\\partial q}{\\partial x} = 1,\n",
"\\quad \\frac{\\partial q}{\\partial y} = 1 $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### The chain rule\n",
"\n",
"$$ \\frac{\\partial f}{\\partial x} = \\frac{\\partial f}{\\partial q} \\, \\frac{\\partial q}{\\partial x},\n",
"\\quad \\frac{\\partial f}{\\partial y} = \\frac{\\partial f}{\\partial q} \\, \\frac{\\partial q}{\\partial y} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Backpropagation: a simple example"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Fixed input values\n",
"x = -2\n",
"y = 5\n",
"z = -4"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3 -12\n"
]
}
],
"source": [
"# Forward pass\n",
"q = x + y\n",
"f = q * z\n",
"print(q, f)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-4, -4, 3]\n"
]
}
],
"source": [
"# Backward pass for f = q * z\n",
"# Let `dfx`, `dfy`, `dfz`, `dfq` denote the partial derivatives\n",
"# ∂f/∂x, ∂f/∂y, ∂f/∂z, ∂f/∂q, respectively\n",
"dfz = q\n",
"dfq = z\n",
"# Backward pass for q = x + y\n",
"dfx = 1 * dfq  # by the chain rule\n",
"dfy = 1 * dfq  # by the chain rule\n",
"print([dfx, dfy, dfz])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"exp1.png\" alt=\"Fig. 12.3. Backpropagation: example 1\" style=\"height:100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* This is exactly how derivatives are computed by backpropagation!"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Let's try something more involved:<br/>computing the derivative of the sigmoid function by backpropagation."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Backpropagation: the sigmoid function"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The sigmoid function:\n",
"\n",
"$$f(\\theta,x) = \\frac{1}{1+e^{-(\\theta_0 x_0 + \\theta_1 x_1 + \\theta_2)}}$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"$$\n",
"\\begin{array}{lcl}\n",
"f(x) = \\frac{1}{x} \\quad & \\rightarrow & \\quad \\frac{df}{dx} = -\\frac{1}{x^2} \\\\\n",
"f_c(x) = c + x \\quad & \\rightarrow & \\quad \\frac{df}{dx} = 1 \\\\\n",
"f(x) = e^x \\quad & \\rightarrow & \\quad \\frac{df}{dx} = e^x \\\\\n",
"f_a(x) = ax \\quad & \\rightarrow & \\quad \\frac{df}{dx} = a \\\\\n",
"\\end{array}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"exp2.png\" alt=\"Fig. 12.4. Backpropagation: example 2\" style=\"height:100%\"/>"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0.3932238664829637, -0.5898357997244456]\n",
"[-0.19661193324148185, -0.3932238664829637, 0.19661193324148185]\n"
]
}
],
"source": [
"from math import exp\n",
"\n",
"\n",
"# Random weights and data\n",
"w = [2, -3, -3]\n",
"x = [-1, -2]\n",
"\n",
"# Forward pass\n",
"dot = w[0] * x[0] + w[1] * x[1] + w[2]\n",
"f = 1.0 / (1 + exp(-dot))  # the sigmoid function\n",
"\n",
"# Backward pass\n",
"ddot = (1 - f) * f  # derivative of the sigmoid function\n",
"dx = [w[0] * ddot, w[1] * ddot]\n",
"dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]\n",
"\n",
"print(dx)\n",
"print(dw)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Computing gradients: summary"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The gradient of $f$ with respect to $x$ tells us how the whole expression changes when the value of $x$ changes."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Gradients are combined using the **chain rule**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* In the backward pass, the gradients tell us which parts of the graph should be increased or decreased (and how strongly) to increase the value at the output."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* For implementation purposes we want to split the function $f$ into parts whose gradients are easy to compute."
]
},
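{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"An easy way to validate gradients obtained this way is to compare them against numerical (finite-difference) derivatives. The sketch below checks the sigmoid example's `dw` against central differences; the step size `h` is an arbitrary choice for this illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"from math import exp\n",
"\n",
"\n",
"def sigmoid_of_dot(w, x):\n",
"    dot = w[0] * x[0] + w[1] * x[1] + w[2]\n",
"    return 1.0 / (1 + exp(-dot))\n",
"\n",
"\n",
"def numerical_gradient(f, w, h=1e-6):\n",
"    \"\"\"Central finite differences: (f(w+h) - f(w-h)) / (2h), per coordinate.\"\"\"\n",
"    grad = []\n",
"    for i in range(len(w)):\n",
"        w_plus = list(w); w_plus[i] += h\n",
"        w_minus = list(w); w_minus[i] -= h\n",
"        grad.append((f(w_plus) - f(w_minus)) / (2 * h))\n",
"    return grad\n",
"\n",
"\n",
"w = [2, -3, -3]\n",
"x = [-1, -2]\n",
"f = sigmoid_of_dot(w, x)\n",
"ddot = (1 - f) * f\n",
"dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]  # analytic, as above\n",
"dw_num = numerical_gradient(lambda w_: sigmoid_of_dot(w_, x), w)\n",
"print(all(abs(a - b) < 1e-8 for a, b in zip(dw, dw_num)))  # True"
]
},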
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.2. Training multi-layer neural networks with backpropagation"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Given the SGD algorithm and the gradients with respect to all the weights, we could train any network."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* Let $\\Theta = (\\Theta^{(1)},\\Theta^{(2)},\\Theta^{(3)},\\beta^{(1)},\\beta^{(2)},\\beta^{(3)})$\n",
"* The function computed by the network in the figure:\n",
"$$\\small h_\\Theta(x) = \\tanh(\\tanh(\\tanh(x\\Theta^{(1)}+\\beta^{(1)})\\Theta^{(2)} + \\beta^{(2)})\\Theta^{(3)} + \\beta^{(3)})$$\n",
"* The cost function for regression:\n",
"$$J(\\Theta) = \\dfrac{1}{2m} \\sum_{i=1}^{m} (h_\\Theta(x^{(i)})- y^{(i)})^2 $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* How do we compute the gradients?\n",
"\n",
"$$\\nabla_{\\Theta^{(l)}} J(\\Theta) = ? \\quad \\nabla_{\\beta^{(l)}} J(\\Theta) = ?$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Towards backpropagation\n",
"\n",
"* A small change $\\Delta z^l_j$ in the weighted input of the $j$-th neuron in layer $l$ causes a (small) change in the cost: \n",
"\n",
"$$\\frac{\\partial J(\\Theta)}{\\partial z^{l}_j} \\Delta z^{l}_j$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* If $\\frac{\\partial J(\\Theta)}{\\partial z^{l}_j}$ is large, a $\\Delta z^l_j$ of the opposite sign will reduce the cost.\n",
"* If $\\frac{\\partial J(\\Theta)}{\\partial z^l_j}$ is close to zero, the cost cannot be improved much."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* We define the error $\\delta^l_j$ of neuron $j$ in layer $l$: \n",
"\n",
"$$\\delta^l_j := \\dfrac{\\partial J(\\Theta)}{\\partial z^l_j}$$ \n",
"$$\\delta^l := \\nabla_{z^l} J(\\Theta) \\quad \\textrm{ (vector notation)} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### The fundamental equations of backpropagation\n",
"\n",
"$$\n",
"\\begin{array}{rcll}\n",
"\\delta^L & = & \\nabla_{a^L}J(\\Theta) \\odot { \\left( g^{L} \\right) }^{\\prime} \\left( z^L \\right) & (BP1) \\\\[2mm]\n",
"\\delta^{l} & = & \\left( \\delta^{l+1} \\left( \\Theta^{l+1} \\right) \\! ^\\top \\right) \\odot {{ \\left( g^{l} \\right) }^{\\prime}} \\left( z^{l} \\right) & (BP2)\\\\[2mm]\n",
"\\nabla_{\\beta^l} J(\\Theta) & = & \\delta^l & (BP3)\\\\[2mm]\n",
"\\nabla_{\\Theta^l} J(\\Theta) & = & \\left( a^{l-1} \\right) \\! ^\\top \\delta^l & (BP4)\\\\\n",
"\\end{array}\n",
"$$\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP1)\n",
"$$ \\delta^L_j \\; = \\; \\frac{ \\partial J }{ \\partial a^L_j } \\, g' \\!\\! \\left( z^L_j \\right) $$\n",
"$$ \\delta^L \\; = \\; \\nabla_{a^L}J(\\Theta) \\odot { \\left( g^{L} \\right) }^{\\prime} \\left( z^L \\right) $$\n",
"The error in the last layer is the product of the rate of change of the cost with respect to the $j$-th output and the rate of change of the activation function at the point $z^L_j$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP2)\n",
"$$ \\delta^{l} \\; = \\; \\left( \\delta^{l+1} \\left( \\Theta^{l+1} \\right) \\! ^\\top \\right) \\odot {{ \\left( g^{l} \\right) }^{\\prime}} \\left( z^{l} \\right) $$\n",
"To compute the error in layer $l$, multiply the error of the next (the $(l+1)$-st) layer by the transposed weight matrix, then multiply the result elementwise by the rate of change of the activation function at the point $z^l$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP3)\n",
"$$ \\nabla_{\\beta^l} J(\\Theta) \\; = \\; \\delta^l $$\n",
"The gradient of the cost with respect to the biases $\\beta^l$ is simply the error in layer $l$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP4)\n",
"$$ \\nabla_{\\Theta^l} J(\\Theta) \\; = \\; \\left( a^{l-1} \\right) \\! ^\\top \\delta^l $$\n",
"The gradient of the cost with respect to the weights of layer $l$ is the outer product of the activations $a^{l-1}$ and the error $\\delta^l$, i.e. componentwise $a^{l-1}_i \\delta^l_j$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### The backpropagation algorithm"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"For a single example $(x,y)$:\n",
"1. **Input**: set the activations in the feature layer $a^{(0)}=x$ \n",
"2. **Feedforward:** for $l=1,\\dots,L$ compute \n",
"$z^{(l)} = a^{(l-1)} \\Theta^{(l)} + \\beta^{(l)}$ and $a^{(l)}=g^{(l)} \\!\\! \\left( z^{(l)} \\right)$\n",
"3. **Output error $\\delta^{(L)}$:** compute the vector $$\\delta^{(L)}= \\nabla_{a^{(L)}}J(\\Theta) \\odot {g^{\\prime}}^{(L)} \\!\\! \\left( z^{(L)} \\right) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"4. **Backpropagate the error:** for $l = L-1,L-2,\\dots,1$ compute $$\\delta^{(l)} = \\delta^{(l+1)}(\\Theta^{(l+1)})^T \\odot {g^{\\prime}}^{(l)} \\!\\! \\left( z^{(l)} \\right) $$\n",
"5. **Gradients:** \n",
"    * $\\dfrac{\\partial}{\\partial \\Theta_{ij}^{(l)}} J(\\Theta) = a_i^{(l-1)}\\delta_j^{(l)} \\textrm{ and } \\dfrac{\\partial}{\\partial \\beta_{j}^{(l)}} J(\\Theta) = \\delta_j^{(l)}$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"In our example:\n",
"\n",
"$$\\small J(\\Theta) = \\frac{1}{2} \\left( a^{(L)} - y \\right) ^2 $$\n",
"$$\\small \\dfrac{\\partial}{\\partial a^{(L)}} J(\\Theta) = a^{(L)} - y$$\n",
"\n",
"$$\\small \\tanh^{\\prime}(x) = 1 - \\tanh^2(x)$$"
]
},
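{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The five steps above can be sketched in NumPy for the $\\tanh$ network with the squared-error cost. This is a minimal sketch: the network shape, the random weights and the input below are illustrative assumptions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def backprop(x, y, thetas, betas):\n",
"    \"\"\"Steps 1-5 for a single example (x, y); tanh activations, J = (a_L - y)^2 / 2.\"\"\"\n",
"    # 1-2. Input and feedforward\n",
"    a = [x]\n",
"    for theta, beta in zip(thetas, betas):\n",
"        a.append(np.tanh(a[-1] @ theta + beta))\n",
"    # 3. Output error; tanh'(z) = 1 - tanh(z)^2 = 1 - a^2\n",
"    delta = (a[-1] - y) * (1 - a[-1] ** 2)\n",
"    # 4-5. Backpropagate the error and collect the gradients\n",
"    d_thetas, d_betas = [], []\n",
"    for l in reversed(range(len(thetas))):\n",
"        d_thetas.insert(0, a[l].T @ delta)  # (BP4)\n",
"        d_betas.insert(0, delta)            # (BP3)\n",
"        if l > 0:\n",
"            delta = (delta @ thetas[l].T) * (1 - a[l] ** 2)  # (BP2)\n",
"    return d_thetas, d_betas\n",
"\n",
"\n",
"rng = np.random.default_rng(0)\n",
"thetas = [rng.normal(size=(2, 3)), rng.normal(size=(3, 1))]\n",
"betas = [np.zeros((1, 3)), np.zeros((1, 1))]\n",
"d_thetas, d_betas = backprop(np.array([[0.5, -1.0]]), 1.0, thetas, betas)\n",
"print([g.shape for g in d_thetas])  # [(2, 3), (3, 1)]"
]
},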
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"nn3.png\" alt=\"Fig. 12.5. Backpropagation: diagram\" style=\"height:100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### SGD with backpropagation\n",
"\n",
"A single iteration:\n",
"1. For the parameters $\\Theta = (\\Theta^{(1)},\\ldots,\\Theta^{(L)})$ create auxiliary zero matrices $\\Delta = (\\Delta^{(1)},\\ldots,\\Delta^{(L)})$ of the same dimensions (for simplicity the weights $\\beta$ are omitted here)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"2. For the $m$ examples in the batch, $i = 1,\\ldots,m$:\n",
"    * Run the backpropagation algorithm on the example $(x^{(i)}, y^{(i)})$ and store the resulting gradients $\\nabla_{\\Theta}J^{(i)}(\\Theta)$;\n",
"    * $\\Delta := \\Delta + \\dfrac{1}{m}\\nabla_{\\Theta}J^{(i)}(\\Theta)$\n",
"3. Update the weights: $\\Theta := \\Theta - \\alpha \\Delta$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Backpropagation: summary\n",
"\n",
"* The algorithm was first introduced in the 1970s.\n",
"* In 1986 David Rumelhart, Geoffrey Hinton and Ronald Williams showed that it is much faster than earlier methods.\n",
"* It is now the most popular algorithm for training neural networks."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.3. Example implementations of multi-layer neural networks"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"### Note!\n",
"\n",
"The examples below use the [Keras](https://keras.io) interface, which is part of the [TensorFlow](https://www.tensorflow.org) library.\n",
"\n",
"To run TensorFlow in a Jupyter environment, do the following:\n",
"\n",
"#### Before the first run (needs to be done only once)\n",
"\n",
"Installing TensorFlow in an Anaconda environment:\n",
"\n",
"1. Start *Anaconda Navigator*\n",
"1. Choose the *CMD.exe Prompt* tile\n",
"1. Click the *Launch* button\n",
"1. A console will appear. Type the following commands, confirming each with the Enter key:\n",
"```\n",
"conda create -n tf tensorflow\n",
"conda activate tf\n",
"conda install pandas matplotlib\n",
"jupyter notebook\n",
"```\n",
"\n",
"#### Before every run\n",
"\n",
"If we want to use the TensorFlow library, Jupyter Notebook needs to be started as follows:\n",
"\n",
"1. Start *Anaconda Navigator*\n",
"1. Choose the *CMD.exe Prompt* tile\n",
"1. Click the *Launch* button\n",
"1. A console will appear. Type the following commands, confirming each with the Enter key:\n",
"```\n",
"conda activate tf\n",
"jupyter notebook\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Example: MNIST\n",
"\n",
"_Modified National Institute of Standards and Technology database_"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A dataset of handwritten digits\n",
"* 60,000 training examples, 10,000 test examples\n",
"* Resolution of each example: 28 × 28 = 784 pixels"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-26 10:52:17.922141: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
"2023-01-26 10:52:18.163925: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:18.163996: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.\n",
"2023-01-26 10:52:19.577890: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:19.578662: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:19.578677: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n",
"11490434/11490434 [==============================] - 1s 0us/step\n"
]
}
],
"source": [
"from tensorflow import keras\n",
"from tensorflow.keras.datasets import mnist\n",
"from tensorflow.keras.layers import Dense, Dropout\n",
"\n",
"# load the data and split it into training and test sets\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"from matplotlib import pyplot as plt\n",
"\n",
"\n",
"def draw_examples(examples, captions=None):\n",
" plt.figure(figsize=(16, 4))\n",
" m = len(examples)\n",
" for i, example in enumerate(examples):\n",
" plt.subplot(100 + m * 10 + i + 1)\n",
" plt.imshow(example, cmap=plt.get_cmap(\"gray\"))\n",
" plt.show()\n",
" if captions is not None:\n",
" print(6 * \" \" + (10 * \" \").join(str(captions[i]) for i in range(m)))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABQcAAADFCAYAAADpJUQuAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAomElEQVR4nO3de3RU1d3G8V8CJFyTyC2BQiQqFRQJGgFRFqBE0KoQoKIUBKwFCwFEK6X4oqCIQVBbrmK1kKIoaBFQvFKuVUPK1S5AIlKEIEkAJRcCJErO+4fLVNx78ExmJjOz9/ez1vnDh30m+4Qnh2Eznh3hOI4jAAAAAAAAAKwTGewJAAAAAAAAAAgOFgcBAAAAAAAAS7E4CAAAAAAAAFiKxUEAAAAAAADAUiwOAgAAAAAAAJZicRAAAAAAAACwFIuDAAAAAAAAgKVYHAQAAAAAAAAsxeIgAAAAAAAAYCkWBwEAAAAAAABL1QzUC8+fP19mzZol+fn5kpycLHPnzpVOnTr97HkVFRVy9OhRadCggURERARqerCU4zhSUlIizZs3l8hI79bGq9ppEXqNwPGl0yLcqxF6gtVpEXqNwOFeDdNwr4aJuFfDNF512gmAZcuWOVFRUc6iRYucPXv2OCNGjHDi4uKcgoKCnz03NzfXEREOjoAeubm51dZpes1RHYe3nfa113SaI9BHdXeaXnNUx8G9msO0g3s1h4kH92oO0w43nQ7I4mCnTp2c9PT0yv8+d+6c07x5cycjI+Nnzy0sLAz6N47D/KOwsLDaOk2vOarj8LbTvvaaTnME+qjuTtNrjuo4uFdzmHZwr+Yw8eBezWHa4abTfn/mYHl5uWzfvl1SU1Mrs8jISElNTZWsrCxlfFlZmRQXF1ceJSUl/p4SoPDm49redlqEXqP6efu/IHCvRqgLdKdF6DWqH/dqmIZ7NUzEvRqmcdNpvy8OnjhxQs6dOyfx8fHn5fHx8ZKfn6+Mz8jIkNjY2MqjZcuW/p4S4BNvOy1CrxH6uFfDNNyrYSLu1TAN92qYiHs1TBD03YonTZokRUVFlUdubm6wpwT4jF7DNHQaJqLXMA2dhonoNUxDpxGK/L5bcePGjaVGjRpSUFBwXl5QUCAJCQnK+OjoaImOjvb3NAC/8bbTIvQaoY97NUzDvRom4l4N03Cvhom4V8MEfv/kYFRUlKSkpMi6desqs4qKClm3bp106dLF318OCDg6DRPRa5iGTsNE9BqmodMwEb2GEbzehseFZcuWOdHR0U5mZqazd+9eZ+TIkU5cXJyTn5//s+cWFRUFfScXDvOPoqKiaus0veaojsPbTvvaazrNEeijujtNrzmq4+BezWHawb2aw8SDezWHaYebTgdkcdBxHGfu3LlOYmKiExUV5XTq1MnZsmWLq/P4weCojqMqN/yqdppec1THUZVO+9JrOs0R6KO6O02vOarj4F7NYdrBvZrDxIN7NYdph5tORziO40gIKS4ultjY2GBPA4YrKiqSmJiYavt69BqBRqdhmurutAi9RuBxr4ZpuFfDRNyrYRo3nQ76bsUAAAAAAAAAgoPFQQAAAAAAAMBSLA4CAAAAAAAAlmJxEAAAAAAAALAUi4MAAAAAAACApVgcBAAAAAAAACzF4iAAAAAAAABgKRYHAQAAAAAAAEuxOAgAAAAAAABYisVBAAAAAAAAwFIsDgIAAAAAAACWqhnsCQCAWykpKdp8zJgxSjZ06FDt2CVLlijZ3LlztWN37NjhxewAAAAAAFUxe/ZsbT5u3Dgl2717t5Ldfvvt2vMPHTrk28QswScHAQAAAAAAAEuxOAgAAAAAAABYisVBAAAAAAAAwFIsDgIAAAAAAACWYkOSEFWjRg1tHhsb69Pr6jZuqFu3rnbs5ZdfrmTp6enasc8884ySDRo0SMnOnj2rPX/GjBlK9vjjj2vHwg4dOnRQsrVr12rHxsTEKJnjONqx99xzj5L16dNHO7ZRo0YXmC
EQfnr27KlkS5cuVbLu3btrz8/JyfH7nACdyZMnK5mn9wWRkeq/dffo0UPJNm3a5PO8AMAUDRo00Ob169dXsttuu007tkmTJkr23HPPaceWlZV5MTuYrlWrVko2ZMgQ7diKigola9u2rZK1adNGez4bkrjDJwcBAAAAAAAAS7E4CAAAAAAAAFiKxUEAAAAAAADAUiwOAgAAAAAAAJZicRAAAAAAAACwFLsV+ygxMVGbR0VFKdn111+vHdu1a1cli4uL044dMGCA+8n56MiRI0o2Z84c7dh+/fopWUlJiZJ9+umn2vPZQdBunTp1UrIVK1YomafdunU7E+v6JyJSXl6uZJ52Jb7uuuuUbMeOHa5eE1XXrVs3JfP0e7Ry5cpAT8coHTt2VLKtW7cGYSbA94YPH67NJ06cqGS63Qo98bRjPQCYTrcLrO6e2qVLF+357dq18+nrN2vWTJuPGzfOp9eFWY4fP65kmzdv1o7t06dPoKcD4ZODAAAAAAAAgLVYHAQAAAAAAAAsxeIgAAAAAAAAYCkWBwEAAAAAAABLsSGJFzp06KBk69ev1471tHFCKPL0gO/Jkycr2alTp7Rjly5dqmR5eXlKdvLkSe35OTk5F5oiwlDdunWV7JprrtGOfeWVV5TM08OM3dq/f782nzlzppItW7ZMO/bjjz9WMt3PRUZGhpezw4X06NFDyVq3bq0dy4YkepGR+n/7S0pKUrKLL75YySIiIvw+J0BH1z8Rkdq1a1fzTGCyzp07a/MhQ4YoWffu3ZXsyiuvdP21Hn74YW1+9OhRJdNtSiiif1+UnZ3teg4wT5s2bZRs/Pjx2rGDBw9Wsjp16iiZpz/rc3NzlczTRn9t27ZVsoEDB2rHLliwQMn27dunHQvzlZaWKtmhQ4eCMBP8gE8OAgAAAAAAAJZicRAAAAAAAACwFIuDAAAAAAAAgKVYHAQAAAAAAAAsxYYkXjh8+LCSff3119qx1bkhie4BxYWFhdqxN954o5KVl5drx7788ss+zQt2e+GFF5Rs0KBB1fb1PW1+Ur9+fSXbtGmTdqxuY4z27dv7NC/8vKFDhypZVlZWEGYSvjxt6DNixAgl0z34ngeEIxBSU1OVbOzYsa7P99TL22+/XckKCgrcTwxGueuuu5Rs9uzZ2rGNGzdWMt0mDRs3btSe36RJEyWbNWvWz8zwwl/L0+vefffdrl8X4UH398Wnn35aO1bX6wYNGvj09T1t3te7d28lq1Wrlnas7r6s+7m6UA47xcXFKVlycnL1TwSV+OQgAAAAAAAAYCkWBwEAAAAAAABLsTgIAAAAAAAAWIrFQQAAAAAAAMBSLA4CAAAAAAAAlmK3Yi988803SjZhwgTtWN3OeTt37tSOnTNnjus57Nq1S8luvvlmJSstLdWef+WVVyrZAw884PrrAz+VkpKizW+77TYl87Qrn45uB+G3335bO/aZZ55RsqNHj2rH6n4OT548qR170003KZk314CqiYzk36189dJLL7ke62m3QqCqunbtqs0XL16sZLrdOj3xtAvsoUOHXL8GwlPNmupfWa699lrt2BdffFHJ6tatqx27efNmJZs2bZqSffTRR9rzo6Ojlez111/Xju3Vq5c219m2bZvrsQhf/fr1U7Lf/e53AflaBw4cUDLd3yFFRHJzc5Xssssu8/ucYDfdfTkxMdGn1+zYsaM21+2qzXsHFX8DAwAAAAAAACzF4iAAAAAAAABgKRYHAQAAAAAAAEuxOAgAAAAAAABYyusNSTZv3iyzZs2S7du3S15enqxcuVLS0tIqf91xHJkyZYq8+OKLUlhYKDfccIM8//zz0rp1a3/OO2SsWrVKm69fv17JSkpKtGOTk5OV7L777tOO1W284GnzEZ09e/Yo2ciRI12fbyI67V6HDh2UbO3atdqxMTExSuY4jnbse++9p2SDBg1Ssu7du2vPnzx5spJ52pDh+PHjSvbpp59qx1ZUVCiZbqOVa665Rnv+jh07tH
mghUun27dvr83j4+OrdR4m8maTB08/w6EmXHoNkWHDhmnz5s2bu36NjRs3KtmSJUuqOqWQRKfdGzJkiJJ5s/GSp/vcXXfdpWTFxcWuX1d3vjcbjxw5ckSb//3vf3f9GqGGXrt35513+nT+l19+qc23bt2qZBMnTlQy3cYjnrRt29b1WNPQ6cDQbR6ZmZmpHTt16lRXr+lpXGFhoZLNmzfP1WvaxOtPDpaWlkpycrLMnz9f++szZ86UOXPmyMKFCyU7O1vq1asnvXv3lrNnz/o8WSAQ6DRMQ6dhInoN09BpmIhewzR0Grbw+pODt956q9x6663aX3McR/7yl7/I5MmTpW/fviLy/b/yxsfHy6pVq+Tuu+/2bbZAANBpmIZOw0T0Gqah0zARvYZp6DRs4ddnDh48eFDy8/MlNTW1MouNjZXOnTtLVlaW9pyysjIpLi4+7wBCRVU6LUKvEbroNExEr2EaOg0T0WuYhk7DJH5dHMzPzxcR9XlR8fHxlb/2UxkZGRIbG1t5tGzZ0p9TAnxSlU6L0GuELjoNE9FrmIZOw0T0Gqah0zBJ0HcrnjRpkhQVFVUe3jwYFQhV9BqmodMwEb2Gaeg0TESvYRo6jVDk9TMHLyQhIUFERAoKCqRZs2aVeUFBgXaXUxGR6OhoiY6O9uc0QoI3Hw0uKipyPXbEiBFKtnz5ciXT7bIK71Wl0yLh3+tf/vKX2nzChAlK5mlX1BMnTihZXl6edqxuV75Tp04p2TvvvKM931MeCHXq1FGyP/zhD9qxgwcPDvR0vBZKnf7Vr36lzXXfY3im2905KSnJ9flfffWVP6cTFKHUa9s0btxYyX77299qx+rem+h2EBQRefLJJ32aV7iztdPTpk3T5o888oiSOY6jHbtgwQIlmzx5snasr/8r3//93//5dP64ceO0+fHjx3163VBla6890f29buTIkdqxH374oZJ98cUX2rHHjh3zbWIauvcaoNP+5unPALe7FcM3fv3kYFJSkiQkJMi6desqs+LiYsnOzpYuXbr480sB1YJOwzR0Giai1zANnYaJ6DVMQ6dhEq8/OXjq1Knz/pXi4MGDsmvXLmnYsKEkJibK+PHj5cknn5TWrVtLUlKSPProo9K8eXNJS0vz57wBv6HTMA2dhonoNUxDp2Eieg3T0GnYwuvFwW3btsmNN95Y+d8PPfSQiIgMGzZMMjMz5Y9//KOUlpbKyJEjpbCwULp27Srvv/++1K5d23+zBvyITsM0dBomotcwDZ2Gieg1TEOnYQuvFwd79Ojh8RkfIiIRERHyxBNPyBNPPOHTxIDqQqdhGjoNE9FrmIZOw0T0Gqah07CFXzckQdXoHrCZkpKiHdu9e3clS01NVTLdQ2sBHd3DcJ955hntWN0GEiUlJdqxQ4cOVbJt27Zpx4b7BhSJiYnBnkJYuvzyy12P3bNnTwBnEt50P6+eHhz++eefK5mnn2Hgp1q1aqVkK1as8Ok1586dq803bNjg0+si9D322GNKptt4RESkvLxcyT744APt2IkTJyrZmTNnXM9L92mfXr16acfq/vyPiIjQjtVtsrN69WrX84J5jh49qmShuvECz89DMEVGqltlsAGr//l1QxIAAAAAAAAA4YPFQQAAAAAAAMBSLA4CAAAAAAAAlmJxEAAAAAAAALAUi4MAAAAAAACApditOASUlpYq2YgRI7Rjd+zYoWQvvviiknna5U+3W+z8+fO1Yy+0ZTvMcfXVVyuZbldiT/r27avNN23aVOU5AT+1devWYE8hIGJiYrT5LbfcomRDhgzRjvW0i6bOtGnTlKywsND1+bCbrpft27d3ff66deuUbPbs2T7NCeEhLi5OyUaPHq1knt576nYmTktL83VactlllynZ0qVLlSwlJcX1a/7jH//Q5jNnznQ/McAH48aNU7J69er59JpXXXWV67GffPKJNs/KyvJpDrCXbmdi1ir8j08OAgAAAAAAAJZicRAAAAAAAACwFIuDAA
AAAAAAgKVYHAQAAAAAAAAsxYYkIerAgQPafPjw4Uq2ePFiJbvnnnu05+tyTw+oXbJkiZLl5eVpxyJ8Pffcc0oWERGhHavbZMTkjUciI9V/P9E9EBeB17Bhw4C8bnJyspJ56n9qaqqStWjRQjs2KipKyQYPHqxkuo6JiJw5c0bJsrOztWPLysqUrGZN/R/v27dv1+bAj3na6GHGjBmuzv/oo4+0+bBhw5SsqKjI9bwQvnT3xMaNG7s+X7fBQtOmTbVj7733XiXr06ePdmy7du2UrH79+krm6cH3uvyVV17RjtVtQAj8VN26dbX5FVdcoWRTpkzRjnW7saCn9yDevNc9evSokul+BkVEzp075/p1AVQ/PjkIAAAAAAAAWIrFQQAAAAAAAMBSLA4CAAAAAAAAlmJxEAAAAAAAALAUG5KEmZUrVyrZ/v37lUy3yYSISM+ePZXsqaee0o69+OKLlWz69OnasV999ZU2R2i5/fbblaxDhw5K5unB22+99Za/pxTSdA9k1n1vdu3aVQ2zMY9u0w0R/fd44cKF2rGPPPKIT3No3769knnakOS7775TstOnT2vH7t27V8kWLVqkZNu2bdOer9vop6CgQDv2yJEjSlanTh3t2H379mlz2KtVq1ZKtmLFCp9e87///a8299RhmK+8vFzJjh8/rmRNmjTRnn/w4EEl8/RexRu6zRSKi4uVrFmzZtrzT5w4oWRvv/22z/OCWWrVqqXNr776aiXzdP/VddDT+yhdr7OyspTslltu0Z7vaVMUHd0GaP3799eOnT17tpLp7g0AgoNPDgIAAAAAAACWYnEQAAAAAAAAsBSLgwAAAAAAAIClWBwEAAAAAAAALMXiIAAAAAAAAGApdis2wO7du5Vs4MCB2rF33HGHki1evFg79v7771ey1q1ba8fefPPNF5oiQoRuB9OoqCglO3bsmPb85cuX+31O1S06OlrJpk6d6vr89evXK9mkSZN8mZK1Ro8erc0PHTqkZNdff31A5nD48GElW7VqlXbsZ599pmRbtmzx95Q8GjlypDbX7e7pabdY4KcmTpyoZLqd2r0xY8YMn86HeQoLC5UsLS1NydasWaM9v2HDhkp24MAB7djVq1crWWZmpnbsN998o2TLli1TMk+7FevGwm6699WedgV+8803Xb/u448/rmS696QiIh9//LGS6X6GPJ3frl071/PSvQfJyMjQjnX7nqusrMz114cdIiPVz7R5816lW7duSjZv3jyf5mQiPjkIAAAAAAAAWIrFQQAAAAAAAMBSLA4CAAAAAAAAlmJxEAAAAAAAALAUG5IYSvfgZxGRl19+Wcleeukl7diaNdV66B7mKSLSo0cPJdu4caPH+SG0eXoQcF5eXjXPpOp0G4+IiEyePFnJJkyYoB175MgRJXv22WeV7NSpU17ODhfy9NNPB3sKIalnz56ux65YsSKAM0E46tChgzbv1auXT6+r2/whJyfHp9eEHbKzs5VMt7lBIOne13bv3l3JPD34ns2f7FarVi0l020c4ul9ps57772nzefOnatknv6+p/s5evfdd5Xsqquu0p5fXl6uZDNnztSO1W1e0rdvX+3YpUuXKtk///lPJfP0PvDkyZPaXGfXrl2uxyL06e7BjuO4Pr9///5KdsUVV2jH7t271/3EDMMnBwEAAAAAAABLsTgIAAAAAAAAWIrFQQAAAAAAAMBSLA4CAAAAAAAAlmJxEAAAAAAAALAUuxUboH379kr261//Wju2Y8eOSqbbldgTT7v3bN682fVrIPS99dZbwZ6CV3S7cHraGe6uu+5SMt1umyIiAwYM8GleQLCsXLky2FNAiPnwww+1+UUXXeT6NbZs2aJkw4cPr+qUgKCrU6eOknmzK+ayZcv8PieEnho1amjzadOmKdnDDz+sZKWlpdrz//SnPymZp07pdia+9tprtWPnzZunZFdffbWS7d+/X3v+qFGjlGzDhg3asTExMUp2/fXXa8cOHjxYyf
r06aNka9eu1Z6vk5ubq82TkpJcvwZC38KFC5Xs/vvv9+k1R44cqc3Hjx/v0+uGMz45CAAAAAAAAFiKxUEAAAAAAADAUiwOAgAAAAAAAJZicRAAAAAAAACwFBuShKjLL79cm48ZM0bJ+vfvr2QJCQk+z+HcuXNKlpeXpx2re3gzQk9ERISrLC0tTXv+Aw884O8peeXBBx/U5o8++qiSxcbGascuXbpUyYYOHerbxAAgxDVq1Eibe/Pn94IFC5Ts1KlTVZ4TEGwffPBBsKeAMOBp4wLd5iOnT59WMk8bJ+g2irruuuu0Y++9914lu/XWW7VjdRvtPPHEE0q2ePFi7fmeNvnQKS4uVrL3339fO1aXDxo0SMl+85vfuP76nv5uALPs27cv2FOwAp8cBAAAAAAAACzF4iAAAAAAAABgKRYHAQAAAAAAAEuxOAgAAAAAAABYyqvFwYyMDOnYsaM0aNBAmjZtKmlpaZKTk3PemLNnz0p6ero0atRI6tevLwMGDJCCggK/ThrwJ3oN09BpmIhewzR0Gqah0zARvYYtIhzHcdwOvuWWW+Tuu++Wjh07ynfffSePPPKI7N69W/bu3Sv16tUTEZFRo0bJO++8I5mZmRIbGytjxoyRyMhI+fjjj119jeLiYo+7jIY7TzsI63Zp0u1KLCLSqlUrf05JRES2bdumzadPn65kb731lt+/fjAUFRVJTEyMiNjV6zvvvFPJXnvtNSXT7VQtIvLCCy8o2aJFi7Rjv/76ayXztAPbPffco2TJyclK1qJFC+35hw8fVrItW7Zox86ePdv12HBia6dtsnz5cm0+cOBAJRs2bJh27JIlS/w6p0D6cadF6LU3dDtQDh8+XDvWm92KL7nkEiU7dOiQ6/PBvTrU9O7dW8neffddJfP016VmzZop2fHjx32fWBip7k6LVH+v8/LytHmTJk2UrKysTMk87bT6w/fkxy677DIvZ6eaOnWqkmVkZCiZp/f74F4dDj7//HMlu/TSS12fHxmp/5yc7mfwwIED7icWon76vlqnpjcv+NPtxzMzM6Vp06ayfft26datmxQVFcnf/vY3efXVV+Wmm24Ske/foLZt21a2bNnicWEACCZ6DdPQaZiIXsM0dBqmodMwEb2GLXx65mBRUZGIiDRs2FBERLZv3y7ffvutpKamVo5p06aNJCYmSlZWlvY1ysrKpLi4+LwDCCZ6DdPQaZiIXsM0dBqm8UenReg1Qgv3apiqyouDFRUVMn78eLnhhhukXbt2IiKSn58vUVFREhcXd97Y+Ph4yc/P175ORkaGxMbGVh4tW7as6pQAn9FrmIZOw0T0Gqah0zCNvzotQq8ROrhXw2RVXhxMT0+X3bt3y7Jly3yawKRJk6SoqKjyyM3N9en1AF/Qa5iGTsNE9BqmodMwjb86LUKvETq4V8NkXj1z8AdjxoyRNWvWyObNm8/bICAhIUHKy8ulsLDwvJXzgoICj5txREdHS3R0dFWmERLi4+O1+RVXXKFk8+bN045t06aNX+ckIpKdna3NZ82apWSrV6/WjvXmIeUmoNf/U6NGDW0+evRoJRswYIB2rO7j8a1bt/ZpXp988ok237Bhg5I99thjPn0tE9BpO+gelO/pIcsmoNf/06FDB23+4/+16Qee/kwvLy9Xsvnz52vHsvNiYNDp4NFtsgPf+bPTIsHvtadPf+k2JNHNU7fJnie6DXFERDZv3qxkq1at0o798ssvlYzNR3zHvTq07NmzR8m8uafbttbhhld/e3AcR8aMGSMrV66U9evXS1JS0nm/npKSIrVq1ZJ169ZVZjk5OXL48GHp0qWLf2YM+Bm9hmnoNExEr2EaOg3T0GmYiF7DFl59cjA9PV1effVVWb16tTRo0KDyX1FiY2OlTp06EhsbK/fdd5889NBD0rBhQ4mJiZGxY8dKly5d2KUHIYtewzR0Giai1zANnYZp6DRMRK9hC68WB59//nkREenRo8d5+eLFi2X48OEiIv
LnP/9ZIiMjZcCAAVJWVia9e/eWBQsW+GWyQCDQa5iGTsNE9BqmodMwDZ2Gieg1bOHV4qDuuUY/Vbt2bZk/f77H59UAoYZewzR0Giai1zANnYZp6DRMRK9hiyptSGKDhg0bKtkLL7ygZJ4eBh6oBxzrNmR49tlnleyDDz7Qnn/mzBm/zwnhIysrS8m2bt2qZB07dnT9mp4etOtpsx6dr7/+Wsl0u4A98MADrl8TsJmnZ9xkZmZW70QQUD9+8PmPXejB/j/11VdfKdnDDz9c1SkBYeVf//qXkuk2dOLB9Xbr1q2bNk9LS1Oya665RsmOHTumPX/RokVKdvLkSe1Y3eZRgM3++te/Ktkdd9wRhJmYw9ztDAEAAAAAAABcEIuDAAAAAAAAgKVYHAQAAAAAAAAsxeIgAAAAAAAAYCkWBwEAAAAAAABLWbVbcefOnZVswoQJ2rGdOnVSsl/84hd+n5OIyOnTp5Vszpw52rFPPfWUkpWWlvp9TjDTkSNHlKx///5Kdv/992vPnzx5sk9ff/bs2dr8+eefV7IvvvjCp68F2CIiIiLYUwCAsLR7924l279/v5Jdcskl2vMvvfRSJTt+/LjvE0NIKSkp0eYvv/yyqwyA/+3du1fJPvvsM+3Ytm3bBno6RuCTgwAAAAAAAIClWBwEAAAAAAAALMXiIAAAAAAAAGApFgcBAAAAAAAAS1m1IUm/fv1cZd7QPQhTRGTNmjVK9t1332nHPvvss0pWWFjo07wAt/Ly8pRs6tSp2rGecgCB995772nzO++8s5pnglCxb98+bf7JJ58oWdeuXQM9HcAIus3/XnrpJe3Y6dOnK9nYsWO1Yz39nQEA4L1Dhw4p2VVXXRWEmZiDTw4CAAAAAAAAlmJxEAAAAAAAALAUi4MAAAAAAACApVgcBAAAAAAAACzF4iAAAAAAAABgqQjHcZxgT+LHiouLJTY2NtjTgOGKiookJiam2r4evUag0WmYpro7LUKvEXjcq0Of7vfn9ddf145NTU1VsjfffFM79t5771Wy0tJSL2cXerhXw0Tcq2EaN53mk4MAAAAAAACApVgcBAAAAAAAACzF4iAAAAAAAABgKRYHAQAAAAAAAEvVDPYEAAAAACAUFBcXK9nAgQO1Y6dPn65ko0aN0o6dOnWqku3du9e7yQEAECB8chAAAAAAAACwFIuDAAAAAAAAgKVYHAQAAAAAAAAsxeIgAAAAAAAAYCkWBwEAAAAAAABLsVsxAAAAAHig28FYRGTs2LGuMgAAQh2fHAQAAAAAAAAsxeIgAAAAAAAAYCkWBwEAAAAAAABLhdzioOM4wZ4CLFDdPaPXCDQ6DdMEo2P0GoHGvRqm4V4NE3GvhmncdCzkFgdLSkqCPQVYoLp7Rq8RaHQapglGx+g1Ao17NUzDvRom4l4N07jpWIQTYsvUFRUVcvToUWnQoIGUlJRIy5YtJTc3V2JiYoI9Nb8pLi7muoLEcRwpKSmR5s2bS2Rk9a2N0+vwFerXRacDJ9R/76sq1K8rWJ0W+V+vHceRxMTEkP0eVVWo/95XVThcF/fqwAmH3/+qCPXr4l4dOKH+e19V4XBd3KsDJxx+/6si1K/Lm07XrKY5uRYZGSktWrQQEZGIiAgREYmJiQnJb7SvuK7giI2NrfavSa/DXyhfF50OLK6r+gWj0yL/63VxcbGIhPb3yBdcV3Bwrw4srqv6ca8OLK4rOLhXBxbXVf3cdjrk/rdiAAAAAAAAANWDxUEAAAAAAADAUiG9OBgdHS1TpkyR6OjoYE/Fr7guu5n6feK67GXq94jrspep3yOuy26mfp+4LnuZ+j3iuuxm6veJ6wp9IbchCQAAAAAAAIDqEdKfHAQAAAAAAAAQOCwOAgAAAAAAAJZicRAAAAAAAACwFIuDAAAAAAAAgKVYHAQAAAAAAAAsFdKLg/Pnz5dWrVpJ7dq1pXPnzvLvf/872FPyyubNm+WOO+6Q5s2bS0REhKxateq8X3ccRx577D
Fp1qyZ1KlTR1JTU2X//v3BmaxLGRkZ0rFjR2nQoIE0bdpU0tLSJCcn57wxZ8+elfT0dGnUqJHUr19fBgwYIAUFBUGacWih06GJXldduHdaxMxe02nfhHuvTey0CL32Rbh3WsTMXtNp34R7r03stAi99kW4d1rEzF7b0umQXRxcvny5PPTQQzJlyhTZsWOHJCcnS+/eveXYsWPBnpprpaWlkpycLPPnz9f++syZM2XOnDmycOFCyc7Olnr16knv3r3l7Nmz1TxT9zZt2iTp6emyZcsWWbt2rXz77bfSq1cvKS0trRzz4IMPyttvvy1vvPGGbNq0SY4ePSr9+/cP4qxDA50OXfS6akzotIiZvabTVWdCr03stAi9rioTOi1iZq/pdNWZ0GsTOy1Cr6vKhE6LmNlrazrthKhOnTo56enplf997tw5p3nz5k5GRkYQZ1V1IuKsXLmy8r8rKiqchIQEZ9asWZVZYWGhEx0d7bz22mtBmGHVHDt2zBERZ9OmTY7jfH8NtWrVct54443KMZ999pkjIk5WVlawphkS6HT4oNfumNZpxzG313TaPdN6bWqnHYdeu2Vapx3H3F7TafdM67WpnXYceu2WaZ12HHN7bWqnQ/KTg+Xl5bJ9+3ZJTU2tzCIjIyU1NVWysrKCODP/OXjwoOTn5593jbGxsdK5c+ewusaioiIREWnYsKGIiGzfvl2+/fbb866rTZs2kpiYGFbX5W90OryukV7/PBs6LWJOr+m0Ozb02pROi9BrN2zotIg5vabT7tjQa1M6LUKv3bCh0yLm9NrUTofk4uCJEyfk3LlzEh8ff14eHx8v+fn5QZqVf/1wHeF8jRUVFTJ+/Hi54YYbpF27diLy/XVFRUVJXFzceWPD6boCgU6HzzXSa3ds6LSIGb2m0+7Z0GsTOi1Cr92yodMiZvSaTrtnQ69N6LQIvXbLhk6LmNFrkztdM9gTQPhKT0+X3bt3y0cffRTsqQB+Q69hGjoNE9FrmIZOw0T0GqYxudMh+cnBxo0bS40aNZTdXQoKCiQhISFIs/KvH64jXK9xzJgxsmbNGtmwYYO0aNGiMk9ISJDy8nIpLCw8b3y4XFeg0OnwuEZ67Z4NnRYJ/17Tae/Y0Otw77QIvfaGDZ0WCf9e02nv2NDrcO+0CL32hg2dFgn/Xpve6ZBcHIyKipKUlBRZt25dZVZRUSHr1q2TLl26BHFm/pOUlCQJCQnnXWNxcbFkZ2eH9DU6jiNjxoyRlStXyvr16yUpKem8X09JSZFatWqdd105OTly+PDhkL6uQKPToX2N9Np7NnRaJHx7TaerxoZeh2unReh1VdjQaZHw7TWdrhobeh2unRah11VhQ6dFwrfX1nQ6iJuhXNCyZcuc6OhoJzMz09m7d68zcuRIJy4uzsnPzw/21FwrKSlxdu7c6ezcudMREee5555zdu7c6Rw6dMhxHMeZMWOGExcX56xevdr5z3/+4/Tt29dJSkpyzpw5E+SZezZq1CgnNjbW2bhxo5OXl1d5nD59unLM73//eycxMdFZv369s23bNqdLly5Oly5dgjjr0ECnQxe9rhoTOu04ZvaaTledCb02sdOOQ6+ryoROO46ZvabTVWdCr03stOPQ66oyodOOY2avbel0yC4OOo7jzJ0710lMTHSioqKcTp06OVu2bAn2lLyyYcMGR0SUY9iwYY7jfL+V96OPPurEx8c70dHRTs+ePZ2cnJzgTvpn6K5HRJzFixdXjjlz5owzevRo56KLLnLq1q3r9OvXz8nLywvepEMInQ5N9Lrqwr3TjmNmr+m0b8K91yZ22nHotS/CvdOOY2av6bRvwr3XJnbacei1L8K9045jZq9t6XSE4ziOd581BAAAAAAAAGCCkHzmIAAAAAAAAIDAY3EQAAAAAAAAsBSLgwAAAAAAAIClWBwEAAAAAAAALMXiIAAAAAAAAGApFgcBAAAAAAAAS7E4CAAAAAAAAFiKxUEAAAAAAADAUiwOAg
AAAAAAAJZicRAAAAAAAACwFIuDAAAAAAAAgKX+H9kDRDMQW71hAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 1600x400 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 5 0 4 1 9 2 1\n"
]
}
],
"source": [
"draw_examples(x_train[:7], captions=y_train)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
     "60000 training examples\n",
     "10000 test examples\n"
]
}
],
"source": [
"num_classes = 10\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype(\"float32\")\n",
"x_test = x_test.astype(\"float32\")\n",
"x_train /= 255\n",
"x_test /= 255\n",
    "print(\"{} training examples\".format(x_train.shape[0]))\n",
    "print(\"{} test examples\".format(x_test.shape[0]))\n",
"\n",
    "# convert class vectors to binary class matrices\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense (Dense) (None, 512) 401920 \n",
" \n",
" dense_1 (Dense) (None, 512) 262656 \n",
" \n",
" dense_2 (Dense) (None, 10) 5130 \n",
" \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model = keras.Sequential()\n",
"model.add(Dense(512, activation=\"relu\", input_shape=(784,)))\n",
"# model.add(Dropout(0.2))\n",
"model.add(Dense(512, activation=\"relu\"))\n",
"# model.add(Dropout(0.2))\n",
"model.add(Dense(num_classes, activation=\"softmax\"))\n",
"model.summary()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(60000, 784) (60000, 10)\n"
]
}
],
"source": [
"print(x_train.shape, y_train.shape)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/5\n",
"469/469 [==============================] - 13s 25ms/step - loss: 0.2303 - accuracy: 0.9290 - val_loss: 0.1023 - val_accuracy: 0.9684\n",
"Epoch 2/5\n",
"469/469 [==============================] - 9s 20ms/step - loss: 0.0840 - accuracy: 0.9742 - val_loss: 0.0794 - val_accuracy: 0.9754\n",
"Epoch 3/5\n",
"469/469 [==============================] - 9s 20ms/step - loss: 0.0548 - accuracy: 0.9826 - val_loss: 0.0603 - val_accuracy: 0.9828\n",
"Epoch 4/5\n",
"469/469 [==============================] - 9s 20ms/step - loss: 0.0367 - accuracy: 0.9883 - val_loss: 0.0707 - val_accuracy: 0.9796\n",
"Epoch 5/5\n",
"469/469 [==============================] - 9s 19ms/step - loss: 0.0278 - accuracy: 0.9912 - val_loss: 0.0765 - val_accuracy: 0.9785\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f8642785120>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"model.fit(\n",
" x_train,\n",
" y_train,\n",
" batch_size=128,\n",
" epochs=5,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.07645954936742783\n",
"Test accuracy: 0.9785000085830688\n"
]
}
],
"source": [
"score = model.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
    "A _dropout_ layer is a regularization technique that helps prevent overfitting. During training, a randomly chosen subset of the network's units is temporarily dropped (their outputs are set to zero)."
]
},
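  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Formally, (inverted) dropout with keep probability $p$ can be sketched as multiplying the activations by a random binary mask — a standard formulation, not taken from this notebook's code:\n",
    "\n",
    "$$ m_i \\sim \\textrm{Bernoulli}(p), \\qquad \\tilde{a}^{(l)} = \\frac{1}{p} \\, m \\odot a^{(l)}, $$\n",
    "\n",
    "where $\\odot$ denotes element-wise multiplication. At test time the mask is omitted and the full activations $a^{(l)}$ are used."
   ]
  },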
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_1\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_3 (Dense) (None, 512) 401920 \n",
" \n",
" dense_4 (Dense) (None, 512) 262656 \n",
" \n",
" dense_5 (Dense) (None, 10) 5130 \n",
" \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"Epoch 1/5\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"469/469 [==============================] - 10s 19ms/step - loss: 0.2283 - accuracy: 0.9302 - val_loss: 0.0983 - val_accuracy: 0.9685\n",
"Epoch 2/5\n",
"469/469 [==============================] - 10s 22ms/step - loss: 0.0849 - accuracy: 0.9736 - val_loss: 0.0996 - val_accuracy: 0.9673\n",
"Epoch 3/5\n",
"469/469 [==============================] - 10s 22ms/step - loss: 0.0549 - accuracy: 0.9829 - val_loss: 0.0704 - val_accuracy: 0.9777\n",
"Epoch 4/5\n",
"469/469 [==============================] - 10s 21ms/step - loss: 0.0380 - accuracy: 0.9877 - val_loss: 0.0645 - val_accuracy: 0.9797\n",
"Epoch 5/5\n",
"469/469 [==============================] - 20s 43ms/step - loss: 0.0276 - accuracy: 0.9910 - val_loss: 0.0637 - val_accuracy: 0.9825\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f86301a3f40>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "# Without Dropout layers\n",
"\n",
"num_classes = 10\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype(\"float32\")\n",
"x_test = x_test.astype(\"float32\")\n",
"x_train /= 255\n",
"x_test /= 255\n",
"\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
"\n",
"model_no_dropout = keras.Sequential()\n",
"model_no_dropout.add(Dense(512, activation=\"relu\", input_shape=(784,)))\n",
"model_no_dropout.add(Dense(512, activation=\"relu\"))\n",
"model_no_dropout.add(Dense(num_classes, activation=\"softmax\"))\n",
"model_no_dropout.summary()\n",
"\n",
"model_no_dropout.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"model_no_dropout.fit(\n",
" x_train,\n",
" y_train,\n",
" batch_size=128,\n",
" epochs=5,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss (no dropout): 0.06374581903219223\n",
"Test accuracy (no dropout): 0.9825000166893005\n"
]
}
],
"source": [
    "# Without Dropout layers\n",
"\n",
"score = model_no_dropout.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print(\"Test loss (no dropout): {}\".format(score[0]))\n",
"print(\"Test accuracy (no dropout): {}\".format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_3\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_6 (Dense) (None, 2500) 1962500 \n",
" \n",
" dense_7 (Dense) (None, 2000) 5002000 \n",
" \n",
" dense_8 (Dense) (None, 1500) 3001500 \n",
" \n",
" dense_9 (Dense) (None, 1000) 1501000 \n",
" \n",
" dense_10 (Dense) (None, 500) 500500 \n",
" \n",
" dense_11 (Dense) (None, 10) 5010 \n",
" \n",
"=================================================================\n",
"Total params: 11,972,510\n",
"Trainable params: 11,972,510\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"Epoch 1/10\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"469/469 [==============================] - 140s 294ms/step - loss: 0.6488 - accuracy: 0.8175 - val_loss: 0.2686 - val_accuracy: 0.9211\n",
"Epoch 2/10\n",
"469/469 [==============================] - 147s 313ms/step - loss: 0.2135 - accuracy: 0.9367 - val_loss: 0.2251 - val_accuracy: 0.9363\n",
"Epoch 3/10\n",
"469/469 [==============================] - 105s 224ms/step - loss: 0.1549 - accuracy: 0.9535 - val_loss: 0.1535 - val_accuracy: 0.9533\n",
"Epoch 4/10\n",
"469/469 [==============================] - 94s 200ms/step - loss: 0.1210 - accuracy: 0.9635 - val_loss: 0.1412 - val_accuracy: 0.9599\n",
"Epoch 5/10\n",
"469/469 [==============================] - 93s 199ms/step - loss: 0.0985 - accuracy: 0.9704 - val_loss: 0.1191 - val_accuracy: 0.9650\n",
"Epoch 6/10\n",
"469/469 [==============================] - 105s 224ms/step - loss: 0.0834 - accuracy: 0.9746 - val_loss: 0.0959 - val_accuracy: 0.9732\n",
"Epoch 7/10\n",
"469/469 [==============================] - 111s 236ms/step - loss: 0.0664 - accuracy: 0.9797 - val_loss: 0.1071 - val_accuracy: 0.9685\n",
"Epoch 8/10\n",
"469/469 [==============================] - 184s 392ms/step - loss: 0.0562 - accuracy: 0.9824 - val_loss: 0.0951 - val_accuracy: 0.9737\n",
"Epoch 9/10\n",
"469/469 [==============================] - 161s 344ms/step - loss: 0.0475 - accuracy: 0.9852 - val_loss: 0.1377 - val_accuracy: 0.9631\n",
"Epoch 10/10\n",
"469/469 [==============================] - 146s 311ms/step - loss: 0.0399 - accuracy: 0.9873 - val_loss: 0.1093 - val_accuracy: 0.9736\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f8640136f50>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "# More layers, a different activation function\n",
"\n",
"num_classes = 10\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype(\"float32\")\n",
"x_test = x_test.astype(\"float32\")\n",
"x_train /= 255\n",
"x_test /= 255\n",
"\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
"\n",
"model3 = keras.Sequential()\n",
"model3.add(Dense(2500, activation=\"tanh\", input_shape=(784,)))\n",
"model3.add(Dense(2000, activation=\"tanh\"))\n",
"model3.add(Dense(1500, activation=\"tanh\"))\n",
"model3.add(Dense(1000, activation=\"tanh\"))\n",
"model3.add(Dense(500, activation=\"tanh\"))\n",
"model3.add(Dense(num_classes, activation=\"softmax\"))\n",
"model3.summary()\n",
"\n",
"model3.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"model3.fit(\n",
" x_train,\n",
" y_train,\n",
" batch_size=128,\n",
" epochs=10,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.10930903255939484\n",
"Test accuracy: 0.9735999703407288\n"
]
}
],
"source": [
    "# More layers, a different activation function\n",
"\n",
"score = model3.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
    "### Example: a four-pixel camera\n",
"\n",
"https://www.youtube.com/watch?v=ILsA4nyG7I0"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"def generate_example(description):\n",
" variant = random.choice([1, -1])\n",
" if description == \"s\": # solid\n",
" return (\n",
" np.array([[1.0, 1.0], [1.0, 1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, -1.0], [-1.0, -1.0]])\n",
" )\n",
" elif description == \"v\": # vertical\n",
" return (\n",
" np.array([[1.0, -1.0], [1.0, -1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, 1.0], [-1.0, 1.0]])\n",
" )\n",
" elif description == \"d\": # diagonal\n",
" return (\n",
" np.array([[1.0, -1.0], [-1.0, 1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, 1.0], [1.0, -1.0]])\n",
" )\n",
" elif description == \"h\": # horizontal\n",
" return (\n",
" np.array([[1.0, 1.0], [-1.0, -1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, -1.0], [1.0, 1.0]])\n",
" )\n",
" else:\n",
" return np.array(\n",
" [\n",
" [random.uniform(-1, 1), random.uniform(-1, 1)],\n",
" [random.uniform(-1, 1), random.uniform(-1, 1)],\n",
" ]\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"import random\n",
"\n",
"num_classes = 4\n",
"\n",
"trainset_size = 4000\n",
"testset_size = 1000\n",
"\n",
"y4_train = np.array([random.choice([\"s\", \"v\", \"d\", \"h\"]) for i in range(trainset_size)])\n",
"x4_train = np.array([generate_example(desc) for desc in y4_train])\n",
"\n",
"y4_test = np.array([random.choice([\"s\", \"v\", \"d\", \"h\"]) for i in range(testset_size)])\n",
"x4_test = np.array([generate_example(desc) for desc in y4_test])"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABRcAAADICAYAAAByFdYYAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAbGklEQVR4nO3df2xddf0/8Fe30c7FtHNja2nY4DMRjEM2GHapJmzEmolkiQbjRANz0QGGf3REZAScEOMi8gcJmZFg5lQw/IgMjBoQO36EOQYMGwi/ksniBq6d2+SWDe10Pd8/+LassM17D7295777eCSvhHt6zr3vnvvci3tfub2nIcuyLAAAAAAAKjSh1gsAAAAAAOqT4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJBL1YaL+/fvj69+9avR3NwcU6dOja9//etx4MCB4x6zePHiaGhoGFFXXHFFtZYIFZNrUiPTpEiuSY1MkyK5JjUyzXjWkGVZVo07vuCCC2L37t1x2223xX/+859YsWJFfOITn4hf//rXxzxm8eLFcfrpp8eNN944vG3KlCnR3NxcjSVCxeSa1Mg0KZJrUiPTpEiuSY1MM55NqsadvvTSS/Hggw/G008/Heeee25ERNx6663xuc99Lm6++eZob28/5rFTpkyJtra2aiwL3he5JjUyTYrkmtTINCmSa1Ij04x3VRkubtmyJaZOnTr8jyoioqurKyZMmBBbt26NL3zhC8c89s4774w77rgj2traYunSpXH99dfHlClTjrn/wMBADAwMDN8eHByM/fv3x/Tp06OhoWF0fiGIiO7u7mhpaYlzzjlneJtcU882bdoULS0t0dbWFoODgzFhwgSZpu7p1aRGryZFejWp0atJUZZl8eabb0Z7e3tMmHD8b1WsynCxt7c3Zs6cOfKBJk2KadOmRW9v7zGP+8pXvhKnnHJKtLe3x3PPPRff/e5345VXXon77rvvmMesXbs2brjhhlFbO/wvf//73+Pkk0+OCLkmDbNmzYpdu3bFySefLNMkQ68mNXo1KdKrSY1eTYqGMn08FX3n4jXXXBM/+tGPjrvPSy+9FPfdd1/84he/iFdeeWXEz2bOnBk33HBDfPOb3yzr8TZt2hSf/vSnY/v27fHhD3/4qPu8e2pfKpVi9uzZsWvXLt9TUKaWlpZaL6GuvPHGGyPO2VjmGqrlyFzr1RTRmjVr4pZbbjnuPk8//XTce++9cdNNN+nVdaJUKtV6CTVTbqZ/+9vfxp133hmvvvpqzXo1VIteXR/06luOu49eTere3auPpqJPLl511VXxta997bj7zJkzJ9ra2mLPnj0jtv/3v/+N/fv3V/RdAgsXLoyIOO4/rKampmhqanrP9ubmZm9YqYojP2o+1rmGahnKtV5NUV177bVx+eWXH3efOXPmxNatWyNCr64X4/nff7mZfu6552Lfvn0RUbteDdWiV9cHvVqvZnwr58/tKxouzpgxI2bMmPE/9+vs7Iw33ngjtm3bFgsWLIiItyfwg4ODw/9YytHT0xMRESeddFIly4Sq+stf/hKLFi2KCLkmPTJNUZX7GqSjoyMi9GqKr5LX1e/+1JBMkwq9mqLTq6FMWZV89rOfzc4+++xs69at2RNPPJF95CMfyS6++OLhn7/22mvZGWeckW3dujXLsizbvn17duONN2bPPPNMtmPHjuyBBx7I5syZk5133nkVPW6pVMoiIiuVSqP6+6QsIlQFddZZZ9Us10pVqzZt2qRXk4ShbOnV9VGUp6urq6a9WqlqlV5dH0V59GqVapXznq1qnWLfvn3ZxRdfnH3wgx/MmpubsxUrVmRvvvnm8M937NiRRUT2yCOPZFmWZTt37szOO++8bNq0aVlTU1N22mmnZd/5zncqfuPpDWvlah3UeqsvfvGLNcu1UtUqvZ
pUDGVLr66PojxDmZVplVrp1fVRlEevVqlWOZms6IIu9aC/vz9aWlqiVCqN6++GqITL1VemFtkayjVUy1jnWq+mWmqZLb26com9DK2aWuVapqk2vbo+6NXl0atJVTmZnjBGawEAAAAAEmO4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAORiuAgAAAAA5GK4CAAAAADkYrgIAAAAAOQyJsPFdevWxamnnhqTJ0+OhQsXxlNPPXXc/e+999746Ec/GpMnT46Pf/zj8Yc//GEslgllk2lSc/vtt8s0ydGrSY1eTYr0alKjVzMuZVV21113ZY2Njdn69euzF154IVu5cmU2derUrK+v76j7b968OZs4cWJ20003ZS+++GJ23XXXZSeccEL2/PPPl/V4pVIpi4isVCqN5q+RtIhQFdT69evHNNNZ9k6ulapW1SrTejWjbShbenV9FOUZypZMq9RKr66Pojx6tUq1ynnPVvVO0dHRkV155ZXDtw8fPpy1t7dna9euPer+X/rSl7ILL7xwxLaFCxdml19+eVmP5w1r5Wod1HqrBQsWjGmms8z/MFT1a+XKlTXJtF7NaBvKll5dH0V5hrJVq16tVLVKr66Pojx6tUq1ynnPVtU/iz506FBs27Yturq6hrdNmDAhurq6YsuWLUc9ZsuWLSP2j4hYsmTJMfcfGBiI/v7+EQXV1NPTU9VMR8g1Y2/x4sXD/y3TpECvJiWHDh2KCL2a9OjVpESvZjyr6nBx7969cfjw4WhtbR2xvbW1NXp7e496TG9vb0X7r127NlpaWoZr1qxZo7N4OIZqZzpCrhl7M2fOHHFbpql3ejUp2bdvX0To1aRHryYlejXjWd1fLXr16tVRKpWGa9euXbVeErxvck1qZJoUyTWpkWlSJNekRqYpoknVvPMTTzwxJk6cGH19fSO29/X1RVtb21GPaWtrq2j/pqamaGpqGp0FQxmqnekIuWbs7dmzZ8Rtmabe6dWkZPr06RGhV5MevZqU6NWMZ1X95GJjY2MsWLAguru7h7cNDg5Gd3d3dHZ2HvWYzs7OEftHRDz88MPH3B/G2vz582Wa5Dz22GPD/y3TpECvJiWNjY0RoVeTHr2alOjVjGtlX4Iop7vuuitramrKNmzYkL344ovZZZddlk2dOjXr7e3NsizLLrnkkuyaa64Z3n/z5s3ZpEmTsptvvjl76aWXsjVr1lR0KXZXIK1cFODqQ/VU69evH9NMZ5krgKnqV60yrVcz2oaypVfXR1GeoWzJtEqt9Or6KMqjV6tUq5z3bGPSKW699dZs9uzZWWNjY9bR0ZE9+eSTwz9btGhRtnz58hH733PPPdnpp5+eNTY2ZnPnzs1+//vfl/1Y3rBWrtZBrbcqlUpjmuks8z8MVf368Y9/XJNM69WMtiOzpVcXvyjPULZq1auVqlbp1fVRlEevVqlWOe/ZGrIsyyIh/f390dLSEqVSKZqbm2u9nLrQ0NBQ6yXUlVpkayjXUC1jnWu9mmqpZbb06sol9jK0amqVa5mm2vTq+qBXl0evJlXlZLrurxYNAAAAANSG4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAA
AAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQy5gMF9etWxennnpqTJ48ORYuXBhPPfXUMffdsGFDNDQ0jKjJkyePxTKhbDJNam6//XaZJjl6NanRq0mRXk1q9GrGo6oPF+++++5YtWpVrFmzJp599tmYN29eLFmyJPbs2XPMY5qbm2P37t3D9be//a3ay4Sy/eY3v5FpknPttdfKNEnRq0mRXk1q9GpSpFczLmVV1tHRkV155ZXDtw8fPpy1t7dna9euPer+P//5z7OWlpbcj1cqlbKIyEqlUu77GG8iQlVQCxYsGNNMZ9k7uVaqWrVy5cqaZFqvZrQNZUuvro+iPEPZqlWvVqpapVfXR1EevVqlWuW8Z5sUVXTo0KHYtm1brF69enjbhAkToqurK7Zs2XLM4w4cOBCnnHJKDA4OxjnnnBM//OEPY+7cuUfdd2BgIAYGBoZv9/f3R0RES0vLKP0W6cuyrNZLqAv9/f3R0tISPT09cd111w1vH+1MRxw716VSKZqbm0fht4G37d27N2bMmBGLFy8e3jaWmdarqRa9uj40NDTUegl1pVa9WqYZbV5X1xe9ujJ6NakY6tXlqOqfRe/duzcOHz4cra2tI7a3trZGb2/vUY8544wzYv369fHAAw/EHXfcEYODg/HJT34yXnvttaPuv3bt2mhpaRmuWbNmjfrvAUeqdqYj5Jqxs2/fvoiImDlz5ojtMk2906tJkV5NavRqUqRXMx4V7mrRnZ2dcemll8b8+fNj0aJFcd9998WMGTPitttuO+r+q1evjlKpNFy7du0a4xXD8VWa6Qi5pthkmhTJNamRaVIk16RGpklFVf8s+sQTT4yJEydGX1/fiO19fX3R1tZW1n2ccMIJcfbZZ8f27duP+vOmpqZoamp632uFclU70xFyzdiZPn16RMR7vmRapql3ejUp0qtJjV5NivRqxqOqfnKxsbExFixYEN3d3cPbBgcHo7u7Ozo7O8u6j8OHD8fzzz8fJ510UrWWCRWZP3++TJOMxsbGiIh47LHHhrfJNCnQq0mRXk1q9GpSpFczLr2vyxKV4a677sqampqyDRs2ZC+++GJ22WWXZVOnTs16e3uzLMuySy65JLvmmmuG97/hhhuyhx56KPvrX/+abdu2Lfvyl7+cTZ48OXvhhRfKejxXSqq8KM9QttavXz+mmT7ysV1Zl9E2lK1aZVqpapVeXR9qnZN6K5kmFV5X15da9756K5kmFZVkq6p/Fh0RsWzZsvjHP/4R3/ve96K3tzfmz58fDz744PAX9+7cuTMmTHjnA5T//Oc/Y+XKldHb2xsf+tCHYsGCBfHnP/85Pvaxj1V7qVCWiy66KA4ePCjTJOUHP/iBTJMUvZoU6dWkRq8mRXo141FDlmVZrRcxmiq5VDZvSywCVTOUrVKpFM3NzePmsUlbrbKlV1NtenV9aGhoqPUS6kqterVMM9q8rq4venVl9GpSUUm2Cne1aAAAAACgPhguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5GC4CAAAAALkYLgIAAAAAuRguAgAAAAC5VHW4+Pjjj8fSpUujvb09Ghoa4v777/+fxzz66KNxzjnnRFNTU5x22mmxYcOGai4RKrZ582a5JjnLli2TaZIi06RIrkmNTJMiuWY8qupw8eDBgzFv3rxYt25dWfvv2LEjLrzwwjj//POjp6cnvvWtb8U3vvGNeOihh6q5TKjIW2+9Jdck58wzz5RpkiLTpEiuSY1MkyK5ZlzKxkhEZBs3bjzuPldffXU2d+7cEd
uWLVuWLVmypOzHKZVKWUSoCoryDGWrVCoNbxvrXB/52DAa3p0tvVqlUmOd6SNzrVeXr9Y5qbeqVa+WaUZbrV5/HO2x+d9q3fvqrfRqUlFJtgr1nYtbtmyJrq6uEduWLFkSW7ZsOeYxAwMD0d/fP6KgSOSa1Mg0qcmT6Qi5ptj0alKjV5MivZpUFGq42NvbG62trSO2tba2Rn9/f/zrX/866jFr166NlpaW4Zo1a9ZYLBXKJtekRqZJTZ5MR8g1xaZXkxq9mhTp1aSiUMPFPFavXh2lUmm4du3aVeslwfsm16RGpkmRXJMamSZFck1qZJoimlTrBRypra0t+vr6Rmzr6+uL5ubm+MAHPnDUY5qamqKpqWkslge5yDWpkWlSkyfTEXJNsenVpEavJkV6Nako1CcXOzs7o7u7e8S2hx9+ODo7O2u0Inj/5JrUyDSpkWlSJNekRqZJkVyTiqoOFw8cOBA9PT3R09MTEW9fZr2npyd27twZEW9/nPfSSy8d3v+KK66IV199Na6++up4+eWX4yc/+Uncc8898e1vf7uay4SKyDUpeu6552SapMg0KZJrUiPTpEiuGZeqednqRx555KiXZl++fHmWZVm2fPnybNGiRe85Zv78+VljY2M2Z86c7Oc//3lFjzl0qWxVflGeoWz97ne/q1muy7kEPFTieD1Tr1apVTUzfWSu9ery1ToTKZTXH9SjWr3+OPKx5bp8te5zKZReTT2qJFsNWZZlkZD+/v5oaWmp9TLqSmIRqJqhbJVKpWhubh43j03aapUtvZpq06vrQ0NDQ62XUFdq1atlmtHmdXV90asro1eTikqyVajvXAQAAAAA6ofhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJBLVYeLjz/+eCxdujTa29ujoaEh7r///uPu/+ijj0ZDQ8N7qre3t5rLhIps3rxZrknOsmXLZJqkyDQpkmtSI9OkSK4Zj6o6XDx48GDMmzcv1q1bV9Fxr7zySuzevXu4Zs6cWaUVQuXeeustuSY5Z555pkyTFJkmRXJNamSaFMk149Gkat75BRdcEBdccEHFx82cOTOmTp06+guCUfCZz3wmLrroooqPk2uK7Prrr4/m5uaKjpFpikymSZFckxqZJkVyzXhU1eFiXvPnz4+BgYE488wz4/vf/3586lOfOua+AwMDMTAwMHy7VCqNxRKT0t/fX+sl1IWh85RlWa7jRyPXnitG2/vJtV5NkVU70xF6NWOvVr1aphltY/X6I0KuGXt6NamoqFdnYyQiso0bNx53n5dffjn76U9/mj3zzDPZ5s2bsxUrVmSTJk3Ktm3bdsxj1qxZk0WEUmNWu3btkmuVXA3lOkKmVRpV7UzLtapF6dUqtdKrVYqlV6vU6sgZyLE0/P/QV11DQ0Ns3LgxPv/5z1d03KJFi2L27Nnxq1/96qg/f/fUfnBwMPbv3x/Tp0+PhoaG97PkUdXf3x+zZs2KXbt2VfwR6fGoiOcry7J48803o729PSZMePvrSsdzrov4HBVZUc/Xu3M9njMdUdznqaiKeL7GKtMR9ZHrIj5HRVbU86VXj1TU56mIinqu9OqRivo8FVVRz5dePVJRn6ciKuq5OtoM5FgK+WfRR+ro6IgnnnjimD9vamqKpqamEduK/F0Fzc3NhQpL0RXtfLW0tIzK/aSU66I9R0VXxPM1GrlOKdMRxXyeiqxo52ssMh1RX7ku2nNUdEU8X3r1exXxeSqqIp4rvfq9ivg8FVkRz5
de/V5FfJ6KqojnqtxMV/Vq0aOhp6cnTjrppFovA0aVXJMamSY1Mk2K5JrUyDQpkmvqUVU/uXjgwIHYvn378O0dO3ZET09PTJs2LWbPnh2rV6+O119/PX75y19GRMQtt9wS//d//xdz586Nf//73/Gzn/0sNm3aFH/84x+ruUyoiFyTGpkmNTJNiuSa1Mg0KZJrxquqDhefeeaZOP/884dvr1q1KiIili9fHhs2bIjdu3fHzp07h39+6NChuOqqq+L111+PKVOmxFlnnRV/+tOfRtxHvWpqaoo1a9a85+PLHF2Rz5dcv63Iz1ERFfl8yfQ7ivw8FVFRz5dMv6Ooz1FRFfl8yfU7ivw8FU2Rz5VMv6PIz1MRFfl8yfU7ivw8FU0K52rMLugCAAAAAKSl8N+5CAAAAAAUk+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+HiGFm3bl2ceuqpMXny5Fi4cGE89dRTtV5SIT3++OOxdOnSaG9vj4aGhrj//vtrvSSOQabLI9P1Q6bLJ9f1Q67LI9P1Q6bLJ9f1Q67LI9P1Q6bLl0quDRfHwN133x2rVq2KNWvWxLPPPhvz5s2LJUuWxJ49e2q9tMI5ePBgzJs3L9atW1frpXAcMl0+ma4PMl0Zua4Pcl0+ma4PMl0Zua4Pcl0+ma4PMl2ZZHKdUXUdHR3ZlVdeOXz78OHDWXt7e7Z27doarqr4IiLbuHFjrZfBUch0PjJdXDKdn1wXl1znI9PFJdP5yXVxyXU+Ml1cMp1fPefaJxer7NChQ7Ft27bo6uoa3jZhwoTo6uqKLVu21HBlkI9MkxqZJkVyTWpkmhTJNamR6fHLcLHK9u7dG4cPH47W1tYR21tbW6O3t7dGq4L8ZJrUyDQpkmtSI9OkSK5JjUyPX4aLAAAAAEAuhotVduKJJ8bEiROjr69vxPa+vr5oa2ur0aogP5kmNTJNiuSa1Mg0KZJrUiPT45fhYpU1NjbGggULoru7e3jb4OBgdHd3R2dnZw1XBvnINKmRaVIk16RGpkmRXJMamR6/JtV6AePBqlWrYvny5XHuuedGR0dH3HLLLXHw4MFYsWJFrZdWOAcOHIjt27cP396xY0f09PTEtGnTYvbs2TVcGUeS6fLJdH2Q6crIdX2Q6/LJdH2Q6crIdX2Q6/LJdH2Q6cokk+taX656vLj11luz2bNnZ42NjVlHR0f25JNP1npJhfTII49kEfGeWr58ea2XxrvIdHlkun7IdPnkun7IdXlkun7IdPnkun7IdXlkun7IdPlSyXVDlmVZleeXAAAAAECCfOciAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQi+EiAAAAAJCL4SIAAAAAkIvhIgAAAACQy/8DtjV1ROlkKSAAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 1600x400 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" d h h d h d h\n"
]
}
],
"source": [
"draw_examples(x4_train[:7], captions=y4_train)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"x4_train = x4_train.reshape(trainset_size, 4)\n",
"x4_test = x4_test.reshape(testset_size, 4)\n",
"x4_train = x4_train.astype(\"float32\")\n",
"x4_test = x4_test.astype(\"float32\")\n",
"\n",
"y4_train = np.array([{\"s\": 0, \"v\": 1, \"d\": 2, \"h\": 3}[desc] for desc in y4_train])\n",
"y4_test = np.array([{\"s\": 0, \"v\": 1, \"d\": 2, \"h\": 3}[desc] for desc in y4_test])\n",
"\n",
"y4_train = keras.utils.to_categorical(y4_train, num_classes)\n",
"y4_test = keras.utils.to_categorical(y4_test, num_classes)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_4\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_12 (Dense) (None, 4) 20 \n",
" \n",
" dense_13 (Dense) (None, 4) 20 \n",
" \n",
" dense_14 (Dense) (None, 8) 40 \n",
" \n",
" dense_15 (Dense) (None, 4) 36 \n",
" \n",
"=================================================================\n",
"Total params: 116\n",
"Trainable params: 116\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model4 = keras.Sequential()\n",
"model4.add(Dense(4, activation=\"tanh\", input_shape=(4,)))\n",
"model4.add(Dense(4, activation=\"tanh\"))\n",
"model4.add(Dense(8, activation=\"relu\"))\n",
"model4.add(Dense(num_classes, activation=\"softmax\"))\n",
"model4.summary()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"model4.layers[0].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, 0.0, 1.0, 0.0],\n",
" [0.0, 1.0, 0.0, 1.0],\n",
" [1.0, 0.0, -1.0, 0.0],\n",
" [0.0, 1.0, 0.0, -1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")\n",
"model4.layers[1].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, -1.0, 0.0, 0.0],\n",
" [1.0, 1.0, 0.0, 0.0],\n",
" [0.0, 0.0, 1.0, -1.0],\n",
" [0.0, 0.0, -1.0, -1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")\n",
"model4.layers[2].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],\n",
" [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0],\n",
" [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0],\n",
" [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, -1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"model4.layers[3].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, 0.0, 0.0, 0.0],\n",
" [1.0, 0.0, 0.0, 0.0],\n",
" [0.0, 1.0, 0.0, 0.0],\n",
" [0.0, 1.0, 0.0, 0.0],\n",
" [0.0, 0.0, 1.0, 0.0],\n",
" [0.0, 0.0, 1.0, 0.0],\n",
" [0.0, 0.0, 0.0, 1.0],\n",
" [0.0, 0.0, 0.0, 1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")\n",
"\n",
"model4.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.Adagrad(),\n",
" metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array([[ 1., 0., 1., 0.],\n",
" [ 0., 1., 0., 1.],\n",
" [ 1., 0., -1., 0.],\n",
" [ 0., 1., 0., -1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., -1., 0., 0.],\n",
" [ 1., 1., 0., 0.],\n",
" [ 0., 0., 1., -1.],\n",
" [ 0., 0., -1., -1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., -1., 0., 0., 0., 0., 0., 0.],\n",
" [ 0., 0., 1., -1., 0., 0., 0., 0.],\n",
" [ 0., 0., 0., 0., 1., -1., 0., 0.],\n",
" [ 0., 0., 0., 0., 0., 0., 1., -1.]], dtype=float32), array([0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]\n",
"[array([[1., 0., 0., 0.],\n",
" [1., 0., 0., 0.],\n",
" [0., 1., 0., 0.],\n",
" [0., 1., 0., 0.],\n",
" [0., 0., 1., 0.],\n",
" [0., 0., 1., 0.],\n",
" [0., 0., 0., 1.],\n",
" [0., 0., 0., 1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n"
]
}
],
"source": [
"for layer in model4.layers:\n",
" print(layer.get_weights())"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1/1 [==============================] - 1s 872ms/step\n"
]
},
{
"data": {
"text/plain": [
"array([[0.17831734, 0.17831734, 0.17831734, 0.465048 ]], dtype=float32)"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "model4.predict(np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4))"
]
},
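  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "Powyższą predykcję można odtworzyć „ręcznie”, wykonując krok _feedforward_ w samym NumPy. Poniższy szkic zakłada te same macierze wag, które ustawiliśmy wyżej (nazwy `W1``W4` są tu umowne), oraz zerowe wektory obciążeń:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def softmax(z):\n",
    "    e = np.exp(z - z.max())  # odjęcie maksimum stabilizuje obliczenia\n",
    "    return e / e.sum()\n",
    "\n",
    "W1 = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, -1, 0], [0, 1, 0, -1]], dtype=float)\n",
    "W2 = np.array([[1, -1, 0, 0], [1, 1, 0, 0], [0, 0, 1, -1], [0, 0, -1, -1]], dtype=float)\n",
    "W3 = np.array(\n",
    "    [\n",
    "        [1, -1, 0, 0, 0, 0, 0, 0],\n",
    "        [0, 0, 1, -1, 0, 0, 0, 0],\n",
    "        [0, 0, 0, 0, 1, -1, 0, 0],\n",
    "        [0, 0, 0, 0, 0, 0, 1, -1],\n",
    "    ],\n",
    "    dtype=float,\n",
    ")\n",
    "W4 = np.repeat(np.eye(4), 2, axis=0)  # pary neuronów ReLU sumują się do jednej klasy\n",
    "\n",
    "x = np.array([1.0, 1.0, -1.0, -1.0])\n",
    "a1 = np.tanh(x @ W1)\n",
    "a2 = np.tanh(a1 @ W2)\n",
    "a3 = np.maximum(a2 @ W3, 0)  # ReLU\n",
    "y = softmax(a3 @ W4)\n",
    "print(y)  # ≈ [0.178 0.178 0.178 0.465] zgodnie z wynikiem model4.predict\n",
    "```"
   ]
  },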
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.7656148672103882\n",
"Test accuracy: 1.0\n"
]
}
],
"source": [
"score = model4.evaluate(x4_test, y4_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_5\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_16 (Dense) (None, 4) 20 \n",
" \n",
" dense_17 (Dense) (None, 4) 20 \n",
" \n",
" dense_18 (Dense) (None, 8) 40 \n",
" \n",
" dense_19 (Dense) (None, 4) 36 \n",
" \n",
"=================================================================\n",
"Total params: 116\n",
"Trainable params: 116\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model5 = keras.Sequential()\n",
"model5.add(Dense(4, activation=\"tanh\", input_shape=(4,)))\n",
"model5.add(Dense(4, activation=\"tanh\"))\n",
"model5.add(Dense(8, activation=\"relu\"))\n",
"model5.add(Dense(num_classes, activation=\"softmax\"))\n",
"model5.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model5.summary()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/8\n",
"125/125 [==============================] - 3s 8ms/step - loss: 1.3014 - accuracy: 0.4947 - val_loss: 1.1876 - val_accuracy: 0.6040\n",
"Epoch 2/8\n",
"125/125 [==============================] - 1s 6ms/step - loss: 1.0779 - accuracy: 0.7395 - val_loss: 0.9865 - val_accuracy: 0.8730\n",
"Epoch 3/8\n",
"125/125 [==============================] - 1s 4ms/step - loss: 0.8925 - accuracy: 0.8382 - val_loss: 0.8114 - val_accuracy: 0.7460\n",
"Epoch 4/8\n",
"125/125 [==============================] - 0s 4ms/step - loss: 0.7266 - accuracy: 0.8060 - val_loss: 0.6622 - val_accuracy: 0.8730\n",
"Epoch 5/8\n",
"125/125 [==============================] - 0s 4ms/step - loss: 0.5890 - accuracy: 0.8765 - val_loss: 0.5392 - val_accuracy: 0.8730\n",
"Epoch 6/8\n",
"125/125 [==============================] - 1s 4ms/step - loss: 0.4738 - accuracy: 0.8838 - val_loss: 0.4293 - val_accuracy: 0.8730\n",
"Epoch 7/8\n",
"125/125 [==============================] - 1s 5ms/step - loss: 0.3636 - accuracy: 0.9337 - val_loss: 0.3191 - val_accuracy: 1.0000\n",
"Epoch 8/8\n",
"125/125 [==============================] - 1s 5ms/step - loss: 0.2606 - accuracy: 1.0000 - val_loss: 0.2202 - val_accuracy: 1.0000\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f860a6a9870>"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model5.fit(x4_train, y4_train, epochs=8, validation_data=(x4_test, y4_test))"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1/1 [==============================] - 0s 106ms/step\n"
]
},
{
"data": {
"text/plain": [
"array([[1.5366691e-01, 4.4674356e-04, 4.7448810e-02, 7.9843748e-01]],\n",
" dtype=float32)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "model5.predict(np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4))"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.22015966475009918\n",
"Test accuracy: 1.0\n"
]
}
],
"source": [
"score = model5.evaluate(x4_test, y4_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import contextlib\n",
"\n",
"\n",
"@contextlib.contextmanager\n",
"def printoptions(*args, **kwargs):\n",
" original = np.get_printoptions()\n",
" np.set_printoptions(*args, **kwargs)\n",
" try:\n",
" yield\n",
" finally:\n",
" np.set_printoptions(**original)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array([[-0.8, 0.1, -0.6, 0.1],\n",
" [-0.9, -0.7, -1. , 0.6],\n",
" [-0.3, 0.5, 0.5, 0.3],\n",
" [ 0.4, 0.3, -0.9, -0.8]], dtype=float32), array([ 0., -0., 0., 0.], dtype=float32)]\n",
"[array([[-1.1, 1.2, -0.6, -0.6],\n",
" [-1.1, -0.2, -0.7, -1.3],\n",
" [ 0.6, 0.9, 0.3, -1.3],\n",
" [ 0.8, 0.3, 0.7, 0.4]], dtype=float32), array([ 0.3, 0.5, -0.4, 0.5], dtype=float32)]\n",
"[array([[ 0.5, 0.4, -0.4, 0.3, 0.8, -1.4, -1.1, 0.8],\n",
" [ 0.5, -1.3, 0.3, 0.4, -1.3, 0.2, 0.9, 0.7],\n",
" [-0.2, -0.1, -0.5, -0.2, 1.2, -0.4, -0.4, 1.1],\n",
" [-1.1, 0.4, 1.3, -1.1, 1. , -1.1, -0.8, 0.3]], dtype=float32), array([ 0.2, 0.2, 0.1, 0.1, 0.2, 0.1, -0.2, 0. ], dtype=float32)]\n",
"[array([[ 0.7, 0.8, -1.5, -0.2],\n",
" [ 0.7, -0.9, -1.2, 0.2],\n",
" [-0.4, 1.1, -0.1, -1.6],\n",
" [ 0.3, 0.8, -1.4, 0.4],\n",
" [ 0.2, -1.4, -0.3, 0.5],\n",
" [-0.2, -1.2, 0.6, 0.7],\n",
" [-0.1, -1.5, 0.3, -0.1],\n",
" [-1.4, 0.1, 1.2, -0. ]], dtype=float32), array([-0.2, 0.5, 0.5, -0.5], dtype=float32)]\n"
]
}
],
"source": [
"with printoptions(precision=1, suppress=True):\n",
" for layer in model5.layers:\n",
" print(layer.get_weights())"
]
}
],
"metadata": {
"author": "Paweł Skórzewski",
"celltoolbar": "Slideshow",
"email": "pawel.skorzewski@amu.edu.pl",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"lang": "pl",
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"livereveal": {
"start_slideshow_at": "selected",
"theme": "white"
},
 "subtitle": "12. Sieci neuronowe  propagacja wsteczna [wykład]",
"title": "Uczenie maszynowe",
"vscode": {
"interpreter": {
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
}
},
"year": "2021"
},
"nbformat": 4,
"nbformat_minor": 4
}