{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# 12. Neural networks: backpropagation"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.1. The backpropagation method: introduction"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"nn1.png\" alt=\"Fig. 12.1. A multi-layer neural network\" style=\"height: 100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Neural network architecture\n",
"\n",
"* Layered structure; most often feedforward, densely connected networks.\n",
"* The number and size of the layers are chosen for each problem.\n",
"* Network size is described by the number of neurons or the number of parameters."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### _Feedforward_\n",
"\n",
"Given an $L$-layer neural network with parameters $\\Theta^{(1)}, \\ldots, \\Theta^{(L)} $ and $\\beta^{(1)}, \\ldots, \\beta^{(L)} $, we compute:\n",
"\n",
"$$a^{(l)} = g^{(l)}\\left( a^{(l-1)} \\Theta^{(l)} + \\beta^{(l)} \\right). $$"
]
},
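{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The forward pass above can be sketched in a few lines of NumPy (a minimal illustration with made-up layer sizes and weights; not part of the original lecture code):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def feedforward(x, Thetas, betas, gs):\n",
"    # x: a row vector of features, shape (1, n_features)\n",
"    a = x\n",
"    for Theta, beta, g in zip(Thetas, betas, gs):\n",
"        a = g(a @ Theta + beta)  # a^(l) = g^(l)(a^(l-1) Theta^(l) + beta^(l))\n",
"    return a\n",
"\n",
"\n",
"rng = np.random.default_rng(0)\n",
"Thetas = [rng.normal(size=(3, 4)), rng.normal(size=(4, 2))]\n",
"betas = [np.zeros((1, 4)), np.zeros((1, 2))]\n",
"gs = [np.tanh, np.tanh]\n",
"print(feedforward(np.array([[1.0, 2.0, 3.0]]), Thetas, betas, gs).shape)  # (1, 2)"
]
},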
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"nn2.png\" alt=\"Fig. 12.2. A multi-layer neural network: feedforward\" style=\"height:100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The functions $g^{(l)}$ are the **activation functions**.<br/>\n",
"For $l = 0$ we take $a^{(0)} = x$ (the row vector of features) and $g^{(0)}(x) = x$ (the identity)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* The parameters $\\Theta$ are the weights on the connections between the neurons of two consecutive layers.<br/>\n",
"The matrix $\\Theta^{(l)}$ of weights on the connections between layers $a^{(l-1)}$ and $a^{(l)}$ has size $\\dim(a^{(l-1)}) \\times \\dim(a^{(l)})$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The parameters $\\beta$ play the role of the column of ones appended to the feature matrix.<br/>The matrix $\\beta^{(l)}$ has as many entries as there are neurons in the corresponding layer, i.e. size $1 \\times \\dim(a^{(l)})$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* **Classification**: for the last layer $L$ (whose size equals the number of classes) one takes $g^{(L)}(x) = \\mathop{\\mathrm{softmax}}(x)$.\n",
"* **Regression**: a single output neuron; its activation function can then be e.g. the identity."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### How do we train neural networks?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* In the algorithms covered so far (linear regression, logistic regression), training used a cost function, its gradient, and the gradient descent algorithm (GD/SGD)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* For neural networks we likewise need the gradient of the cost function."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* This reduces to a more general problem:<br/>how do we compute the gradient $\\nabla f(x)$ for a given function $f$ and input vector $x$?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Derivative of a function\n",
"\n",
"* The **derivative** measures how fast the value of a function changes as its argument changes:\n",
"\n",
"$$ \\frac{d f(x)}{d x} = \\lim_{h \\to 0} \\frac{ f(x + h) - f(x) }{ h } $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Partial derivatives and the gradient\n",
"\n",
"* A **partial derivative** measures how fast the value of a function changes with a change of a *single argument*."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* The **gradient** is the vector of partial derivatives:\n",
"\n",
"$$ \\nabla f = \\left( \\frac{\\partial f}{\\partial x_1}, \\ldots, \\frac{\\partial f}{\\partial x_n} \\right) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Gradient: examples"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ f(x_1, x_2) = x_1 + x_2 \\qquad \\to \\qquad \\frac{\\partial f}{\\partial x_1} = 1, \\quad \\frac{\\partial f}{\\partial x_2} = 1, \\quad \\nabla f = (1, 1) $$ "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ f(x_1, x_2) = x_1 \\cdot x_2 \\qquad \\to \\qquad \\frac{\\partial f}{\\partial x_1} = x_2, \\quad \\frac{\\partial f}{\\partial x_2} = x_1, \\quad \\nabla f = (x_2, x_1) $$ "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"$$ f(x_1, x_2) = \\max(x_1, x_2) \\hskip{12em} \\\\\n",
"\\to \\qquad \\frac{\\partial f}{\\partial x_1} = \\mathbb{1}_{x_1 \\geq x_2}, \\quad \\frac{\\partial f}{\\partial x_2} = \\mathbb{1}_{x_2 \\geq x_1}, \\quad \\nabla f = (\\mathbb{1}_{x_1 \\geq x_2}, \\mathbb{1}_{x_2 \\geq x_1}) $$ "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Properties of partial derivatives\n",
"\n",
"If $f(x, y, z) = (x + y) \\, z$ and $x + y = q$, then:\n",
"$$f = q z,\n",
"\\quad \\frac{\\partial f}{\\partial q} = z,\n",
"\\quad \\frac{\\partial f}{\\partial z} = q,\n",
"\\quad \\frac{\\partial q}{\\partial x} = 1,\n",
"\\quad \\frac{\\partial q}{\\partial y} = 1 $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### The chain rule\n",
"\n",
"$$ \\frac{\\partial f}{\\partial x} = \\frac{\\partial f}{\\partial q} \\, \\frac{\\partial q}{\\partial x},\n",
"\\quad \\frac{\\partial f}{\\partial y} = \\frac{\\partial f}{\\partial q} \\, \\frac{\\partial q}{\\partial y} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Backpropagation: a simple example"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# For a fixed input\n",
"x = -2\n",
"y = 5\n",
"z = -4"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3 -12\n"
]
}
],
"source": [
"# Forward pass\n",
"q = x + y\n",
"f = q * z\n",
"print(q, f)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-4, -4, 3]\n"
]
}
],
"source": [
"# Backward pass for f = q * z\n",
"# Let `dfx`, `dfy`, `dfz`, `dfq` denote the partial derivatives\n",
"# ∂f/∂x, ∂f/∂y, ∂f/∂z, ∂f/∂q, respectively\n",
"dfz = q\n",
"dfq = z\n",
"# Backward pass for q = x + y\n",
"dfx = 1 * dfq # by the chain rule\n",
"dfy = 1 * dfq # by the chain rule\n",
"print([dfx, dfy, dfz])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"exp1.png\" alt=\"Fig. 12.3. Backpropagation: example 1\" style=\"height:100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* This is exactly how derivatives are computed by backpropagation!"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Let's try something more complicated:<br/>let's use backpropagation to compute the derivative of the sigmoid function."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Backpropagation: the sigmoid function"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The sigmoid function:\n",
"\n",
"$$f(\\theta,x) = \\frac{1}{1+e^{-(\\theta_0 x_0 + \\theta_1 x_1 + \\theta_2)}}$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"$$\n",
"\\begin{array}{lcl}\n",
"f(x) = \\frac{1}{x} \\quad & \\rightarrow & \\quad \\frac{df}{dx} = -\\frac{1}{x^2} \\\\\n",
"f_c(x) = c + x \\quad & \\rightarrow & \\quad \\frac{df}{dx} = 1 \\\\\n",
"f(x) = e^x \\quad & \\rightarrow & \\quad \\frac{df}{dx} = e^x \\\\\n",
"f_a(x) = ax \\quad & \\rightarrow & \\quad \\frac{df}{dx} = a \\\\\n",
"\\end{array}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"exp2.png\" alt=\"Fig. 12.4. Backpropagation: example 2\" style=\"height:100%\"/>"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0.3932238664829637, -0.5898357997244456]\n",
"[-0.19661193324148185, -0.3932238664829637, 0.19661193324148185]\n"
]
}
],
"source": [
"from math import exp\n",
"\n",
"\n",
"# Example weights and data\n",
"w = [2, -3, -3]\n",
"x = [-1, -2]\n",
"\n",
"# Forward pass\n",
"dot = w[0] * x[0] + w[1] * x[1] + w[2]\n",
"f = 1.0 / (1 + exp(-dot)) # sigmoid function\n",
"\n",
"# Backward pass\n",
"ddot = (1 - f) * f # derivative of the sigmoid function\n",
"dx = [w[0] * ddot, w[1] * ddot]\n",
"dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]\n",
"\n",
"print(dx)\n",
"print(dw)"
]
},
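{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The gradients computed above can be sanity-checked with finite differences (an added check, not part of the original example; it reuses `w`, `x`, `dw`, and `exp` from the cell above, and the step size is an arbitrary choice):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Numerical gradient check for the sigmoid example above\n",
"eps = 1e-6\n",
"\n",
"\n",
"def sigmoid_net(w, x):\n",
"    return 1.0 / (1 + exp(-(w[0] * x[0] + w[1] * x[1] + w[2])))\n",
"\n",
"\n",
"for i in range(3):\n",
"    w_eps = w[:i] + [w[i] + eps] + w[i + 1:]  # perturb the i-th weight only\n",
"    num_grad = (sigmoid_net(w_eps, x) - sigmoid_net(w, x)) / eps\n",
"    print(i, num_grad, dw[i])  # the two values should agree closely"
]
},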
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Computing gradients: a summary"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The gradient of $f$ with respect to $x$ tells us how the whole expression changes as the value of $x$ changes."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Gradients are combined using the **chain rule**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* In the backward pass, the gradients tell us which parts of the graph should be increased or decreased (and how strongly) in order to increase the value at the output."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* For the implementation, we want to split the function $f$ into parts whose gradients are easy to compute."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.2. Training multi-layer neural networks with backpropagation"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Given the SGD algorithm and the gradients of all the weights, we could train any network."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* Let $\\Theta = (\\Theta^{(1)},\\Theta^{(2)},\\Theta^{(3)},\\beta^{(1)},\\beta^{(2)},\\beta^{(3)})$\n",
"* The function computed by the network in the figure:\n",
"$$\\small h_\\Theta(x) = \\tanh(\\tanh(\\tanh(x\\Theta^{(1)}+\\beta^{(1)})\\Theta^{(2)} + \\beta^{(2)})\\Theta^{(3)} + \\beta^{(3)})$$\n",
"* The cost function for regression:\n",
"$$J(\\Theta) = \\dfrac{1}{2m} \\sum_{i=1}^{m} (h_\\Theta(x^{(i)})- y^{(i)})^2 $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* How do we compute the gradients?\n",
"\n",
"$$\\nabla_{\\Theta^{(l)}} J(\\Theta) = ? \\quad \\nabla_{\\beta^{(l)}} J(\\Theta) = ?$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Towards backpropagation\n",
"\n",
"* A small change $\\Delta z^l_j$ in the weighted input of the $j$-th neuron in layer $l$ causes a (small) change in the cost: \n",
"\n",
"$$\\frac{\\partial J(\\Theta)}{\\partial z^{l}_j} \\Delta z^{l}_j$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* If $\\frac{\\partial J(\\Theta)}{\\partial z^{l}_j}$ is large, a change $\\Delta z^l_j$ of the opposite sign will reduce the cost.\n",
"* If $\\frac{\\partial J(\\Theta)}{\\partial z^l_j}$ is close to zero, changing $z^l_j$ will not improve the cost much."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* We define the error $\\delta^l_j$ of neuron $j$ in layer $l$: \n",
"\n",
"$$\\delta^l_j := \\dfrac{\\partial J(\\Theta)}{\\partial z^l_j}$$ \n",
"$$\\delta^l := \\nabla_{z^l} J(\\Theta) \\quad \\textrm{ (vector notation)} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### The basic backpropagation equations\n",
"\n",
"$$\n",
"\\begin{array}{rcll}\n",
"\\delta^L & = & \\nabla_{a^L}J(\\Theta) \\odot { \\left( g^{L} \\right) }^{\\prime} \\left( z^L \\right) & (BP1) \\\\[2mm]\n",
"\\delta^{l} & = & \\left( \\delta^{l+1} \\left( \\Theta^{l+1} \\right) \\! ^\\top \\right) \\odot {{ \\left( g^{l} \\right) }^{\\prime}} \\left( z^{l} \\right) & (BP2)\\\\[2mm]\n",
"\\nabla_{\\beta^l} J(\\Theta) & = & \\delta^l & (BP3)\\\\[2mm]\n",
"\\nabla_{\\Theta^l} J(\\Theta) & = & \\left( a^{l-1} \\right) \\! ^\\top \\delta^l & (BP4)\\\\\n",
"\\end{array}\n",
"$$\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP1)\n",
"$$ \\delta^L_j \\; = \\; \\frac{ \\partial J }{ \\partial a^L_j } \\, g' \\!\\! \\left( z^L_j \\right) $$\n",
"$$ \\delta^L \\; = \\; \\nabla_{a^L}J(\\Theta) \\odot { \\left( g^{L} \\right) }^{\\prime} \\left( z^L \\right) $$\n",
"The error in the last layer is the product of the rate of change of the cost with respect to the $j$-th output and the rate of change of the activation function at the point $z^L_j$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP2)\n",
"$$ \\delta^{l} \\; = \\; \\left( \\delta^{l+1} \\left( \\Theta^{l+1} \\right) \\! ^\\top \\right) \\odot {{ \\left( g^{l} \\right) }^{\\prime}} \\left( z^{l} \\right) $$\n",
"To obtain the error in layer $l$, multiply the error from the next (the $(l+1)$-st) layer by the transposed weight matrix, and then multiply the result elementwise by the rate of change of the activation function at $z^{l}$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP3)\n",
"$$ \\nabla_{\\beta^l} J(\\Theta) \\; = \\; \\delta^l $$\n",
"The gradient of the cost with respect to the biases $\\beta^l$ is simply the error $\\delta^l$ in layer $l$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### (BP4)\n",
"$$ \\nabla_{\\Theta^l} J(\\Theta) \\; = \\; \\left( a^{l-1} \\right) \\! ^\\top \\delta^l $$\n",
"The gradient of the cost with respect to the weights of layer $l$ is the outer product of $a^{l-1}$ and $\\delta^l$, i.e. $\\partial J / \\partial \\Theta^l_{ij} = a^{l-1}_i \\delta^l_j$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### The backpropagation algorithm"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"For a single example $(x,y)$:\n",
"1. **Input**: set the activations in the feature layer: $a^{(0)}=x$ \n",
"2. **Feedforward:** for $l=1,\\dots,L$ compute \n",
"$z^{(l)} = a^{(l-1)} \\Theta^{(l)} + \\beta^{(l)}$ and $a^{(l)}=g^{(l)} \\!\\! \\left( z^{(l)} \\right)$\n",
"3. **Output error $\\delta^{(L)}$:** compute the vector $$\\delta^{(L)}= \\nabla_{a^{(L)}}J(\\Theta) \\odot {g^{\\prime}}^{(L)} \\!\\! \\left( z^{(L)} \\right) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"4. **Backpropagate the error:** for $l = L-1,L-2,\\dots,1$ compute $$\\delta^{(l)} = \\delta^{(l+1)}(\\Theta^{(l+1)})^T \\odot {g^{\\prime}}^{(l)} \\!\\! \\left( z^{(l)} \\right) $$\n",
"5. **Gradients:** \n",
"  * $\\dfrac{\\partial}{\\partial \\Theta_{ij}^{(l)}} J(\\Theta) = a_i^{(l-1)}\\delta_j^{(l)} \\textrm{ and } \\dfrac{\\partial}{\\partial \\beta_{j}^{(l)}} J(\\Theta) = \\delta_j^{(l)}$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"In our example:\n",
"\n",
"$$\\small J(\\Theta) = \\frac{1}{2} \\left( a^{(L)} - y \\right) ^2 $$\n",
"$$\\small \\dfrac{\\partial}{\\partial a^{(L)}} J(\\Theta) = a^{(L)} - y$$\n",
"\n",
"$$\\small \\tanh^{\\prime}(x) = 1 - \\tanh^2(x)$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"nn3.png\" alt=\"Fig. 12.5. Backpropagation: schematic\" style=\"height:100%\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### SGD with backpropagation\n",
"\n",
"A single iteration:\n",
"1. For the parameters $\\Theta = (\\Theta^{(1)},\\ldots,\\Theta^{(L)})$, create auxiliary zero matrices $\\Delta = (\\Delta^{(1)},\\ldots,\\Delta^{(L)})$ of the same dimensions (for simplicity, the weights $\\beta$ are omitted)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"2. For the $m$ examples in the batch, $i = 1,\\ldots,m$:\n",
"  * Run the backpropagation algorithm on the example $(x^{(i)}, y^{(i)})$ and store the gradients $\\nabla_{\\Theta}J^{(i)}(\\Theta)$ for this example;\n",
"  * $\\Delta := \\Delta + \\dfrac{1}{m}\\nabla_{\\Theta}J^{(i)}(\\Theta)$\n",
"3. Update the weights: $\\Theta := \\Theta - \\alpha \\Delta$"
]
},
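{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The whole procedure can be put together in a short NumPy sketch (an illustrative one-hidden-layer tanh regression network; the data, layer sizes, learning rate, and number of epochs are made up for this example):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(42)\n",
"X = rng.uniform(-1, 1, size=(100, 2))  # batch of m = 100 examples\n",
"Y = X[:, :1] * X[:, 1:]  # toy regression target: x1 * x2\n",
"Theta1, beta1 = 0.5 * rng.normal(size=(2, 8)), np.zeros((1, 8))\n",
"Theta2, beta2 = 0.5 * rng.normal(size=(8, 1)), np.zeros((1, 1))\n",
"alpha, m = 0.5, len(X)\n",
"\n",
"for epoch in range(500):\n",
"    # feedforward\n",
"    z1 = X @ Theta1 + beta1\n",
"    a1 = np.tanh(z1)\n",
"    z2 = a1 @ Theta2 + beta2\n",
"    a2 = z2  # identity activation on the output\n",
"    # output error (BP1): dJ/da = a - y, and g' = 1 for the identity\n",
"    d2 = a2 - Y\n",
"    # backpropagate the error (BP2): tanh'(z) = 1 - tanh^2(z)\n",
"    d1 = (d2 @ Theta2.T) * (1 - a1 ** 2)\n",
"    # gradient step (BP3, BP4), averaged over the batch\n",
"    Theta2 -= alpha * (a1.T @ d2) / m\n",
"    beta2 -= alpha * d2.mean(axis=0)\n",
"    Theta1 -= alpha * (X.T @ d1) / m\n",
"    beta1 -= alpha * d1.mean(axis=0)\n",
"\n",
"print(((a2 - Y) ** 2).mean() / 2)  # the cost should decrease over training"
]
},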
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Backpropagation: a summary\n",
"\n",
"* The algorithm was first introduced in the 1970s.\n",
"* In 1986, David Rumelhart, Geoffrey Hinton, and Ronald Williams showed that it is much faster than earlier approaches.\n",
"* It is now the most popular algorithm for training neural networks."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.3. Example implementations of multi-layer neural networks"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"### Note\n",
"\n",
"The examples below use the [Keras](https://keras.io) interface, which is part of the [TensorFlow](https://www.tensorflow.org) library.\n",
"\n",
"To run TensorFlow inside Jupyter, do the following:\n",
"\n",
"#### Before the first run (needs to be done only once)\n",
"\n",
"Installing TensorFlow in the Anaconda environment:\n",
"\n",
"1. Start *Anaconda Navigator*\n",
"1. Select the *CMD.exe Prompt* tile\n",
"1. Click the *Launch* button\n",
"1. A console appears. Type the following commands, confirming each with Enter:\n",
"```\n",
"conda create -n tf tensorflow\n",
"conda activate tf\n",
"conda install pandas matplotlib\n",
"jupyter notebook\n",
"```\n",
"\n",
"#### Before every run\n",
"\n",
"If you want to use TensorFlow, start Jupyter Notebook as follows:\n",
"\n",
"1. Start *Anaconda Navigator*\n",
"1. Select the *CMD.exe Prompt* tile\n",
"1. Click the *Launch* button\n",
"1. A console appears. Type the following commands, confirming each with Enter:\n",
"```\n",
"conda activate tf\n",
"jupyter notebook\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Przykład: MNIST\n",
"\n",
"_Modified National Institute of Standards and Technology database_"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A dataset of handwritten digits\n",
"* 60,000 training examples and 10,000 test examples\n",
"* Resolution of each example: 28 × 28 = 784 pixels"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-26 10:52:17.922141: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
"2023-01-26 10:52:18.163925: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:18.163996: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.\n",
"2023-01-26 10:52:19.577890: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:19.578662: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:19.578677: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n",
"11490434/11490434 [==============================] - 1s 0us/step\n"
]
}
],
"source": [
"from tensorflow import keras\n",
"from tensorflow.keras.datasets import mnist\n",
"from tensorflow.keras.layers import Dense, Dropout\n",
"\n",
"# load the data and split it into training and test sets\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"from matplotlib import pyplot as plt\n",
"\n",
"\n",
"def draw_examples(examples, captions=None):\n",
" plt.figure(figsize=(16, 4))\n",
" m = len(examples)\n",
" for i, example in enumerate(examples):\n",
" plt.subplot(100 + m * 10 + i + 1)\n",
" plt.imshow(example, cmap=plt.get_cmap(\"gray\"))\n",
" plt.show()\n",
" if captions is not None:\n",
" print(6 * \" \" + (10 * \" \").join(str(captions[i]) for i in range(m)))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1600x400 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 5 0 4 1 9 2 1\n"
]
}
],
"source": [
"draw_examples(x_train[:7], captions=y_train)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"60000 training examples\n",
"10000 test examples\n"
]
}
],
"source": [
"num_classes = 10\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype(\"float32\")\n",
"x_test = x_test.astype(\"float32\")\n",
"x_train /= 255\n",
"x_test /= 255\n",
"print(\"{} training examples\".format(x_train.shape[0]))\n",
"print(\"{} test examples\".format(x_test.shape[0]))\n",
"\n",
"# convert the class vectors to binary (one-hot) class matrices\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-26 10:52:27.077963: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory\n",
"2023-01-26 10:52:27.078089: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303)\n",
"2023-01-26 10:52:27.078807: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ELLIOT): /proc/driver/nvidia/version does not exist\n",
"2023-01-26 10:52:27.095828: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense (Dense) (None, 512) 401920 \n",
" \n",
" dense_1 (Dense) (None, 512) 262656 \n",
" \n",
" dense_2 (Dense) (None, 10) 5130 \n",
" \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model = keras.Sequential()\n",
"model.add(Dense(512, activation=\"relu\", input_shape=(784,)))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(512, activation=\"relu\"))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(num_classes, activation=\"softmax\"))\n",
"model.summary()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(60000, 784) (60000, 10)\n"
]
}
],
"source": [
"print(x_train.shape, y_train.shape)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-26 10:52:27.713204: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 188160000 exceeds 10% of free system memory.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/5\n",
"469/469 [==============================] - 13s 25ms/step - loss: 0.2303 - accuracy: 0.9290 - val_loss: 0.1023 - val_accuracy: 0.9684\n",
"Epoch 2/5\n",
"469/469 [==============================] - 9s 20ms/step - loss: 0.0840 - accuracy: 0.9742 - val_loss: 0.0794 - val_accuracy: 0.9754\n",
"Epoch 3/5\n",
"469/469 [==============================] - 9s 20ms/step - loss: 0.0548 - accuracy: 0.9826 - val_loss: 0.0603 - val_accuracy: 0.9828\n",
"Epoch 4/5\n",
"469/469 [==============================] - 9s 20ms/step - loss: 0.0367 - accuracy: 0.9883 - val_loss: 0.0707 - val_accuracy: 0.9796\n",
"Epoch 5/5\n",
"469/469 [==============================] - 9s 19ms/step - loss: 0.0278 - accuracy: 0.9912 - val_loss: 0.0765 - val_accuracy: 0.9785\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f8642785120>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"model.fit(\n",
" x_train,\n",
" y_train,\n",
" batch_size=128,\n",
" epochs=5,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.07645954936742783\n",
"Test accuracy: 0.9785000085830688\n"
]
}
],
"source": [
"score = model.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Warstwa _dropout_ to metoda regularyzacji, która zapobiega nadmiernemu dopasowaniu (_overfitting_) sieci. Polega na tym, że podczas uczenia losowo wybrana część neuronów jest chwilowo wyłączana (ich aktywacje są zerowane); podczas predykcji używana jest cała sieć."
]
},
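{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Poniższy szkic (ilustracja, a nie implementacja z biblioteki Keras) pokazuje ideę warstwy _dropout_ w wariancie _inverted dropout_: podczas uczenia aktywacje są losowo zerowane z prawdopodobieństwem `rate`, a pozostałe skalowane przez `1/(1-rate)`, dzięki czemu wartość oczekiwana aktywacji nie zmienia się między uczeniem a predykcją."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def dropout_sketch(a, rate, training=True):\n",
"    # Ilustracyjny inverted dropout; w praktyce używamy warstwy keras.layers.Dropout\n",
"    if not training or rate == 0.0:\n",
"        return a\n",
"    mask = (np.random.rand(*a.shape) >= rate).astype(a.dtype)\n",
"    return a * mask / (1.0 - rate)"
]
},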
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_1\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_3 (Dense) (None, 512) 401920 \n",
" \n",
" dense_4 (Dense) (None, 512) 262656 \n",
" \n",
" dense_5 (Dense) (None, 10) 5130 \n",
" \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"Epoch 1/5\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-26 10:53:20.710986: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 188160000 exceeds 10% of free system memory.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"469/469 [==============================] - 10s 19ms/step - loss: 0.2283 - accuracy: 0.9302 - val_loss: 0.0983 - val_accuracy: 0.9685\n",
"Epoch 2/5\n",
"469/469 [==============================] - 10s 22ms/step - loss: 0.0849 - accuracy: 0.9736 - val_loss: 0.0996 - val_accuracy: 0.9673\n",
"Epoch 3/5\n",
"469/469 [==============================] - 10s 22ms/step - loss: 0.0549 - accuracy: 0.9829 - val_loss: 0.0704 - val_accuracy: 0.9777\n",
"Epoch 4/5\n",
"469/469 [==============================] - 10s 21ms/step - loss: 0.0380 - accuracy: 0.9877 - val_loss: 0.0645 - val_accuracy: 0.9797\n",
"Epoch 5/5\n",
"469/469 [==============================] - 20s 43ms/step - loss: 0.0276 - accuracy: 0.9910 - val_loss: 0.0637 - val_accuracy: 0.9825\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f86301a3f40>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Bez warstw Dropout\n",
"\n",
"num_classes = 10\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype(\"float32\")\n",
"x_test = x_test.astype(\"float32\")\n",
"x_train /= 255\n",
"x_test /= 255\n",
"\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
"\n",
"model_no_dropout = keras.Sequential()\n",
"model_no_dropout.add(Dense(512, activation=\"relu\", input_shape=(784,)))\n",
"model_no_dropout.add(Dense(512, activation=\"relu\"))\n",
"model_no_dropout.add(Dense(num_classes, activation=\"softmax\"))\n",
"model_no_dropout.summary()\n",
"\n",
"model_no_dropout.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"model_no_dropout.fit(\n",
" x_train,\n",
" y_train,\n",
" batch_size=128,\n",
" epochs=5,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss (no dropout): 0.06374581903219223\n",
"Test accuracy (no dropout): 0.9825000166893005\n"
]
}
],
"source": [
"# Bez warstw Dropout\n",
"\n",
"score = model_no_dropout.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print(\"Test loss (no dropout): {}\".format(score[0]))\n",
"print(\"Test accuracy (no dropout): {}\".format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_3\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_6 (Dense) (None, 2500) 1962500 \n",
" \n",
" dense_7 (Dense) (None, 2000) 5002000 \n",
" \n",
" dense_8 (Dense) (None, 1500) 3001500 \n",
" \n",
" dense_9 (Dense) (None, 1000) 1501000 \n",
" \n",
" dense_10 (Dense) (None, 500) 500500 \n",
" \n",
" dense_11 (Dense) (None, 10) 5010 \n",
" \n",
"=================================================================\n",
"Total params: 11,972,510\n",
"Trainable params: 11,972,510\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"Epoch 1/10\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-26 11:06:02.193383: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 188160000 exceeds 10% of free system memory.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"469/469 [==============================] - 140s 294ms/step - loss: 0.6488 - accuracy: 0.8175 - val_loss: 0.2686 - val_accuracy: 0.9211\n",
"Epoch 2/10\n",
"469/469 [==============================] - 147s 313ms/step - loss: 0.2135 - accuracy: 0.9367 - val_loss: 0.2251 - val_accuracy: 0.9363\n",
"Epoch 3/10\n",
"469/469 [==============================] - 105s 224ms/step - loss: 0.1549 - accuracy: 0.9535 - val_loss: 0.1535 - val_accuracy: 0.9533\n",
"Epoch 4/10\n",
"469/469 [==============================] - 94s 200ms/step - loss: 0.1210 - accuracy: 0.9635 - val_loss: 0.1412 - val_accuracy: 0.9599\n",
"Epoch 5/10\n",
"469/469 [==============================] - 93s 199ms/step - loss: 0.0985 - accuracy: 0.9704 - val_loss: 0.1191 - val_accuracy: 0.9650\n",
"Epoch 6/10\n",
"469/469 [==============================] - 105s 224ms/step - loss: 0.0834 - accuracy: 0.9746 - val_loss: 0.0959 - val_accuracy: 0.9732\n",
"Epoch 7/10\n",
"469/469 [==============================] - 111s 236ms/step - loss: 0.0664 - accuracy: 0.9797 - val_loss: 0.1071 - val_accuracy: 0.9685\n",
"Epoch 8/10\n",
"469/469 [==============================] - 184s 392ms/step - loss: 0.0562 - accuracy: 0.9824 - val_loss: 0.0951 - val_accuracy: 0.9737\n",
"Epoch 9/10\n",
"469/469 [==============================] - 161s 344ms/step - loss: 0.0475 - accuracy: 0.9852 - val_loss: 0.1377 - val_accuracy: 0.9631\n",
"Epoch 10/10\n",
"469/469 [==============================] - 146s 311ms/step - loss: 0.0399 - accuracy: 0.9873 - val_loss: 0.1093 - val_accuracy: 0.9736\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f8640136f50>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Więcej warstw, inna funkcja aktywacji\n",
"\n",
"num_classes = 10\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype(\"float32\")\n",
"x_test = x_test.astype(\"float32\")\n",
"x_train /= 255\n",
"x_test /= 255\n",
"\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
"\n",
"model3 = keras.Sequential()\n",
"model3.add(Dense(2500, activation=\"tanh\", input_shape=(784,)))\n",
"model3.add(Dense(2000, activation=\"tanh\"))\n",
"model3.add(Dense(1500, activation=\"tanh\"))\n",
"model3.add(Dense(1000, activation=\"tanh\"))\n",
"model3.add(Dense(500, activation=\"tanh\"))\n",
"model3.add(Dense(num_classes, activation=\"softmax\"))\n",
"model3.summary()\n",
"\n",
"model3.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"\n",
"model3.fit(\n",
" x_train,\n",
" y_train,\n",
" batch_size=128,\n",
" epochs=10,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.10930903255939484\n",
"Test accuracy: 0.9735999703407288\n"
]
}
],
"source": [
"# Więcej warstw, inna funkcja aktywacji\n",
"\n",
"score = model3.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Przykład: 4-pikselowy aparat fotograficzny\n",
"\n",
"https://www.youtube.com/watch?v=ILsA4nyG7I0"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"def generate_example(description):\n",
" variant = random.choice([1, -1])\n",
" if description == \"s\": # solid\n",
" return (\n",
" np.array([[1.0, 1.0], [1.0, 1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, -1.0], [-1.0, -1.0]])\n",
" )\n",
" elif description == \"v\": # vertical\n",
" return (\n",
" np.array([[1.0, -1.0], [1.0, -1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, 1.0], [-1.0, 1.0]])\n",
" )\n",
" elif description == \"d\": # diagonal\n",
" return (\n",
" np.array([[1.0, -1.0], [-1.0, 1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, 1.0], [1.0, -1.0]])\n",
" )\n",
" elif description == \"h\": # horizontal\n",
" return (\n",
" np.array([[1.0, 1.0], [-1.0, -1.0]])\n",
" if variant == 1\n",
" else np.array([[-1.0, -1.0], [1.0, 1.0]])\n",
" )\n",
" else:\n",
" return np.array(\n",
" [\n",
" [random.uniform(-1, 1), random.uniform(-1, 1)],\n",
" [random.uniform(-1, 1), random.uniform(-1, 1)],\n",
" ]\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"import random\n",
"\n",
"num_classes = 4\n",
"\n",
"trainset_size = 4000\n",
"testset_size = 1000\n",
"\n",
"y4_train = np.array([random.choice([\"s\", \"v\", \"d\", \"h\"]) for i in range(trainset_size)])\n",
"x4_train = np.array([generate_example(desc) for desc in y4_train])\n",
"\n",
"y4_test = np.array([random.choice([\"s\", \"v\", \"d\", \"h\"]) for i in range(testset_size)])\n",
"x4_test = np.array([generate_example(desc) for desc in y4_test])"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1600x400 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" d h h d h d h\n"
]
}
],
"source": [
"draw_examples(x4_train[:7], captions=y4_train)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"x4_train = x4_train.reshape(trainset_size, 4)\n",
"x4_test = x4_test.reshape(testset_size, 4)\n",
"x4_train = x4_train.astype(\"float32\")\n",
"x4_test = x4_test.astype(\"float32\")\n",
"\n",
"y4_train = np.array([{\"s\": 0, \"v\": 1, \"d\": 2, \"h\": 3}[desc] for desc in y4_train])\n",
"y4_test = np.array([{\"s\": 0, \"v\": 1, \"d\": 2, \"h\": 3}[desc] for desc in y4_test])\n",
"\n",
"y4_train = keras.utils.to_categorical(y4_train, num_classes)\n",
"y4_test = keras.utils.to_categorical(y4_test, num_classes)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_4\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_12 (Dense) (None, 4) 20 \n",
" \n",
" dense_13 (Dense) (None, 4) 20 \n",
" \n",
" dense_14 (Dense) (None, 8) 40 \n",
" \n",
" dense_15 (Dense) (None, 4) 36 \n",
" \n",
"=================================================================\n",
"Total params: 116\n",
"Trainable params: 116\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model4 = keras.Sequential()\n",
"model4.add(Dense(4, activation=\"tanh\", input_shape=(4,)))\n",
"model4.add(Dense(4, activation=\"tanh\"))\n",
"model4.add(Dense(8, activation=\"relu\"))\n",
"model4.add(Dense(num_classes, activation=\"softmax\"))\n",
"model4.summary()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"model4.layers[0].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, 0.0, 1.0, 0.0],\n",
" [0.0, 1.0, 0.0, 1.0],\n",
" [1.0, 0.0, -1.0, 0.0],\n",
" [0.0, 1.0, 0.0, -1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")\n",
"model4.layers[1].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, -1.0, 0.0, 0.0],\n",
" [1.0, 1.0, 0.0, 0.0],\n",
" [0.0, 0.0, 1.0, -1.0],\n",
" [0.0, 0.0, -1.0, -1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")\n",
"model4.layers[2].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],\n",
" [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0],\n",
" [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0],\n",
" [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, -1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"model4.layers[3].set_weights(\n",
" [\n",
" np.array(\n",
" [\n",
" [1.0, 0.0, 0.0, 0.0],\n",
" [1.0, 0.0, 0.0, 0.0],\n",
" [0.0, 1.0, 0.0, 0.0],\n",
" [0.0, 1.0, 0.0, 0.0],\n",
" [0.0, 0.0, 1.0, 0.0],\n",
" [0.0, 0.0, 1.0, 0.0],\n",
" [0.0, 0.0, 0.0, 1.0],\n",
" [0.0, 0.0, 0.0, 1.0],\n",
" ],\n",
" dtype=np.float32,\n",
" ),\n",
" np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),\n",
" ]\n",
")\n",
"\n",
"model4.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.Adagrad(),\n",
" metrics=[\"accuracy\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array([[ 1., 0., 1., 0.],\n",
" [ 0., 1., 0., 1.],\n",
" [ 1., 0., -1., 0.],\n",
" [ 0., 1., 0., -1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., -1., 0., 0.],\n",
" [ 1., 1., 0., 0.],\n",
" [ 0., 0., 1., -1.],\n",
" [ 0., 0., -1., -1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., -1., 0., 0., 0., 0., 0., 0.],\n",
" [ 0., 0., 1., -1., 0., 0., 0., 0.],\n",
" [ 0., 0., 0., 0., 1., -1., 0., 0.],\n",
" [ 0., 0., 0., 0., 0., 0., 1., -1.]], dtype=float32), array([0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]\n",
"[array([[1., 0., 0., 0.],\n",
" [1., 0., 0., 0.],\n",
" [0., 1., 0., 0.],\n",
" [0., 1., 0., 0.],\n",
" [0., 0., 1., 0.],\n",
" [0., 0., 1., 0.],\n",
" [0., 0., 0., 1.],\n",
" [0., 0., 0., 1.]], dtype=float32), array([0., 0., 0., 0.], dtype=float32)]\n"
]
}
],
"source": [
"for layer in model4.layers:\n",
" print(layer.get_weights())"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1/1 [==============================] - 1s 872ms/step\n"
]
},
{
"data": {
"text/plain": [
"array([[0.17831734, 0.17831734, 0.17831734, 0.465048 ]], dtype=float32)"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model4.predict([np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4)])"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.7656148672103882\n",
"Test accuracy: 1.0\n"
]
}
],
"source": [
"score = model4.evaluate(x4_test, y4_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_5\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense_16 (Dense) (None, 4) 20 \n",
" \n",
" dense_17 (Dense) (None, 4) 20 \n",
" \n",
" dense_18 (Dense) (None, 8) 40 \n",
" \n",
" dense_19 (Dense) (None, 4) 36 \n",
" \n",
"=================================================================\n",
"Total params: 116\n",
"Trainable params: 116\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model5 = keras.Sequential()\n",
"model5.add(Dense(4, activation=\"tanh\", input_shape=(4,)))\n",
"model5.add(Dense(4, activation=\"tanh\"))\n",
"model5.add(Dense(8, activation=\"relu\"))\n",
"model5.add(Dense(num_classes, activation=\"softmax\"))\n",
"model5.compile(\n",
" loss=\"categorical_crossentropy\",\n",
" optimizer=keras.optimizers.RMSprop(),\n",
" metrics=[\"accuracy\"],\n",
")\n",
"model5.summary()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/8\n",
"125/125 [==============================] - 3s 8ms/step - loss: 1.3014 - accuracy: 0.4947 - val_loss: 1.1876 - val_accuracy: 0.6040\n",
"Epoch 2/8\n",
"125/125 [==============================] - 1s 6ms/step - loss: 1.0779 - accuracy: 0.7395 - val_loss: 0.9865 - val_accuracy: 0.8730\n",
"Epoch 3/8\n",
"125/125 [==============================] - 1s 4ms/step - loss: 0.8925 - accuracy: 0.8382 - val_loss: 0.8114 - val_accuracy: 0.7460\n",
"Epoch 4/8\n",
"125/125 [==============================] - 0s 4ms/step - loss: 0.7266 - accuracy: 0.8060 - val_loss: 0.6622 - val_accuracy: 0.8730\n",
"Epoch 5/8\n",
"125/125 [==============================] - 0s 4ms/step - loss: 0.5890 - accuracy: 0.8765 - val_loss: 0.5392 - val_accuracy: 0.8730\n",
"Epoch 6/8\n",
"125/125 [==============================] - 1s 4ms/step - loss: 0.4738 - accuracy: 0.8838 - val_loss: 0.4293 - val_accuracy: 0.8730\n",
"Epoch 7/8\n",
"125/125 [==============================] - 1s 5ms/step - loss: 0.3636 - accuracy: 0.9337 - val_loss: 0.3191 - val_accuracy: 1.0000\n",
"Epoch 8/8\n",
"125/125 [==============================] - 1s 5ms/step - loss: 0.2606 - accuracy: 1.0000 - val_loss: 0.2202 - val_accuracy: 1.0000\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f860a6a9870>"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model5.fit(x4_train, y4_train, epochs=8, validation_data=(x4_test, y4_test))"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1/1 [==============================] - 0s 106ms/step\n"
]
},
{
"data": {
"text/plain": [
"array([[1.5366691e-01, 4.4674356e-04, 4.7448810e-02, 7.9843748e-01]],\n",
" dtype=float32)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model5.predict([np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4)])"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.22015966475009918\n",
"Test accuracy: 1.0\n"
]
}
],
"source": [
"score = model5.evaluate(x4_test, y4_test, verbose=0)\n",
"\n",
"print(\"Test loss: {}\".format(score[0]))\n",
"print(\"Test accuracy: {}\".format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import contextlib\n",
"\n",
"\n",
"@contextlib.contextmanager\n",
"def printoptions(*args, **kwargs):\n",
" original = np.get_printoptions()\n",
" np.set_printoptions(*args, **kwargs)\n",
" try:\n",
" yield\n",
" finally:\n",
" np.set_printoptions(**original)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array([[-0.8, 0.1, -0.6, 0.1],\n",
" [-0.9, -0.7, -1. , 0.6],\n",
" [-0.3, 0.5, 0.5, 0.3],\n",
" [ 0.4, 0.3, -0.9, -0.8]], dtype=float32), array([ 0., -0., 0., 0.], dtype=float32)]\n",
"[array([[-1.1, 1.2, -0.6, -0.6],\n",
" [-1.1, -0.2, -0.7, -1.3],\n",
" [ 0.6, 0.9, 0.3, -1.3],\n",
" [ 0.8, 0.3, 0.7, 0.4]], dtype=float32), array([ 0.3, 0.5, -0.4, 0.5], dtype=float32)]\n",
"[array([[ 0.5, 0.4, -0.4, 0.3, 0.8, -1.4, -1.1, 0.8],\n",
" [ 0.5, -1.3, 0.3, 0.4, -1.3, 0.2, 0.9, 0.7],\n",
" [-0.2, -0.1, -0.5, -0.2, 1.2, -0.4, -0.4, 1.1],\n",
" [-1.1, 0.4, 1.3, -1.1, 1. , -1.1, -0.8, 0.3]], dtype=float32), array([ 0.2, 0.2, 0.1, 0.1, 0.2, 0.1, -0.2, 0. ], dtype=float32)]\n",
"[array([[ 0.7, 0.8, -1.5, -0.2],\n",
" [ 0.7, -0.9, -1.2, 0.2],\n",
" [-0.4, 1.1, -0.1, -1.6],\n",
" [ 0.3, 0.8, -1.4, 0.4],\n",
" [ 0.2, -1.4, -0.3, 0.5],\n",
" [-0.2, -1.2, 0.6, 0.7],\n",
" [-0.1, -1.5, 0.3, -0.1],\n",
" [-1.4, 0.1, 1.2, -0. ]], dtype=float32), array([-0.2, 0.5, 0.5, -0.5], dtype=float32)]\n"
]
}
],
"source": [
"with printoptions(precision=1, suppress=True):\n",
" for layer in model5.layers:\n",
" print(layer.get_weights())"
]
}
],
"metadata": {
"author": "Paweł Skórzewski",
"celltoolbar": "Slideshow",
"email": "pawel.skorzewski@amu.edu.pl",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"lang": "pl",
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"livereveal": {
"start_slideshow_at": "selected",
"theme": "white"
},
  "subtitle": "12. Sieci neuronowe propagacja wsteczna [wykład]",
"title": "Uczenie maszynowe",
"vscode": {
"interpreter": {
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
}
},
"year": "2021"
},
"nbformat": 4,
"nbformat_minor": 4
}