umz21/wyk/2011_Wielowarstwowe_sieci_neuronowe.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Machine Learning UMZ 2019/2020\n",
"### 26 May 2020\n",
"# 11. Multilayer neural networks and optimization algorithms"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 11.1. Activation functions"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* Every activation function has its own advantages and disadvantages.\n",
"* Different kinds of activation functions suit different applications."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/pawel/.local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
" from ._conv import register_converters as _register_converters\n",
"Using TensorFlow backend.\n"
]
}
],
"source": [
"%matplotlib inline\n",
"\n",
"import math\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import random\n",
"\n",
"import keras\n",
"from keras.datasets import mnist\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense, Dropout, SimpleRNN, LSTM\n",
"from keras.optimizers import Adagrad, Adam, RMSprop, SGD\n",
"\n",
"from IPython.display import YouTubeVideo"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def plot(fun):\n",
"    x = np.arange(-3.0, 3.0, 0.01)\n",
"    y = [fun(x_i) for x_i in x]\n",
"    fig = plt.figure(figsize=(14, 7))\n",
"    ax = fig.add_subplot(111)\n",
"    fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
"    ax.set_xlim(-3.0, 3.0)\n",
"    ax.set_ylim(-1.5, 1.5)\n",
"    ax.grid()\n",
"    ax.plot(x, y)\n",
"    plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Logistic function"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ g(x) = \\frac{1}{1 + e^{-x}} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Takes values in the interval $(0, 1)$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Logistic function: plot"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7fdda9490fd0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot(lambda x: 1 / (1 + math.exp(-x)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Hyperbolic tangent"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ g(x) = \\tanh x = \\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Takes values in the interval $(-1, 1)$.\n",
"* It is obtained from the logistic function by scaling and shifting."
]
},
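{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The scaling and shifting can be written out explicitly. A short derivation, writing $\\sigma(x) = \\frac{1}{1 + e^{-x}}$ for the logistic function (the symbol $\\sigma$ is introduced here only for illustration):\n",
"\n",
"$$ 2\\sigma(2x) - 1 = \\frac{2}{1 + e^{-2x}} - 1 = \\frac{1 - e^{-2x}}{1 + e^{-2x}} = \\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \\tanh x $$\n",
"\n",
"So the input is scaled by 2, the output is scaled by 2 and then shifted by $-1$."
]
},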
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Hyperbolic tangent: plot"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7fdda93e9590>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot(lambda x: math.tanh(x))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### ReLU (_Rectified Linear Unit_)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ g(x) = \\max(0, x) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### ReLU: advantages\n",
"* Less susceptible to the vanishing gradient problem than sigmoidal functions, so SGD converges faster.\n",
"* Simpler gradient computation.\n",
"* By zeroing negative values it switches neurons off, making the network sparse (_sparsity_), which speeds up computation."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
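},
"source": [
"#### ReLU: disadvantages\n",
"* Because the output is unbounded, the gradient can explode for large values.\n",
"* Switched-off neurons: a neuron whose input stays negative outputs 0 and receives zero gradient (the _dying ReLU_ problem)."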
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### ReLU: plot"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7fdda936c6d0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot(lambda x: max(0, x))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Softplus"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ g(x) = \\log(1 + e^{x}) $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A smoothed version of ReLU."
]
},
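{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"One way to see why softplus can be regarded as a smoothed ReLU is to look at its derivative and at its behaviour for large $|x|$ (a short sketch):\n",
"\n",
"$$ \\frac{d}{dx} \\log(1 + e^{x}) = \\frac{e^{x}}{1 + e^{x}} = \\frac{1}{1 + e^{-x}} $$\n",
"\n",
"The derivative is the logistic function, a smooth step from 0 to 1, whereas the derivative of ReLU jumps from 0 to 1 at $x = 0$. For $x \\to -\\infty$ we have $\\log(1 + e^{x}) \\to 0$, and for $x \\to +\\infty$ we have $\\log(1 + e^{x}) - x \\to 0$, so softplus approaches $\\max(0, x)$ on both sides."
]
},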
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Softplus: plot"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7fdde8452e10>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot(lambda x: math.log(1 + math.exp(x)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### The vanishing gradient problem"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Sigmoidal activation functions restrict neuron outputs to small intervals ($(-1, 1)$, $(0, 1)$, etc.).\n",
"* If the network has many layers, backpropagation multiplies many small values together → the computed gradient is small.\n",
"* The more layers, the stronger the vanishing effect."
]
},
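{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The multiplication of small values can be made concrete for the logistic function. A rough sketch, writing $\\sigma$ for the logistic function and ignoring the weights (a simplifying assumption made only for this illustration):\n",
"\n",
"$$ \\sigma'(x) = \\sigma(x)\\,(1 - \\sigma(x)) \\le \\frac{1}{4} $$\n",
"\n",
"Backpropagation through $n$ such layers multiplies $n$ factors of this kind, so the part of the gradient that comes from the activation derivatives is at most $(1/4)^{n}$. For $n = 10$ this is already about $10^{-6}$."
]
},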
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Remedies for the vanishing gradient\n",
"\n",
"* Modifying the optimization algorithm (_RProp_, _RMSProp_)\n",
"* Using a different activation function (ReLU, softplus)\n",
"* Adding _dropout_ layers\n",
"* New architectures (LSTM etc.)\n",
"* More data, more computing power"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 11.2. Multilayer neural networks in practice"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Example: MNIST\n",
"\n",
"_Modified National Institute of Standards and Technology database_"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A dataset of handwritten digits\n",
"* 60,000 training examples, 10,000 test examples\n",
"* Resolution of each example: 28 × 28 = 784 pixels"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# source: https://github.com/keras-team/keras/examples/minst_mlp.py\n",
"\n",
"import keras\n",
"from keras.datasets import mnist\n",
"\n",
"# load the data, already split into training and test sets\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def draw_examples(examples, captions=None):\n",
"    plt.figure(figsize=(16, 4))\n",
"    m = len(examples)\n",
"    for i, example in enumerate(examples):\n",
"        plt.subplot(100 + m * 10 + i + 1)\n",
"        plt.imshow(example, cmap=plt.get_cmap('gray'))\n",
"    plt.show()\n",
"    if captions is not None:\n",
"        print(6 * ' ' + (10 * ' ').join(str(captions[i]) for i in range(m)))"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7fdda922aad0>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 5 0 4 1 9 2 1\n"
]
}
],
"source": [
"draw_examples(x_train[:7], captions=y_train)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"60000 training examples\n",
"10000 test examples\n"
]
}
],
"source": [
"num_classes = 10\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype('float32')\n",
"x_test = x_test.astype('float32')\n",
"x_train /= 255\n",
"x_test /= 255\n",
"print('{} training examples'.format(x_train.shape[0]))\n",
"print('{} test examples'.format(x_test.shape[0]))\n",
"\n",
"# convert class vectors to binary class matrices (one-hot encoding)\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense_1 (Dense) (None, 512) 401920 \n",
"_________________________________________________________________\n",
"dropout_1 (Dropout) (None, 512) 0 \n",
"_________________________________________________________________\n",
"dense_2 (Dense) (None, 512) 262656 \n",
"_________________________________________________________________\n",
"dropout_2 (Dropout) (None, 512) 0 \n",
"_________________________________________________________________\n",
"dense_3 (Dense) (None, 10) 5130 \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model = Sequential()\n",
"model.add(Dense(512, activation='relu', input_shape=(784,)))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(512, activation='relu'))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(num_classes, activation='softmax'))\n",
"model.summary()"
]
},
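{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The parameter counts reported by `model.summary()` can be checked by hand. A dense layer with $n_{in}$ inputs and $n_{out}$ outputs has $n_{in} \\cdot n_{out}$ weights plus $n_{out}$ biases (the names $n_{in}$ and $n_{out}$ are used here only for illustration), and dropout layers have no parameters:\n",
"\n",
"$$ \\begin{aligned} 784 \\cdot 512 + 512 &= 401920 \\\\ 512 \\cdot 512 + 512 &= 262656 \\\\ 512 \\cdot 10 + 10 &= 5130 \\\\ 401920 + 262656 + 5130 &= 669706 \\end{aligned} $$\n",
"\n",
"Almost all parameters sit in the first two dense layers; the output layer contributes less than 1% of the total."
]
},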
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"((60000, 784), (60000, 10))\n"
]
}
],
"source": [
"print(x_train.shape, y_train.shape)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Train on 60000 samples, validate on 10000 samples\n",
"Epoch 1/5\n",
"60000/60000 [==============================] - 9s 153us/step - loss: 0.2489 - acc: 0.9224 - val_loss: 0.1005 - val_acc: 0.9706\n",
"Epoch 2/5\n",
"60000/60000 [==============================] - 9s 151us/step - loss: 0.1042 - acc: 0.9683 - val_loss: 0.0861 - val_acc: 0.9740\n",
"Epoch 3/5\n",
"60000/60000 [==============================] - 9s 153us/step - loss: 0.0742 - acc: 0.9782 - val_loss: 0.0733 - val_acc: 0.9796\n",
"Epoch 4/5\n",
"60000/60000 [==============================] - 9s 154us/step - loss: 0.0603 - acc: 0.9824 - val_loss: 0.0713 - val_acc: 0.9800\n",
"Epoch 5/5\n",
"60000/60000 [==============================] - 9s 157us/step - loss: 0.0512 - acc: 0.9848 - val_loss: 0.0749 - val_acc: 0.9795\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7fdda4f97110>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.compile(loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])\n",
"\n",
"model.fit(x_train, y_train, batch_size=128, epochs=5, verbose=1,\n",
" validation_data=(x_test, y_test))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.074858742202\n",
"Test accuracy: 0.9795\n"
]
}
],
"source": [
"score = model.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print('Test loss: {}'.format(score[0]))\n",
"print('Test accuracy: {}'.format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A _dropout_ layer is a regularization method that helps prevent the network from overfitting. During training, it randomly drops (zeroes out) a fraction of the nodes at every update; `Dropout(0.2)` above drops 20% of them."
]
},
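{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"As a rough sketch of what `Dropout(0.2)` does to a single activation during training, assuming the usual _inverted dropout_ formulation (the symbols $h_j$, $m_j$ and $p$ are introduced here only for illustration):\n",
"\n",
"$$ \\tilde{h}_j = \\frac{m_j}{1 - p} \\, h_j, \\qquad m_j \\sim \\mathrm{Bernoulli}(1 - p), \\qquad p = 0.2 $$\n",
"\n",
"Each activation is kept with probability $0.8$ and scaled by $1 / 0.8 = 1.25$, so that $\\mathbb{E}[\\tilde{h}_j] = h_j$. At test time the layer passes its input through unchanged."
]
},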
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense_4 (Dense) (None, 512) 401920 \n",
"_________________________________________________________________\n",
"dense_5 (Dense) (None, 512) 262656 \n",
"_________________________________________________________________\n",
"dense_6 (Dense) (None, 10) 5130 \n",
"=================================================================\n",
"Total params: 669,706\n",
"Trainable params: 669,706\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"Train on 60000 samples, validate on 10000 samples\n",
"Epoch 1/5\n",
"60000/60000 [==============================] - 8s 139us/step - loss: 0.2237 - acc: 0.9303 - val_loss: 0.0998 - val_acc: 0.9676\n",
"Epoch 2/5\n",
"60000/60000 [==============================] - 8s 136us/step - loss: 0.0818 - acc: 0.9748 - val_loss: 0.0788 - val_acc: 0.9770\n",
"Epoch 3/5\n",
"60000/60000 [==============================] - 8s 136us/step - loss: 0.0538 - acc: 0.9831 - val_loss: 0.1074 - val_acc: 0.9695\n",
"Epoch 4/5\n",
"60000/60000 [==============================] - 10s 161us/step - loss: 0.0397 - acc: 0.9879 - val_loss: 0.0871 - val_acc: 0.9763\n",
"Epoch 5/5\n",
"60000/60000 [==============================] - 12s 195us/step - loss: 0.0299 - acc: 0.9910 - val_loss: 0.0753 - val_acc: 0.9812\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7fdda3dcad50>"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Without Dropout layers\n",
"\n",
"num_classes = 10\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype('float32')\n",
"x_test = x_test.astype('float32')\n",
"x_train /= 255\n",
"x_test /= 255\n",
"\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
"\n",
"model_no_dropout = Sequential()\n",
"model_no_dropout.add(Dense(512, activation='relu', input_shape=(784,)))\n",
"model_no_dropout.add(Dense(512, activation='relu'))\n",
"model_no_dropout.add(Dense(num_classes, activation='softmax'))\n",
"model_no_dropout.summary()\n",
"\n",
"model_no_dropout.compile(loss='categorical_crossentropy',\n",
" optimizer=RMSprop(),\n",
" metrics=['accuracy'])\n",
"\n",
"model_no_dropout.fit(x_train, y_train,\n",
" batch_size=128,\n",
" epochs=5,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test))"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss (no dropout): 0.0753162465898\n",
"Test accuracy (no dropout): 0.9812\n"
]
}
],
"source": [
"# Without Dropout layers\n",
"\n",
"score = model_no_dropout.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print('Test loss (no dropout): {}'.format(score[0]))\n",
"print('Test accuracy (no dropout): {}'.format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense_7 (Dense) (None, 2500) 1962500 \n",
"_________________________________________________________________\n",
"dense_8 (Dense) (None, 2000) 5002000 \n",
"_________________________________________________________________\n",
"dense_9 (Dense) (None, 1500) 3001500 \n",
"_________________________________________________________________\n",
"dense_10 (Dense) (None, 1000) 1501000 \n",
"_________________________________________________________________\n",
"dense_11 (Dense) (None, 500) 500500 \n",
"_________________________________________________________________\n",
"dense_12 (Dense) (None, 10) 5010 \n",
"=================================================================\n",
"Total params: 11,972,510\n",
"Trainable params: 11,972,510\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"Train on 60000 samples, validate on 10000 samples\n",
"Epoch 1/10\n",
"60000/60000 [==============================] - 145s 2ms/step - loss: 1.4242 - acc: 0.5348 - val_loss: 0.4426 - val_acc: 0.8638\n",
"Epoch 2/10\n",
"60000/60000 [==============================] - 140s 2ms/step - loss: 0.3245 - acc: 0.9074 - val_loss: 0.2231 - val_acc: 0.9360\n",
"Epoch 3/10\n",
"60000/60000 [==============================] - 137s 2ms/step - loss: 0.1993 - acc: 0.9420 - val_loss: 0.1694 - val_acc: 0.9485\n",
"Epoch 4/10\n",
"60000/60000 [==============================] - 136s 2ms/step - loss: 0.1471 - acc: 0.9571 - val_loss: 0.1986 - val_acc: 0.9381\n",
"Epoch 5/10\n",
"60000/60000 [==============================] - 132s 2ms/step - loss: 0.1189 - acc: 0.9650 - val_loss: 0.1208 - val_acc: 0.9658\n",
"Epoch 6/10\n",
"60000/60000 [==============================] - 131s 2ms/step - loss: 0.0983 - acc: 0.9711 - val_loss: 0.1260 - val_acc: 0.9637\n",
"Epoch 7/10\n",
"60000/60000 [==============================] - 129s 2ms/step - loss: 0.0818 - acc: 0.9753 - val_loss: 0.0984 - val_acc: 0.9727\n",
"Epoch 8/10\n",
"60000/60000 [==============================] - 129s 2ms/step - loss: 0.0710 - acc: 0.9784 - val_loss: 0.1406 - val_acc: 0.9597\n",
"Epoch 9/10\n",
"60000/60000 [==============================] - 129s 2ms/step - loss: 0.0611 - acc: 0.9811 - val_loss: 0.0987 - val_acc: 0.9727\n",
"Epoch 10/10\n",
"60000/60000 [==============================] - 136s 2ms/step - loss: 0.0533 - acc: 0.9837 - val_loss: 0.1070 - val_acc: 0.9718\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7fdd95c86610>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# More layers, a different activation function\n",
"\n",
"num_classes = 10\n",
"\n",
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
"\n",
"x_train = x_train.reshape(60000, 784) # 784 = 28 * 28\n",
"x_test = x_test.reshape(10000, 784)\n",
"x_train = x_train.astype('float32')\n",
"x_test = x_test.astype('float32')\n",
"x_train /= 255\n",
"x_test /= 255\n",
"\n",
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
"\n",
"model3 = Sequential()\n",
"model3.add(Dense(2500, activation='tanh', input_shape=(784,)))\n",
"model3.add(Dense(2000, activation='tanh'))\n",
"model3.add(Dense(1500, activation='tanh'))\n",
"model3.add(Dense(1000, activation='tanh'))\n",
"model3.add(Dense(500, activation='tanh'))\n",
"model3.add(Dense(num_classes, activation='softmax'))\n",
"model3.summary()\n",
"\n",
"model3.compile(loss='categorical_crossentropy',\n",
" optimizer=RMSprop(),\n",
" metrics=['accuracy'])\n",
"\n",
"model3.fit(x_train, y_train,\n",
" batch_size=128,\n",
" epochs=10,\n",
" verbose=1,\n",
" validation_data=(x_test, y_test))"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.107020105763\n",
"Test accuracy: 0.9718\n"
]
}
],
"source": [
"# More layers, a different activation function\n",
"\n",
"score = model3.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print('Test loss: {}'.format(score[0]))\n",
"print('Test accuracy: {}'.format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example: a 4-pixel camera"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" <iframe\n",
" width=\"800\"\n",
" height=\"600\"\n",
" src=\"https://www.youtube.com/embed/ILsA4nyG7I0\"\n",
" frameborder=\"0\"\n",
" allowfullscreen\n",
" ></iframe>\n",
" "
],
"text/plain": [
"<IPython.lib.display.YouTubeVideo at 0x7f4d3ff03110>"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"YouTubeVideo('ILsA4nyG7I0', width=800, height=600)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"def generate_example(description):\n",
"    variant = random.choice([1, -1])\n",
"    if description == 's':  # solid\n",
"        return (np.array([[ 1.0,  1.0], [ 1.0,  1.0]]) if variant == 1 else\n",
"                np.array([[-1.0, -1.0], [-1.0, -1.0]]))\n",
"    elif description == 'v':  # vertical\n",
"        return (np.array([[ 1.0, -1.0], [ 1.0, -1.0]]) if variant == 1 else\n",
"                np.array([[-1.0,  1.0], [-1.0,  1.0]]))\n",
"    elif description == 'd':  # diagonal\n",
"        return (np.array([[ 1.0, -1.0], [-1.0,  1.0]]) if variant == 1 else\n",
"                np.array([[-1.0,  1.0], [ 1.0, -1.0]]))\n",
"    elif description == 'h':  # horizontal\n",
"        return (np.array([[ 1.0,  1.0], [-1.0, -1.0]]) if variant == 1 else\n",
"                np.array([[-1.0, -1.0], [ 1.0,  1.0]]))\n",
"    else:\n",
"        return np.array([[random.uniform(-1, 1), random.uniform(-1, 1)],\n",
"                         [random.uniform(-1, 1), random.uniform(-1, 1)]])"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"num_classes = 4\n",
"\n",
"trainset_size = 4000\n",
"testset_size = 1000\n",
"\n",
"y4_train = np.array([random.choice(['s', 'v', 'd', 'h']) for i in range(trainset_size)])\n",
"x4_train = np.array([generate_example(desc) for desc in y4_train])\n",
"\n",
"y4_test = np.array([random.choice(['s', 'v', 'd', 'h']) for i in range(testset_size)])\n",
"x4_test = np.array([generate_example(desc) for desc in y4_test])"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7f4d3ffc2ed0>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" s s d s h s v\n"
]
}
],
"source": [
"draw_examples(x4_train[:7], captions=y4_train)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"x4_train = x4_train.reshape(trainset_size, 4)\n",
"x4_test = x4_test.reshape(testset_size, 4)\n",
"x4_train = x4_train.astype('float32')\n",
"x4_test = x4_test.astype('float32')\n",
"\n",
"y4_train = np.array([{'s': 0, 'v': 1, 'd': 2, 'h': 3}[desc] for desc in y4_train])\n",
"y4_test = np.array([{'s': 0, 'v': 1, 'd': 2, 'h': 3}[desc] for desc in y4_test])\n",
"\n",
"y4_train = keras.utils.to_categorical(y4_train, num_classes)\n",
"y4_test = keras.utils.to_categorical(y4_test, num_classes)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense_16 (Dense) (None, 4) 20 \n",
"_________________________________________________________________\n",
"dense_17 (Dense) (None, 4) 20 \n",
"_________________________________________________________________\n",
"dense_18 (Dense) (None, 8) 40 \n",
"_________________________________________________________________\n",
"dense_19 (Dense) (None, 4) 36 \n",
"=================================================================\n",
"Total params: 116\n",
"Trainable params: 116\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model4 = Sequential()\n",
"model4.add(Dense(4, activation='tanh', input_shape=(4,)))\n",
"model4.add(Dense(4, activation='tanh'))\n",
"model4.add(Dense(8, activation='relu'))\n",
"model4.add(Dense(num_classes, activation='softmax'))\n",
"model4.summary()"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"model4.layers[0].set_weights(\n",
" [np.array([[ 1.0, 0.0, 1.0, 0.0],\n",
" [ 0.0, 1.0, 0.0, 1.0],\n",
" [ 1.0, 0.0, -1.0, 0.0],\n",
" [ 0.0, 1.0, 0.0, -1.0]],\n",
" dtype=np.float32), np.array([0., 0., 0., 0.], dtype=np.float32)])\n",
"model4.layers[1].set_weights(\n",
" [np.array([[ 1.0, -1.0, 0.0, 0.0],\n",
" [ 1.0, 1.0, 0.0, 0.0],\n",
" [ 0.0, 0.0, 1.0, -1.0],\n",
" [ 0.0, 0.0, -1.0, -1.0]],\n",
" dtype=np.float32), np.array([0., 0., 0., 0.], dtype=np.float32)])\n",
"model4.layers[2].set_weights(\n",
" [np.array([[ 1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],\n",
" [ 0.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0],\n",
" [ 0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0],\n",
" [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, -1.0]],\n",
" dtype=np.float32), np.array([0., 0., 0., 0., 0., 0., 0., 0.], dtype=np.float32)])"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"model4.layers[3].set_weights(\n",
" [np.array([[ 1.0, 0.0, 0.0, 0.0],\n",
" [ 1.0, 0.0, 0.0, 0.0],\n",
" [ 0.0, 1.0, 0.0, 0.0],\n",
" [ 0.0, 1.0, 0.0, 0.0],\n",
" [ 0.0, 0.0, 1.0, 0.0],\n",
" [ 0.0, 0.0, 1.0, 0.0],\n",
" [ 0.0, 0.0, 0.0, 1.0],\n",
" [ 0.0, 0.0, 0.0, 1.0]],\n",
" dtype=np.float32), np.array([0., 0., 0., 0.], dtype=np.float32)])\n",
"\n",
"model4.compile(loss='categorical_crossentropy',\n",
" optimizer=Adagrad(),\n",
" metrics=['accuracy'])"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array([[ 1., 0., 1., 0.],\n",
" [ 0., 1., 0., 1.],\n",
" [ 1., 0., -1., 0.],\n",
" [ 0., 1., 0., -1.]], dtype=float32), array([ 0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., -1., 0., 0.],\n",
" [ 1., 1., 0., 0.],\n",
" [ 0., 0., 1., -1.],\n",
" [ 0., 0., -1., -1.]], dtype=float32), array([ 0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., -1., 0., 0., 0., 0., 0., 0.],\n",
" [ 0., 0., 1., -1., 0., 0., 0., 0.],\n",
" [ 0., 0., 0., 0., 1., -1., 0., 0.],\n",
" [ 0., 0., 0., 0., 0., 0., 1., -1.]], dtype=float32), array([ 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]\n",
"[array([[ 1., 0., 0., 0.],\n",
" [ 1., 0., 0., 0.],\n",
" [ 0., 1., 0., 0.],\n",
" [ 0., 1., 0., 0.],\n",
" [ 0., 0., 1., 0.],\n",
" [ 0., 0., 1., 0.],\n",
" [ 0., 0., 0., 1.],\n",
" [ 0., 0., 0., 1.]], dtype=float32), array([ 0., 0., 0., 0.], dtype=float32)]\n"
]
}
],
"source": [
"for layer in model4.layers:\n",
"    print(layer.get_weights())"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.17831734, 0.17831734, 0.17831734, 0.46504799]], dtype=float32)"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model4.predict([np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4)])"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.765614629269\n",
"Test accuracy: 1.0\n"
]
}
],
"source": [
"score = model4.evaluate(x4_test, y4_test, verbose=0)\n",
"\n",
"print('Test loss: {}'.format(score[0]))\n",
"print('Test accuracy: {}'.format(score[1]))"
]
},
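{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The prediction above can be traced by hand. The following is a sketch of the forward pass for the horizontal pattern $x = (1, 1, -1, -1)$, assuming that each `Dense` layer computes $a(xW + b)$ with the weight matrices set above and zero biases; the symbols $h^{(1)}, h^{(2)}, h^{(3)}, z$ are introduced here only for illustration, and the numbers are rounded:\n",
"\n",
"$$ h^{(1)} = \\tanh(0, 0, 2, 2) \\approx (0, 0, 0.964, 0.964) $$\n",
"$$ h^{(2)} = \\tanh(0, 0, 0, -1.928) \\approx (0, 0, 0, -0.959) $$\n",
"$$ h^{(3)} = \\mathrm{ReLU}(0, 0, 0, 0, 0, 0, -0.959, 0.959) = (0, 0, 0, 0, 0, 0, 0, 0.959) $$\n",
"$$ z = (0, 0, 0, 0.959), \\qquad \\mathrm{softmax}(z) \\approx \\left( \\tfrac{1}{5.608}, \\tfrac{1}{5.608}, \\tfrac{1}{5.608}, \\tfrac{2.608}{5.608} \\right) \\approx (0.178, 0.178, 0.178, 0.465) $$\n",
"\n",
"The correct class (`h`, index 3) gets the highest probability, but only about $0.465$, because the bounded `tanh` activations keep the winning logit below $1$. The corresponding cross-entropy is $-\\ln 0.465 \\approx 0.766$, which agrees with the test loss reported above even though the accuracy is already $1.0$."
]
},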
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense_20 (Dense) (None, 4) 20 \n",
"_________________________________________________________________\n",
"dense_21 (Dense) (None, 4) 20 \n",
"_________________________________________________________________\n",
"dense_22 (Dense) (None, 8) 40 \n",
"_________________________________________________________________\n",
"dense_23 (Dense) (None, 4) 36 \n",
"=================================================================\n",
"Total params: 116\n",
"Trainable params: 116\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model5 = Sequential()\n",
"model5.add(Dense(4, activation='tanh', input_shape=(4,)))\n",
"model5.add(Dense(4, activation='tanh'))\n",
"model5.add(Dense(8, activation='relu'))\n",
"model5.add(Dense(num_classes, activation='softmax'))\n",
"model5.compile(loss='categorical_crossentropy',\n",
" optimizer=RMSprop(),\n",
" metrics=['accuracy'])\n",
"model5.summary()"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Train on 4000 samples, validate on 1000 samples\n",
"Epoch 1/8\n",
"4000/4000 [==============================] - 0s - loss: 1.1352 - acc: 0.5507 - val_loss: 1.0160 - val_acc: 0.7330\n",
"Epoch 2/8\n",
"4000/4000 [==============================] - 0s - loss: 0.8918 - acc: 0.8722 - val_loss: 0.8094 - val_acc: 0.8580\n",
"Epoch 3/8\n",
"4000/4000 [==============================] - 0s - loss: 0.6966 - acc: 0.8810 - val_loss: 0.6283 - val_acc: 0.8580\n",
"Epoch 4/8\n",
"4000/4000 [==============================] - 0s - loss: 0.5284 - acc: 0.8810 - val_loss: 0.4697 - val_acc: 0.8580\n",
"Epoch 5/8\n",
"4000/4000 [==============================] - 0s - loss: 0.3797 - acc: 0.9022 - val_loss: 0.3312 - val_acc: 1.0000\n",
"Epoch 6/8\n",
"4000/4000 [==============================] - 0s - loss: 0.2555 - acc: 1.0000 - val_loss: 0.2166 - val_acc: 1.0000\n",
"Epoch 7/8\n",
"4000/4000 [==============================] - 0s - loss: 0.1612 - acc: 1.0000 - val_loss: 0.1318 - val_acc: 1.0000\n",
"Epoch 8/8\n",
"4000/4000 [==============================] - 0s - loss: 0.0939 - acc: 1.0000 - val_loss: 0.0732 - val_acc: 1.0000\n"
]
},
{
"data": {
"text/plain": [
"<keras.callbacks.History at 0x7f4d34067510>"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model5.fit(x4_train, y4_train, epochs=8, validation_data=(x4_test, y4_test))"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.00708295, 0.00192736, 0.02899081, 0.96199888]], dtype=float32)"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model5.predict([np.array([[1.0, 1.0], [-1.0, -1.0]]).reshape(1, 4)])"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test loss: 0.0731911802292\n",
"Test accuracy: 1.0\n"
]
}
],
"source": [
"score = model5.evaluate(x4_test, y4_test, verbose=0)\n",
"\n",
"print('Test loss: {}'.format(score[0]))\n",
"print('Test accuracy: {}'.format(score[1]))"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import contextlib\n",
"\n",
"@contextlib.contextmanager\n",
"def printoptions(*args, **kwargs):\n",
"    # Temporarily override NumPy print options; restore the originals on exit.\n",
"    original = np.get_printoptions()\n",
"    np.set_printoptions(*args, **kwargs)\n",
"    try:\n",
"        yield\n",
"    finally:\n",
"        np.set_printoptions(**original)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array([[-0.2, -0.5, 0.8, 1. ],\n",
" [-0.9, 0.1, -0.8, 0.2],\n",
" [-0.2, 0.4, 0.1, -0.4],\n",
" [-0.8, 0.8, 1. , 0.3]], dtype=float32), array([ 0. , -0. , 0.1, -0.1], dtype=float32)]\n",
"[array([[-0.4, 0.9, -1.3, 1.7],\n",
" [-0.4, -0.7, 0.3, -0.3],\n",
" [ 0.8, -0.9, -1.1, -0.2],\n",
" [ 1.3, 0.5, 0.4, -0.2]], dtype=float32), array([-0. , -0. , 0.2, 0. ], dtype=float32)]\n",
"[array([[-1.6, 0.3, 0.3, -0.3, -1.1, 1.2, 0.7, -1. ],\n",
" [ 0.4, 1.3, -0.9, 0.8, -0.4, -0.7, -1.2, -1. ],\n",
" [ 0.6, 1. , 0.9, -1. , -1.1, -0.2, -0.4, -0.3],\n",
" [ 1.1, 0.1, -0.9, 1.3, -0.3, -0.2, 0.2, -0.4]], dtype=float32), array([-0. , 0.2, -0.1, 0. , -0.1, -0. , -0.1, 0.1], dtype=float32)]\n",
"[array([[ 0.6, -1.5, 1.3, -1.4],\n",
" [-0.4, -1.6, -0.3, 1.2],\n",
" [ 1.2, 1.1, -0.3, -1.5],\n",
" [ 0.6, 1.4, -1.5, -1.2],\n",
" [ 0.2, -1.3, -0.9, 0.8],\n",
" [ 0.6, -1.5, 0.8, -1. ],\n",
" [ 0.4, -1.3, 0.4, 0.3],\n",
" [-1.3, 0.5, -0.9, 0.8]], dtype=float32), array([-0.8, 0.7, 0.4, 0.1], dtype=float32)]\n"
]
}
],
"source": [
"with printoptions(precision=1, suppress=True):\n",
"    for layer in model5.layers:\n",
"        print(layer.get_weights())"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 12.3. Variants of gradient descent\n",
"\n",
"* Batch gradient descent\n",
"* Stochastic gradient descent\n",
"* Mini-batch gradient descent"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### _Batch gradient descent_\n",
"\n",
"* The classic version of gradient descent\n",
"* The gradient of the cost function is computed over the entire training set:\n",
"  $$ \\theta := \\theta - \\alpha \\cdot \\nabla_\\theta J(\\theta) $$\n",
"* Because every update requires a pass over the whole training set, it can be very slow\n",
"* New examples cannot be added on the fly while the model is being trained (no _online learning_)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### _Stochastic gradient descent_ (SGD)\n",
"\n",
"* The parameters are updated after every single training example:\n",
"    $$ \\theta := \\theta - \\alpha \\cdot \\nabla_\\theta \\, J \\! \\left( \\theta, x^{(i)}, y^{(i)} \\right) $$\n",
"* Each update is much cheaper than in _batch gradient descent_, so training usually progresses much faster\n",
"* New examples can be added on the fly during training (_online learning_)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### _Stochastic gradient descent_ (SGD)\n",
"\n",
"* Frequent parameter updates with high variance:\n",
"\n",
"<img src=\"http://ruder.io/content/images/2016/09/sgd_fluctuation.png\" style=\"margin: auto;\" width=\"50%\" />\n",
"\n",
"* On the one hand, this noise keeps SGD from getting stuck in poor local minima; on the other hand, it can make SGD “jump out” of a good minimum"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### _Mini-batch gradient descent_\n",
"\n",
"* A compromise between _batch gradient descent_ and SGD: each update uses a batch of $n$ examples\n",
"    $$ \\theta := \\theta - \\alpha \\cdot \\nabla_\\theta \\, J \\left( \\theta, x^{(i : i+n)}, y^{(i : i+n)} \\right) $$\n",
"* More stable convergence, because averaging over a batch reduces the variance of the parameter updates\n",
"* Faster than classic _batch gradient descent_\n",
"* Typical batch size: between 50 and 256 examples"
]
},
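{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The three variants differ only in how many examples feed each update, which the minimal NumPy sketch here tries to make concrete. It is an illustration under stated assumptions, not code from this lecture: the function `minibatch_gd`, the linear-regression cost and the toy data are hypothetical choices made for this example.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def minibatch_gd(X, y, batch_size, alpha=0.1, epochs=200, seed=0):\n",
"    # Fit linear regression y ~ X @ theta by gradient descent.\n",
"    # batch_size == len(X) -> batch gradient descent\n",
"    # batch_size == 1      -> stochastic gradient descent (SGD)\n",
"    # anything in between  -> mini-batch gradient descent\n",
"    rng = np.random.default_rng(seed)\n",
"    m, n = X.shape\n",
"    theta = np.zeros(n)\n",
"    for _ in range(epochs):\n",
"        order = rng.permutation(m)  # reshuffle the examples every epoch\n",
"        for start in range(0, m, batch_size):\n",
"            idx = order[start:start + batch_size]\n",
"            residual = X[idx] @ theta - y[idx]\n",
"            # gradient of J = mean(residual ** 2) / 2 on the current batch\n",
"            grad = X[idx].T @ residual / len(idx)\n",
"            theta = theta - alpha * grad\n",
"    return theta\n",
"\n",
"\n",
"# Hypothetical toy data: y = 1 + 2 x plus a little noise.\n",
"rng = np.random.default_rng(42)\n",
"x = rng.uniform(-1, 1, size=200)\n",
"X = np.column_stack([np.ones_like(x), x])\n",
"y = 1.0 + 2.0 * x + 0.01 * rng.normal(size=200)\n",
"\n",
"for name, size in [('batch', len(X)), ('SGD', 1), ('mini-batch', 32)]:\n",
"    print(name, minibatch_gd(X, y, batch_size=size))\n",
"```\n",
"\n",
"All three runs should end close to $\\theta = (1, 2)$; they differ in how many updates they perform per epoch and in how noisy each update is."
]
},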
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Drawbacks of classic gradient descent, or why we need optimization algorithms\n",
"\n",
"* Choosing a suitable learning rate is difficult\n",
"* A single fixed learning rate is used for all parameters\n",
"* The cost function of a neural network is not convex, so training can get stuck in a poor local minimum or at a saddle point"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 11.4. Optimization algorithms for gradient descent\n",
"\n",
"* Momentum\n",
"* Nesterov Accelerated Gradient\n",
"* Adagrad\n",
"* Adadelta\n",
"* RMSprop\n",
"* Adam\n",
"* Nadam\n",
"* AMSGrad"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Momentum\n",
"\n",
"* SGD copes poorly with “ravines” in the cost function\n",
"* Momentum addresses this by adding a coefficient $\\gamma$, which can be thought of as the “momentum” of a ball rolling downhill:\n",
"    $$ v_t := \\gamma \\, v_{t-1} + \\alpha \\, \\nabla_\\theta J(\\theta) $$\n",
"    $$ \\theta := \\theta - v_t $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Nesterov Accelerated Gradient (NAG)\n",
"\n",
"* Momentum sometimes lets the ball accelerate out of control, which makes it “harder to steer”\n",
"* Nesterov adds a “brake” to the ball with momentum: the gradient is evaluated at the look-ahead point $\\theta - \\gamma \\, v_{t-1}$, which slows the ball down before the slope rises again:\n",
"    $$ v_t := \\gamma \\, v_{t-1} + \\alpha \\, \\nabla_\\theta J(\\theta - \\gamma \\, v_{t-1}) $$\n",
"    $$ \\theta := \\theta - v_t $$"
]
},
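{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The Momentum and NAG update rules differ only in where the gradient is evaluated, so both fit in one short function. The sketch is a hedged illustration: `momentum_descent`, the quadratic “ravine” and the hyperparameter values are assumptions made for this example and are not taken from the lecture.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def momentum_descent(grad, theta0, alpha=0.02, gamma=0.9, steps=300, nesterov=False):\n",
"    # Minimise a function given its gradient `grad`.\n",
"    # gamma == 0 reduces the update to plain gradient descent.\n",
"    theta = np.asarray(theta0, dtype=float)\n",
"    v = np.zeros_like(theta)\n",
"    for _ in range(steps):\n",
"        # NAG evaluates the gradient at the look-ahead point theta - gamma * v\n",
"        point = theta - gamma * v if nesterov else theta\n",
"        v = gamma * v + alpha * grad(point)\n",
"        theta = theta - v\n",
"    return theta\n",
"\n",
"\n",
"# Hypothetical ravine: J(theta) = 0.5 * (theta_0 ** 2 + 50 * theta_1 ** 2)\n",
"def ravine_grad(theta):\n",
"    return np.array([1.0, 50.0]) * theta\n",
"\n",
"\n",
"start = [10.0, 1.0]\n",
"print('plain GD:', momentum_descent(ravine_grad, start, gamma=0.0))\n",
"print('momentum:', momentum_descent(ravine_grad, start))\n",
"print('NAG:     ', momentum_descent(ravine_grad, start, nesterov=True))\n",
"```\n",
"\n",
"With the same learning rate and number of steps, both momentum variants should end much closer to the minimum at the origin than plain gradient descent, because the accumulated velocity speeds up progress along the flat direction of the ravine."
]
},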
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Adagrad\n",
"\n",
"* “<b>Ada</b>ptive <b>grad</b>ient”\n",
"* Adagrad adapts the learning rate to each parameter: it lowers the rate for parameters tied to frequently occurring features and raises it for those tied to rare features\n",
"* Works very well for training on sparse datasets\n",
"* Drawback: the learning rate can sometimes shrink rapidly"
]
},
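{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A worked form of the Adagrad update may make the behaviour described above more concrete. This is the commonly cited rule written in this lecture's notation; the symbols $g_t$, $G_t$ and $\\epsilon$ are not used elsewhere in these slides and are introduced only for this sketch. All operations are element-wise, so every parameter gets its own effective learning rate:\n",
"\n",
"$$ g_t := \\nabla_\\theta J(\\theta) $$\n",
"$$ G_t := G_{t-1} + g_t^2 $$\n",
"$$ \\theta := \\theta - \\frac{\\alpha}{\\sqrt{G_t + \\epsilon}} \\, g_t $$\n",
"\n",
"Here $G_0 = 0$ and $\\epsilon$ is a small constant that prevents division by zero. Since $G_t$ is a sum of squares, it never decreases, so the effective step $\\alpha / \\sqrt{G_t + \\epsilon}$ can only get smaller as training goes on."
]
},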
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Adadelta and RMSprop\n",
"* Variants of Adagrad that address the problem of its rapidly shrinking learning rate"
]
},
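{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"As a hedged sketch of how these variants work, the RMSprop update can be written as follows. The symbols $g_t$, $E[g^2]_t$, $\\rho$ and $\\epsilon$ are introduced here for illustration only; $g_t$ denotes the current gradient $\\nabla_\\theta J(\\theta)$ and all operations are element-wise:\n",
"\n",
"$$ E[g^2]_t := \\rho \\, E[g^2]_{t-1} + (1 - \\rho) \\, g_t^2 $$\n",
"$$ \\theta := \\theta - \\frac{\\alpha}{\\sqrt{E[g^2]_t + \\epsilon}} \\, g_t $$\n",
"\n",
"The ever-growing sum of squared gradients is replaced by an exponentially decaying average (a commonly used value is $\\rho = 0.9$), so old gradients are gradually forgotten and the effective learning rate can recover. Adadelta uses the same running average and additionally replaces the global rate $\\alpha$ with a running average of past parameter updates."
]
},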
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Adam\n",
"\n",
"* “<b>Ada</b>ptive <b>m</b>oment estimation”\n",
"* Combines the advantages of RMSprop and Momentum\n",
"* It can be compared to a heavy ball with friction\n",
"* Currently one of the most popular optimization algorithms"
]
},
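{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"To show how Adam combines the two ideas, here is a minimal NumPy sketch of its standard update rule. The function `adam_minimize`, the test function and the step size `alpha=0.1` are assumptions made for this example; the second-moment estimate is called `s` here only to avoid a clash with the velocity $v_t$ used on the Momentum slides.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def adam_minimize(grad, theta0, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):\n",
"    # Minimise a function given its gradient `grad` with the Adam update rule.\n",
"    theta = np.asarray(theta0, dtype=float)\n",
"    m = np.zeros_like(theta)  # running mean of gradients (the Momentum part)\n",
"    s = np.zeros_like(theta)  # running mean of squared gradients (the RMSprop part)\n",
"    for t in range(1, steps + 1):\n",
"        g = grad(theta)\n",
"        m = beta1 * m + (1 - beta1) * g\n",
"        s = beta2 * s + (1 - beta2) * g ** 2\n",
"        # bias correction: m and s start at zero, so early values are too small\n",
"        m_hat = m / (1 - beta1 ** t)\n",
"        s_hat = s / (1 - beta2 ** t)\n",
"        theta = theta - alpha * m_hat / (np.sqrt(s_hat) + eps)\n",
"    return theta\n",
"\n",
"\n",
"# Hypothetical test function: J(theta) = 0.5 * (theta_0 ** 2 + 50 * theta_1 ** 2)\n",
"print(adam_minimize(lambda th: np.array([1.0, 50.0]) * th, [10.0, 1.0]))\n",
"```\n",
"\n",
"Dividing by `np.sqrt(s_hat)` gives each parameter its own step size, so the steep and the flat direction of the test function are both handled with a single value of `alpha`."
]
},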
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Nadam\n",
"* “<b>N</b>esterov-accelerated <b>ada</b>ptive <b>m</b>oment estimation”\n",
"* Combines the advantages of Adam and Nesterov Accelerated Gradient"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### AMSGrad\n",
"* A variant of Adam that is better suited to tasks such as object recognition or machine translation, where plain Adam has been reported to converge poorly"
]
},
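{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A hedged sketch of what AMSGrad changes relative to Adam. The symbols are introduced here for illustration only: $m_t$ and $s_t$ denote Adam's exponentially decaying averages of the gradients and of the squared gradients, and $\\epsilon$ is a small constant. AMSGrad normalises the step by the largest second-moment estimate seen so far instead of the current one:\n",
"\n",
"$$ \\hat{s}_t := \\max(\\hat{s}_{t-1}, \\, s_t) $$\n",
"$$ \\theta := \\theta - \\frac{\\alpha}{\\sqrt{\\hat{s}_t} + \\epsilon} \\, m_t $$\n",
"\n",
"Because $\\hat{s}_t$ never decreases, the effective learning rate of each parameter never increases, which prevents the step size from growing again after a run of small gradients."
]
},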
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"contours_evaluation_optimizers.gif\" style=\"margin: auto;\" width=\"80%\" />"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"saddle_point_evaluation_optimizers.gif\" style=\"margin: auto;\" width=\"80%\" />"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
},
"livereveal": {
"start_slideshow_at": "selected",
"theme": "amu"
}
},
"nbformat": 4,
"nbformat_minor": 4
}