Lectures 10 and 11

This commit is contained in:
Paweł Skórzewski 2022-05-20 09:39:29 +02:00
parent 52896c2a9d
commit 3405d80635
2 changed files with 80 additions and 58 deletions

View File

@@ -266,7 +266,7 @@
     }
    },
    "source": [
-    "$$ f(x_1, x_2) = \\max(x_1 + x_2) \\hskip{12em} \\\\\n",
+    "$$ f(x_1, x_2) = \\max(x_1, x_2) \\hskip{12em} \\\\\n",
     "\\to \\qquad \\frac{\\partial f}{\\partial x_1} = \\mathbb{1}_{x_1 \\geq x_2}, \\quad \\frac{\\partial f}{\\partial x_2} = \\mathbb{1}_{x_2 \\geq x_1}, \\quad \\nabla f = (\\mathbb{1}_{x_1 \\geq x_2}, \\mathbb{1}_{x_2 \\geq x_1}) $$ "
    ]
   },
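The indicator-function gradient above can be checked numerically; a minimal sketch (not part of this commit) using TensorFlow's autodiff:

    import tensorflow as tf

    # f(x1, x2) = max(x1, x2); for x1 > x2 the gradient should be (1, 0)
    x1 = tf.Variable(3.0)
    x2 = tf.Variable(1.0)
    with tf.GradientTape() as tape:
        f = tf.maximum(x1, x2)
    grads = tape.gradient(f, [x1, x2])
    print([g.numpy() for g in grads])  # [1.0, 0.0]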
@@ -755,7 +755,7 @@
     "\n",
     "A single iteration (a code sketch follows this hunk):\n",
     "* For the parameters $\\Theta = (\\Theta^{(1)},\\ldots,\\Theta^{(L)})$, create auxiliary zero matrices $\\Delta = (\\Delta^{(1)},\\ldots,\\Delta^{(L)})$ of the same dimensions (the weights $\\beta$ are omitted for simplicity).\n",
-    "* For the $m$ examples in a _batch_, $i = 1,\\ldots,m$:\n",
+    "* For the $m$ examples in a *batch*, $i = 1,\\ldots,m$:\n",
     "  * Run the backpropagation algorithm for the example $(x^{(i)}, y^{(i)})$ and store its gradients $\\nabla_{\\Theta}J^{(i)}(\\Theta)$;\n",
     "  * $\\Delta := \\Delta + \\dfrac{1}{m}\\nabla_{\\Theta}J^{(i)}(\\Theta)$\n",
     "* Update the weights: $\\Theta := \\Theta - \\alpha \\Delta$"
@@ -969,7 +969,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 5,
    "metadata": {
     "scrolled": true,
     "slideshow": {
@@ -981,19 +981,15 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Model: \"sequential_1\"\n",
+      "Model: \"sequential\"\n",
       "_________________________________________________________________\n",
       "Layer (type)                 Output Shape              Param #   \n",
       "=================================================================\n",
-      "dense_3 (Dense)              (None, 512)               401920    \n",
+      "dense (Dense)                (None, 512)               401920    \n",
       "_________________________________________________________________\n",
-      "dropout (Dropout)            (None, 512)               0         \n",
+      "dense_1 (Dense)              (None, 512)               262656    \n",
       "_________________________________________________________________\n",
-      "dense_4 (Dense)              (None, 512)               262656    \n",
-      "_________________________________________________________________\n",
-      "dropout_1 (Dropout)          (None, 512)               0         \n",
-      "_________________________________________________________________\n",
-      "dense_5 (Dense)              (None, 10)                5130      \n",
+      "dense_2 (Dense)              (None, 10)                5130      \n",
       "=================================================================\n",
       "Total params: 669,706\n",
       "Trainable params: 669,706\n",
@ -1004,10 +1000,8 @@
], ],
"source": [ "source": [
"model = keras.Sequential()\n", "model = keras.Sequential()\n",
"model.add(Dense(512, activation='relu', input_shape=(784,)))\n", "model.add(Dense(512, activation='tanh', input_shape=(784,)))\n",
"model.add(Dropout(0.2))\n", "model.add(Dense(512, activation='tanh'))\n",
"model.add(Dense(512, activation='relu'))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(num_classes, activation='softmax'))\n", "model.add(Dense(num_classes, activation='softmax'))\n",
"\n", "\n",
"model.summary() # wyświetl podsumowanie architektury sieci" "model.summary() # wyświetl podsumowanie architektury sieci"
@ -1015,7 +1009,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10, "execution_count": 6,
"metadata": { "metadata": {
"slideshow": { "slideshow": {
"slide_type": "subslide" "slide_type": "subslide"
@@ -1036,55 +1030,28 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
+   "execution_count": 7,
+   "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Epoch 1/10\n",
-      "469/469 [==============================] - 20s 42ms/step - loss: 0.0957 - accuracy: 0.9708 - val_loss: 0.0824 - val_accuracy: 0.9758\n",
-      "Epoch 2/10\n",
-      "469/469 [==============================] - 20s 43ms/step - loss: 0.0693 - accuracy: 0.9793 - val_loss: 0.0807 - val_accuracy: 0.9772\n",
-      "Epoch 3/10\n",
-      "469/469 [==============================] - 18s 38ms/step - loss: 0.0563 - accuracy: 0.9827 - val_loss: 0.0861 - val_accuracy: 0.9758\n",
-      "Epoch 4/10\n",
-      "469/469 [==============================] - 18s 37ms/step - loss: 0.0485 - accuracy: 0.9857 - val_loss: 0.0829 - val_accuracy: 0.9794\n",
-      "Epoch 5/10\n",
-      "469/469 [==============================] - 19s 41ms/step - loss: 0.0428 - accuracy: 0.9876 - val_loss: 0.0955 - val_accuracy: 0.9766\n",
-      "Epoch 6/10\n",
-      "469/469 [==============================] - 22s 47ms/step - loss: 0.0377 - accuracy: 0.9887 - val_loss: 0.0809 - val_accuracy: 0.9794\n",
-      "Epoch 7/10\n",
-      "469/469 [==============================] - 17s 35ms/step - loss: 0.0338 - accuracy: 0.9904 - val_loss: 0.1028 - val_accuracy: 0.9788\n",
-      "Epoch 8/10\n",
-      "469/469 [==============================] - 17s 36ms/step - loss: 0.0322 - accuracy: 0.9911 - val_loss: 0.0937 - val_accuracy: 0.9815\n",
-      "Epoch 9/10\n",
-      "469/469 [==============================] - 18s 37ms/step - loss: 0.0303 - accuracy: 0.9912 - val_loss: 0.0916 - val_accuracy: 0.9829\n",
-      "Epoch 10/10\n",
-      "469/469 [==============================] - 16s 34ms/step - loss: 0.0263 - accuracy: 0.9926 - val_loss: 0.0958 - val_accuracy: 0.9812\n"
+      "[[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]\n",
+      " [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
+      " [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]\n",
+      " [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
+      " [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]\n",
+      " [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]\n",
+      " [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
+      " [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]\n",
+      " [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
+      " [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]\n"
      ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "<tensorflow.python.keras.callbacks.History at 0x228eac95ac0>"
-      ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
     }
    ],
    "source": [
-    "model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.RMSprop(), metrics=['accuracy'])\n",
-    "\n",
-    "model.fit(x_train, y_train, batch_size=128, epochs=10, verbose=1,\n",
-    "          validation_data=(x_test, y_test))"
+    "print(y_train[:10])"
    ]
   },
   {
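The printed rows are one-hot label vectors for the first ten MNIST digits (5, 0, 4, 1, 9, 2, 1, 3, 1, 4). A sketch of how such labels are typically produced; the assumption here is that the notebook encodes them with keras.utils.to_categorical:

    import numpy as np
    from tensorflow.keras.utils import to_categorical

    y_raw = np.array([5, 0, 4, 1, 9])             # first few MNIST labels
    print(to_categorical(y_raw, num_classes=10))  # one 1.0 per row, at the label's index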
@@ -1100,8 +1067,61 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Test loss: 0.0757974311709404\n",
-      "Test accuracy: 0.9810000061988831\n"
+      "Epoch 1/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.2807 - accuracy: 0.9158 - val_loss: 0.1509 - val_accuracy: 0.9550\n",
+      "Epoch 2/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.1242 - accuracy: 0.9619 - val_loss: 0.1076 - val_accuracy: 0.9677\n",
+      "Epoch 3/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.0812 - accuracy: 0.9752 - val_loss: 0.0862 - val_accuracy: 0.9723\n",
+      "Epoch 4/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.0587 - accuracy: 0.9820 - val_loss: 0.0823 - val_accuracy: 0.9727\n",
+      "Epoch 5/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.0416 - accuracy: 0.9870 - val_loss: 0.0735 - val_accuracy: 0.9763\n",
+      "Epoch 6/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.0318 - accuracy: 0.9897 - val_loss: 0.0723 - val_accuracy: 0.9761\n",
+      "Epoch 7/10\n",
+      "469/469 [==============================] - 11s 23ms/step - loss: 0.0215 - accuracy: 0.9940 - val_loss: 0.0685 - val_accuracy: 0.9792\n",
+      "Epoch 8/10\n",
+      "469/469 [==============================] - 11s 23ms/step - loss: 0.0189 - accuracy: 0.9943 - val_loss: 0.0705 - val_accuracy: 0.9786\n",
+      "Epoch 9/10\n",
+      "469/469 [==============================] - 11s 24ms/step - loss: 0.0148 - accuracy: 0.9957 - val_loss: 0.0674 - val_accuracy: 0.9790\n",
+      "Epoch 10/10\n",
+      "469/469 [==============================] - 11s 23ms/step - loss: 0.0092 - accuracy: 0.9978 - val_loss: 0.0706 - val_accuracy: 0.9798\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<tensorflow.python.keras.callbacks.History at 0x1bde5f96b50>"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])\n",
+    "\n",
+    "model.fit(x_train, y_train, batch_size=128, epochs=10, verbose=1,\n",
+    "          validation_data=(x_test, y_test))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Test loss: 0.07055816799402237\n",
+      "Test accuracy: 0.9797999858856201\n"
      ]
     }
    ],
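The test metrics above are output only; the producing cell is not part of this diff. A sketch of the usual way such numbers are obtained in Keras (an assumption, not the notebook's code):

    score = model.evaluate(x_test, y_test, verbose=0)  # returns [loss, accuracy]
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])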

View File

@@ -31,6 +31,8 @@
     }
    },
    "source": [
+    "* A composition of linear functions is itself a linear function.\n",
+    "* The main job of an activation function is to introduce nonlinearity into the neural network, so that the model can capture more than just linear relationships in the data.\n",
     "* Every activation function has its advantages and disadvantages.\n",
     "* Different kinds of activation functions suit different applications."
    ]
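A small numeric illustration of the first added bullet (a sketch, not part of the commit): stacking linear layers without activations collapses to a single linear map, so depth alone adds no expressive power.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=3)
    W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
    W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

    two_layers = W2 @ (W1 @ x + b1) + b2
    W, b = W2 @ W1, W2 @ b1 + b2               # equivalent single linear layer
    print(np.allclose(two_layers, W @ x + b))  # True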