{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Machine learning\n",
"# 3. Linear regression, part 2"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 3.1. Multivariate linear regression"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"To predict the value of $y$, we can use more than one feature $x$:"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Example: flat prices"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" y:price x1:isNew x2:rooms x3:floor x4:location x5:sqrMetres\n",
"1 476118.0 False 3 1 Centrum 78\n",
"2 459531.0 False 3 2 Sołacz 62\n",
"3 411557.0 False 3 0 Sołacz 15\n",
"4 496416.0 False 4 0 Sołacz 14\n",
"5 406032.0 False 3 0 Sołacz 15\n",
"... ... ... ... ... ... ...\n",
"1335 349000.0 False 4 0 Szczepankowo 29\n",
"1336 399000.0 False 5 0 Szczepankowo 68\n",
"1337 234000.0 True 2 7 Wilda 50\n",
"1338 210000.0 True 2 1 Wilda 65\n",
"1339 279000.0 True 2 2 Łazarz 36\n",
"\n",
"[1339 rows x 6 columns]\n"
]
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Load the training data (tab-separated values)\n",
"data = pd.read_csv(\"data_flats_train.tsv\", sep=\"\\t\")\n",
"# Prefix the column names: the first column is the target y, the others are features x1, x2, ...\n",
"data.rename(\n",
"    columns={\n",
"        col: f\"x{i}:{col}\" if i > 0 else f\"y:{col}\"\n",
"        for i, col in enumerate(data.columns)\n",
"    },\n",
"    inplace=True,\n",
")\n",
"# Number the examples from 1, to match the notation x^(1), x^(2), ...\n",
"data.index = np.arange(1, len(data) + 1)\n",
"print(data)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ x^{(2)} = ({\\rm \"False\"}, 3, 2, {\\rm \"Sołacz\"}, 62), \\quad x_3^{(2)} = 2 $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Hypothesis"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"In our case (we chose 5 features):\n",
"\n",
"$$ h_\\theta(x) = \\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\theta_3 x_3 + \\theta_4 x_4 + \\theta_5 x_5 $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"In general ($n$ features):\n",
"\n",
"$$ h_\\theta(x) = \\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\ldots + \\theta_n x_n $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"If we define $x_0 = 1$, we can write the formula above more compactly:\n",
"\n",
"$$\n",
"\\begin{array}{rcl}\n",
"h_\\theta(x)\n",
" & = & \\theta_0 x_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\ldots + \\theta_n x_n \\\\\n",
" & = & \\displaystyle\\sum_{i=0}^{n} \\theta_i x_i \\\\\n",
" & = & \\theta^T \\, x \\\\\n",
" & = & x^T \\, \\theta \\\\\n",
"\\end{array}\n",
"$$\n",
"\n",
"($x$ denotes a single example from the training set)."
]
},
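{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"The compact form above can be checked numerically. The cell below is a minimal sketch, assuming plain NumPy arrays rather than the `np.matrix` objects used in the rest of this notebook. The names `theta_demo` and `x_demo` and all the numbers are made up for illustration; the only point is that the explicit sum and the product $\\theta^T x$ agree once $x_0 = 1$ is prepended."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical parameters and a single example (made-up numbers, n = 3 features)\n",
"theta_demo = np.array([10.0, 90.0, -1.0, 2.5])  # theta_0 ... theta_3\n",
"features_demo = np.array([3.0, 1.0, 78.0])  # x_1 ... x_3\n",
"\n",
"# Prepend x_0 = 1 so that theta_0 is treated like every other parameter\n",
"x_demo = np.concatenate(([1.0], features_demo))\n",
"\n",
"# Explicit form: theta_0 + theta_1 * x_1 + ... + theta_n * x_n\n",
"explicit_sum = theta_demo[0] + sum(\n",
"    theta_demo[i] * x_demo[i] for i in range(1, len(x_demo))\n",
")\n",
"# Compact form: theta^T x\n",
"dot_product = theta_demo @ x_demo\n",
"\n",
"print(explicit_sum, dot_product)\n",
"assert np.isclose(explicit_sum, dot_product)\n"
]
},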
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Gradient descent: matrix notation"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Gradient descent takes a very elegant form when it is written with vectors and matrices."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$\n",
"X=\\left[\\begin{array}{cc}\n",
"1 & \\left( \\vec x^{(1)} \\right)^T \\\\\n",
"1 & \\left( \\vec x^{(2)} \\right)^T \\\\\n",
"\\vdots & \\vdots\\\\\n",
"1 & \\left( \\vec x^{(m)} \\right)^T \\\\\n",
"\\end{array}\\right] \n",
"= \\left[\\begin{array}{cccc}\n",
"1 & x_1^{(1)} & \\cdots & x_n^{(1)} \\\\\n",
"1 & x_1^{(2)} & \\cdots & x_n^{(2)} \\\\\n",
"\\vdots & \\vdots & \\ddots & \\vdots\\\\\n",
"1 & x_1^{(m)} & \\cdots & x_n^{(m)} \\\\\n",
"\\end{array}\\right]\n",
"\\quad\n",
"\\vec{y} = \n",
"\\left[\\begin{array}{c}\n",
"y^{(1)}\\\\\n",
"y^{(2)}\\\\\n",
"\\vdots\\\\\n",
"y^{(m)}\\\\\n",
"\\end{array}\\right]\n",
"\\quad\n",
"\\theta = \\left[\\begin{array}{c}\n",
"\\theta_0\\\\\n",
"\\theta_1\\\\\n",
"\\vdots\\\\\n",
"\\theta_n\\\\\n",
"\\end{array}\\right]\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"$$h_\\theta(X) = X \\theta$$\n",
"\n",
"($X$ denotes the matrix that holds the features of all examples in the training set)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"# Matrix versions of the functions that draw scatter plots and the regression line\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"\n",
"def h(theta, x):\n",
"    \"\"\"Hypothesis in matrix form; x and theta are expected to be np.matrix objects\"\"\"\n",
"    return x * theta\n",
"\n",
"\n",
"def regdots(x, y, xlabel=\"\", ylabel=\"\"):\n",
"    \"\"\"Draw a scatter plot of the data and return the figure\"\"\"\n",
"    fig = plt.figure(figsize=(16 * 0.6, 9 * 0.6))\n",
"    ax = fig.add_subplot(111)\n",
"    fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
"    ax.scatter([x], [y], c=\"r\", s=50, label=\"Data\")\n",
"\n",
"    ax.set_xlabel(xlabel)\n",
"    ax.set_ylabel(ylabel)\n",
"    ax.margins(0.05, 0.05)\n",
"    plt.ylim(y.min() - 1, y.max() + 1)\n",
"    plt.xlim(np.min(x) - 1, np.max(x) + 1)\n",
"    return fig\n",
"\n",
"\n",
"def regline(fig, fun, theta, x, y, cost_fun):\n",
"    \"\"\"Draw the regression line of a single-feature model on an existing figure\"\"\"\n",
"    ax = fig.axes[0]\n",
"    x_min = np.min(x)\n",
"    x_max = np.max(x)\n",
"    x_range = [x_min, x_max]\n",
"    x_matrix = np.matrix([1, x_min, 1, x_max]).reshape(2, 2)\n",
"    cost = cost_fun(theta, x, y)\n",
"    # Extract plain floats so that the format specifiers below also work for np.matrix\n",
"    theta0, theta1 = np.asarray(theta, dtype=float).ravel()[:2]\n",
"    ax.plot(\n",
"        x_range,\n",
"        fun(theta, x_matrix),\n",
"        linewidth=2,\n",
"        label=(\n",
"            r\"$y={theta0:.1f}{op}{theta1:.1f}x, \\; J(\\theta)={cost:.3f}$\".format(\n",
"                theta0=theta0,\n",
"                theta1=abs(theta1),\n",
"                op=\"+\" if theta1 >= 0 else \"-\",\n",
"                cost=cost,\n",
"            )\n",
"        ),\n",
"    )\n",
"\n",
"\n",
"def legend(fig):\n",
"    \"\"\"Add a legend to the figure\"\"\"\n",
"    ax = fig.axes[0]\n",
"    handles, labels = ax.get_legend_handles_labels()\n",
"    # try-except block is a fix for a bug in Poly3DCollection\n",
"    try:\n",
"        fig.legend(handles, labels, fontsize=15, loc=\"lower right\")\n",
"    except AttributeError:\n",
"        pass\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X[:5]=matrix([[ 1.,  3.,  1., 78.],\n",
"        [ 1.,  3.,  2., 62.],\n",
"        [ 1.,  3.,  0., 15.],\n",
"        [ 1.,  4.,  0., 14.],\n",
"        [ 1.,  3.,  0., 15.]])\n",
"X.shape=(1339, 4)\n",
"y[:5]=matrix([[476118.],\n",
"        [459531.],\n",
"        [411557.],\n",
"        [496416.],\n",
"        [406032.]])\n",
"y.shape=(1339, 1)\n"
]
}
],
"source": [
"# Load data from file: multivariate linear regression, matrix notation\n",
"\n",
"import pandas as pd\n",
"\n",
"data = pd.read_csv(\n",
"    \"data_flats_train.tsv\",\n",
"    delimiter=\"\\t\",\n",
"    usecols=[\"price\", \"rooms\", \"floor\", \"sqrMetres\"],\n",
")\n",
"m, n_plus_1 = data.values.shape\n",
"n = n_plus_1 - 1\n",
"Xn = data.values[:, 1:].reshape(m, n)\n",
"\n",
"# Add a column of ones to the matrix\n",
"X = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n_plus_1)\n",
"y = np.matrix(data.values[:, 0]).reshape(m, 1)\n",
"\n",
"print(f\"{X[:5]=}\")\n",
"print(f\"{X.shape=}\")\n",
"print(f\"{y[:5]=}\")\n",
"print(f\"{y.shape=}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Cost function: matrix notation"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$J(\\theta)=\\dfrac{1}{2|\\vec y|}\\left(X\\theta-\\vec{y}\\right)^T\\left(X\\theta-\\vec{y}\\right)$$ \n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle \\Large J(\\theta) = 85104141370.9717$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import display, Math, Latex\n",
"\n",
"\n",
"def J(theta, X, y):\n",
"    \"\"\"Matrix version of the cost function\"\"\"\n",
"    m = len(y)\n",
"    cost = 1.0 / (2.0 * m) * ((X * theta - y).T * (X * theta - y))\n",
"    return cost.item()\n",
"\n",
"\n",
"theta = np.matrix([10, 90, -1, 2.5]).reshape(4, 1)\n",
"\n",
"cost = J(theta, X, y)\n",
"display(Math(r\"\\Large J(\\theta) = %.4f\" % cost))\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Gradient: matrix notation"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$\\nabla J(\\theta) = \\frac{1}{|\\vec y|} X^T\\left(X\\theta-\\vec y\\right)$$"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Display a matrix in LaTeX\n",
"\n",
"\n",
"def latex_matrix(matrix):\n",
"    ltx = r\"\\left[\\begin{array}\"\n",
"    m, n = matrix.shape\n",
"    ltx += \"{\" + (\"r\" * n) + \"}\"\n",
"    for i in range(m):\n",
"        # Flatten the row so that matrices with more than one column are handled as well\n",
"        row = np.asarray(matrix[i]).ravel()\n",
"        ltx += r\" & \".join([(\"%.4f\" % j) for j in row]) + r\" \\\\ \"\n",
"    ltx += r\"\\end{array}\\right]\"\n",
"    return ltx\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle \\large \\theta = \\left[\\begin{array}{r}10.0000 \\\\ 90.0000 \\\\ -1.0000 \\\\ 2.5000 \\\\ \\end{array}\\right]\\quad\\large \\nabla J(\\theta) = \\left[\\begin{array}{r}-373492.7442 \\\\ -1075656.5086 \\\\ -989554.4921 \\\\ -23806475.6561 \\\\ \\end{array}\\right]$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import display, Math, Latex\n",
"\n",
"\n",
"def dJ(theta, X, y):\n",
"    \"\"\"Matrix version of the gradient of the cost function\"\"\"\n",
"    return 1.0 / len(y) * (X.T * (X * theta - y))\n",
"\n",
"\n",
"theta = np.matrix([10, 90, -1, 2.5]).reshape(4, 1)\n",
"\n",
"display(\n",
"    Math(\n",
"        r\"\\large \\theta = \"\n",
"        + latex_matrix(theta)\n",
"        + r\"\\quad\"\n",
"        + r\"\\large \\nabla J(\\theta) = \"\n",
"        + latex_matrix(dJ(theta, X, y))\n",
"    )\n",
")\n"
]
},
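{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"One way to sanity-check the gradient formula above is to compare it with a finite-difference approximation of the cost. The cell below is a self-contained sketch under stated assumptions: it uses plain NumPy arrays and small synthetic data instead of the flats dataset, and `cost_demo`, `gradient_demo` and `numerical_gradient_demo` are hypothetical helper names that are not part of this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def cost_demo(theta, X, y):\n",
"    # J(theta) = 1 / (2m) * (X theta - y)^T (X theta - y)\n",
"    residual = X @ theta - y\n",
"    return (residual.T @ residual).item() / (2 * len(y))\n",
"\n",
"\n",
"def gradient_demo(theta, X, y):\n",
"    # grad J(theta) = 1 / m * X^T (X theta - y)\n",
"    return X.T @ (X @ theta - y) / len(y)\n",
"\n",
"\n",
"def numerical_gradient_demo(theta, X, y, h=1e-6):\n",
"    # Central differences, one coordinate of theta at a time\n",
"    grad = np.zeros_like(theta)\n",
"    for i in range(theta.shape[0]):\n",
"        step = np.zeros_like(theta)\n",
"        step[i, 0] = h\n",
"        grad[i, 0] = (\n",
"            cost_demo(theta + step, X, y) - cost_demo(theta - step, X, y)\n",
"        ) / (2 * h)\n",
"    return grad\n",
"\n",
"\n",
"# Synthetic data (made up for this check): 20 examples, 3 features plus the column of ones\n",
"rng = np.random.default_rng(0)\n",
"X_demo = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 3))])\n",
"y_demo = rng.normal(size=(20, 1))\n",
"theta_demo = rng.normal(size=(4, 1))\n",
"\n",
"analytic = gradient_demo(theta_demo, X_demo, y_demo)\n",
"numeric = numerical_gradient_demo(theta_demo, X_demo, y_demo)\n",
"assert np.allclose(analytic, numeric, atol=1e-5)\n"
]
},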
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Gradient descent algorithm: matrix notation"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$$ \\theta := \\theta - \\alpha \\, \\nabla J(\\theta) $$"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"def gradient_descent(fJ, fdJ, theta, X, y, alpha, eps):\n",
"    \"\"\"Gradient descent implemented with numpy and matrices\"\"\"\n",
"    current_cost = fJ(theta, X, y)\n",
"    history = [[current_cost, theta]]\n",
"    while True:\n",
"        theta = theta - alpha * fdJ(theta, X, y)  # the update rule\n",
"        current_cost, prev_cost = fJ(theta, X, y), current_cost\n",
"        if abs(prev_cost - current_cost) <= eps:\n",
"            break\n",
"        if current_cost > prev_cost:\n",
"            print(\"The step size (alpha) is too large!\")\n",
"            break\n",
"        history.append([current_cost, theta])\n",
"    return theta, history\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle \\large\\textrm{Result:}\\quad \\theta = \\left[\\begin{array}{r}17446.2135 \\\\ 86476.7960 \\\\ -1374.8950 \\\\ 2165.0689 \\\\ \\end{array}\\right] \\quad J(\\theta) = 10324864803.1591 \\quad \\textrm{after 374575 iterations}$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"theta_start = np.zeros((n + 1, 1))\n",
"\n",
"# Set the values of alpha (step size) and eps (stopping criterion)\n",
"theta_best, history = gradient_descent(J, dJ, theta_start, X, y, alpha=0.0001, eps=0.1)\n",
"\n",
"display(\n",
"    Math(\n",
"        r\"\\large\\textrm{Result:}\\quad \\theta = \"\n",
"        + latex_matrix(theta_best)\n",
"        + (r\" \\quad J(\\theta) = %.4f\" % history[-1][0])\n",
"        + r\" \\quad \\textrm{after %d iterations}\" % len(history)\n",
"    )\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 3.2. Gradient descent in practice"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Stopping criterion"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Gradient descent repeats the same update step in a loop. The question is: when should the loop stop?\n",
"\n",
"With each successive iteration, the cost function decreases by a smaller and smaller amount.\n",
"The parameter `eps` sets the threshold for this decrease: once a single step lowers the cost by no more than `eps`, we stop.\n",
"\n",
" * The smaller the value of `eps`, the more accurate the result, but the longer the algorithm runs.\n",
" * The larger the value of `eps`, the shorter the running time, but the less accurate the result."
]
},
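{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"The trade-off described above can be sketched on a small synthetic problem. The cell below is only an illustration under stated assumptions: it uses plain NumPy arrays, made-up data, an arbitrarily chosen step size `alpha=0.1`, and a hypothetical helper `run_gradient_descent_demo` that applies the same kind of stopping rule (stop once one step lowers the cost by no more than `eps`). For a sufficiently small step size, a smaller `eps` should give a cost that is no higher, at the price of at least as many steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def run_gradient_descent_demo(X, y, alpha, eps, max_steps=100_000):\n",
"    # Stop once a single step lowers the cost by no more than eps\n",
"    m = len(y)\n",
"    theta = np.zeros((X.shape[1], 1))\n",
"    prev_cost = np.sum((X @ theta - y) ** 2) / (2 * m)\n",
"    for step in range(1, max_steps + 1):\n",
"        theta = theta - alpha * (X.T @ (X @ theta - y)) / m\n",
"        current_cost = np.sum((X @ theta - y) ** 2) / (2 * m)\n",
"        if abs(prev_cost - current_cost) <= eps:\n",
"            break\n",
"        prev_cost = current_cost\n",
"    return current_cost, step\n",
"\n",
"\n",
"# Synthetic data (made up): 50 examples, 2 features in [0, 1] plus the column of ones\n",
"rng = np.random.default_rng(0)\n",
"X_demo = np.hstack([np.ones((50, 1)), rng.uniform(0, 1, size=(50, 2))])\n",
"true_theta = np.array([[1.0], [2.0], [-3.0]])\n",
"y_demo = X_demo @ true_theta + rng.normal(scale=0.1, size=(50, 1))\n",
"\n",
"eps_values = (1e-3, 1e-6, 1e-9)\n",
"results = [\n",
"    run_gradient_descent_demo(X_demo, y_demo, alpha=0.1, eps=e) for e in eps_values\n",
"]\n",
"for e, (final_cost, n_steps) in zip(eps_values, results):\n",
"    print(f'eps={e:g}: cost={final_cost:.6f}, steps={n_steps}')\n",
"\n",
"# A smaller eps never stops earlier and never ends with a higher cost\n",
"assert results[0][1] <= results[1][1] <= results[2][1]\n",
"assert results[0][0] >= results[1][0] >= results[2][0]\n"
]
},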
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The plot compares the regression results for different values of `eps`."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"eps= 0.1, cost=10324864803.159, steps=374575\n",
"eps= 1.0, cost=10324942127.799, steps=176746\n",
"eps= 10.0, cost=10325220747.014, steps= 60389\n",
"eps= 100.0, cost=10325742602.406, steps= 46184\n",
"eps= 1000.0, cost=10330453738.393, steps= 34059\n",
"eps=10000.0, cost=10377076139.727, steps= 22123\n"
]
}
],
"source": [
"theta_start = np.zeros((n + 1, 1))\n",
"\n",
"# Values of eps from 0.1 to 10000 (the loop variable k avoids reusing the name n, the number of features)\n",
"epss = [10.0**k for k in range(-1, 5)]\n",
"costs = []\n",
"lengths = []\n",
"for eps in epss:\n",
"    theta_best, history = gradient_descent(\n",
"        J, dJ, theta_start, X, y, alpha=0.0001, eps=eps\n",
"    )\n",
"    cost = history[-1][0]\n",
"    steps = len(history)\n",
"    print(f\"{eps=:7}, {cost=:15.3f}, {steps=:6}\")\n",
"    costs.append(cost)\n",
"    lengths.append(steps)\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def eps_cost_steps_plot(eps, costs, steps):\n",
"    \"\"\"Plot the cost and the number of steps as functions of eps\"\"\"\n",
"    fig, ax1 = plt.subplots()\n",
"    ax2 = ax1.twinx()\n",
"    ax1.plot(eps, steps, \"--s\", color=\"green\")\n",
"    ax2.plot(eps, costs, \":o\", color=\"orange\")\n",
"    ax1.set_xscale(\"log\")\n",
"    ax1.set_xlabel(\"eps\")\n",
"    ax1.set_ylabel(\"number of steps\", color=\"green\")\n",
"    ax2.set_ylabel(\"cost\", color=\"orange\")\n",
"    plt.show()\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"eps_cost_steps_plot(epss, costs, lengths)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Step size ($\\alpha$)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"import ipywidgets as widgets\n",
"\n",
"# How the cost changes over successive steps, depending on alpha\n",
"\n",
"\n",
"def costchangeplot(history, return_fig=False):\n",
"    fig = plt.figure(figsize=(16 * 0.6, 9 * 0.6))\n",
"    ax = fig.add_subplot(111)\n",
"    fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
"    ax.set_xlabel(\"step\")\n",
"    ax.set_ylabel(r\"$J(\\theta)$\")\n",
"\n",
"    # Plot at most the first 500 steps; the history may be shorter than that\n",
"    steps = np.arange(0, min(500, len(history)), 1)\n",
"    cost_values = [history[step][0] for step in steps]\n",
"    ax.plot(steps, cost_values, linewidth=2, label=(r\"$J(\\theta)$\"))\n",
"    if return_fig:\n",
"        return fig\n",
"\n",
"\n",
"def slide7(alpha):\n",
"    # Use the step size chosen with the slider (it was previously fixed at 0.0001)\n",
"    theta_best, history = gradient_descent(\n",
"        J, dJ, theta_start, X, y, alpha=alpha, eps=0.1\n",
"    )\n",
"    fig = costchangeplot(history, return_fig=True)\n",
"    legend(fig)\n",
"\n",
"\n",
"sliderAlpha1 = widgets.FloatSlider(\n",
"    min=0.01, max=0.03, step=0.001, value=0.02, description=r\"$\\alpha$\", width=300\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "ff4fdbda4b1f4f2b8be74d4416294e0a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"interactive(children=(FloatSlider(value=0.02, description='$\\\\alpha$', max=0.03, min=0.01, step=0.001), Button…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<function __main__.slide7(alpha)>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"widgets.interact_manual(slide7, alpha=sliderAlpha1)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 3.3. Data normalization"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Data normalization is the process of rescaling the input data so that gradient descent works more effectively.\n",
"\n",
"I will explain this with an example."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"We will use data from the “Gratka flats challenge 2017”.\n",
"\n",
"Consider the model $h(x) = \\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\theta_3 x_3$, in which the price of a flat is predicted from the number of rooms $x_1$, the floor $x_2$ and the area in square metres $x_3$:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" price rooms floor sqrMetres\n",
"0 476118.0 3 1 78\n",
"1 459531.0 3 2 62\n",
"2 411557.0 3 0 15\n",
"3 496416.0 4 0 14\n",
"4 406032.0 3 0 15\n",
"... ... ... ... ...\n",
"1334 349000.0 4 0 29\n",
"1335 399000.0 5 0 68\n",
"1336 234000.0 2 7 50\n",
"1337 210000.0 2 1 65\n",
"1338 279000.0 2 2 36\n",
"\n",
"[1339 rows x 4 columns]\n"
]
}
],
"source": [
"# The data we loaded at the beginning of the lecture\n",
"print(data)\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def show_mins_and_maxs(X):\n",
"    \"\"\"Display the minimum and maximum value of each column of the matrix X\"\"\"\n",
"    mins = np.amin(X, axis=0).tolist()[0]  # minimum values\n",
"    maxs = np.amax(X, axis=0).tolist()[0]  # maximum values\n",
"    for i, (xmin, xmax) in enumerate(zip(mins, maxs)):\n",
"        display(Math(r\"${:.2F} \\leq x_{} \\leq {:.2F}$\".format(xmin, i, xmax)))\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The features in the training data take values in the following ranges:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle 1.00 \\leq x_0 \\leq 1.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle 2.00 \\leq x_1 \\leq 7.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle 0.00 \\leq x_2 \\leq 16.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle 12.00 \\leq x_3 \\leq 196.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n"
]
}
],
"source": [
"show_mins_and_maxs(X)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"As we can see, $x_3$ takes much larger values than $x_1$ and $x_2$.\n",
"As a result, the plot of the cost function is very “flattened” along one of the axes:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def contour_plot(X, y):\n",
"    \"\"\"Contour plot of the cost as a function of theta_1 and theta_2 (theta_0 is fixed at 1.0)\n",
"\n",
"    Assumes that X has exactly 3 columns: the column of ones and two features.\n",
"    \"\"\"\n",
"    theta1_vals = np.linspace(-1e7, 1e7, 100)\n",
"    theta2_vals = np.linspace(-1e7, 1e7, 100)\n",
"\n",
"    J_vals = np.zeros(shape=(theta1_vals.size, theta2_vals.size))\n",
"    for t1, element in enumerate(theta1_vals):\n",
"        for t2, element2 in enumerate(theta2_vals):\n",
"            thetaT = np.matrix([1.0, element, element2]).reshape(3, 1)\n",
"            J_vals[t1, t2] = J(thetaT, X, y)\n",
"\n",
"    plt.figure()\n",
"    plt.contour(theta1_vals, theta2_vals, J_vals.T, levels=20)\n",
"    plt.xlabel(r\"$\\theta_1$\")\n",
"    plt.ylabel(r\"$\\theta_2$\")\n"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAl4AAAHDCAYAAAD1MRSGAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/av/WaAAAACXBIWXMAAA9hAAAPYQGoP6dpAACqlUlEQVR4nO29eZxcVZ33/7lrdSchAQwkBCOyKIKyRDARxgUljwmgI8rDEGWGZRRGNCgGROIjYFgMCDK48IijbM7AoPgTXMAIRvM4agY0yoBsA0wUQTosMemk03X33x/3nnO/59xT1d1Jd93q7u/79cqrq84991ZVd7r6XZ/vWawsyzIwDMMwDMMwY45d9xNgGIZhGIaZLLB4MQzDMAzDdAgWL4ZhGIZhmA7B4sUwDMMwDNMhWLwYhmEYhmE6BIsXwzAMwzBMh2DxYhiGYRiG6RAsXgzDMAzDMB2CxYthGIZhGKZDsHgxDMMwDMN0CBavMeQXv/gF3vOe92DOnDmwLAt33XXXiM7/3Oc+B8uyKv+mTp06Nk+YYRiGYZgxhcVrDBkYGMAhhxyC6667brvOP++88/D8888r/w488ECceOKJo/xMGYZhGIbpBCxeY8gxxxyDyy67DO973/uMx4MgwHnnnYc999wTU6dOxYIFC7BmzRp5fNq0aZg9e7b8t2HDBjz66KP40Ic+1KFXwDAMwzDMaMLiVSNLly7F2rVrcfvtt+Ohhx7CiSeeiMWLF+PJJ5809v/mN7+J1772tXjrW9/a4WfKMAzDMMxowOJVE8888wxuuukm3HHHHXjrW9+KfffdF+eddx7e8pa34Kabbqr0bzabuPXWWzntYhiGYZhxjFv3E5isPPzww0iSBK997WuV9iAI8IpXvKLS/84778SWLVtw6qmnduopMgzDMAwzyrB41cTWrVvhOA7WrVsHx3GUY9OmTav0/+Y3v4l3v/vdmDVrVqeeIsMwDMMwowyLV03MmzcPSZLghRdeGHLM1vr16/Hzn/8cP/jBDzr07BiGYRiGGQtYvMaQrVu34qmnnpL3169fjwcffBC77rorXvva1+Lkk0/GKaecgi9+8YuYN28eXnzxRaxevRoHH3wwjjvuOHnejTfeiD322APHHHNMHS+DYRiGYZhRwsqyLKv7SUxU1qxZg3e84x2V9lNPPRU333wzoijCZZddhm9961t47rnnMHPmTLz5zW/GihUrcNBBBwEA0jTFXnvthVNOOQWXX355p18CwzAMwzCjyLgVr1/84he46qqrsG7dOjz//PO48847cfzxx7c9Z82aNVi2bBkeeeQRzJ07F5/97Gdx2mmnKX2uu+46XHXVVejr68MhhxyCr3zlK5g/f/7YvRCGYRiGYSYN43Y5iZGuCr9+/Xocd9xxeMc73oEHH3wQ55xzDj784Q/jJz/5iezz7W9/G8uWLcPFF1+M3/3udzjkkEOwaNEivPDCC2P1MhiGYRiGmUSM28SLYlnWkInXpz/9adx99934wx/+INuWLFmCTZs2YdWqVQCABQsW4E1vehO++tWvAsjLfHPnzsXZZ5+NCy64YExfA8MwDMMwE59JM7h+7dq1WLhwodK2aNEinHPOOQCAMAyxbt06LF++XB63bRsLFy7E2rVrW143CAIEQSDvp2mKjRs34hWveAUsyxrdF8EwDMMwzJiQZRm2bNmCOXPmwLbHriA4acSrr6+vsgbWrFmz0N/fj8HBQfz1r39FkiTGPo8//njL665cuRIrVqwYk+fMMAzDMExn+fOf/4xXvvKVY3b9SSNeY8Xy5cuxbNkyeX/z5s141atehaX3LkJjqgcLFhzLg2O5cGwXruXl/2wXtuXBRd4mjjmWC6c47lgePMuDLc8rj9nI76vnenDt/BryWjbpZ7lwLJeTOGa7SLMYaRYhyUIkWUhuR8iyEEkaIslipFmAJIuQZBHSLJD90ixCgvLctDg3Py76RUjSML8ePU9+DZCh+0ZHWL
DhWD4cy4dteeVX24eF/Ksj20UfD7bVKL56cFCeb9Nr2T5s+HBs/XzRx4VjNYrznKGfLDPhyIrfu6z4lyL/HcqyAGnWRFb8LubHQ6RZE8gipCDn0H4Q/QJkiJBlTaRZABTnZllUnJP3yxDW/S1QsNCAZfmwLfq1ARS/P5bVAwvkOPLjA1szHPaGq7DTTjuN6fObNOI1e/ZsbNiwQWnbsGEDpk+fjt7eXjiOA8dxjH1mz57d8rqNRgONRqPaPtVDY5pX3MsAREgQIcEggkpv0k38TUmG86pGjhBBV0qZJ6XMtX0ic54me6ZjQvRyQRTiJ69Z6efCs3wpg/kfnHE7v4OpiaoAhoqYCXlLIWSQ9gkr51GRVNrSECkiJGmgXL8UwFR7ZlHxD0iLf3G7F0J/30cJCw4cKXM+HJvclqKmCmLZ1ijksFGcq/Z3ij520Uc5txBL22rABn+4m2xkWYoMIZExIXKhlDbZ1rJfYOjXbNGPSGUhj+ovk/hdHGj7vMVvsPhzG1h5y1j//5004nXEEUfgnnvuUdruu+8+HHHEEQAA3/dx2GGHYfXq1XKQfpqmWL16NZYuXTrix7vgwBsxdVov4ixCnEVIshhxGiHOQsRZjCSNDMfyT/xxFiFOY8RZWD3W9jz1unFapA7kD0SGrHgOISp/N2pApHAukTaHyForiau25+IopNK1fIPstTqfJXA8YVsubMuFi95an4dZAIn4VSQvImkgTQKrAphkgXIN+jj6uZny+50gzrYhHuMPcO2x2ghbIXi2bxa4VrftFtfRH8f2Wf5qwLJsWOgBrB7UkblmWVYkc7qgNUkKGFTaZWJXHG+kmwCsHPPnO27Fq92q8K961auwfPlyPPfcc/jWt74FAPjIRz6Cr371qzj//PPxj//4j/jZz36G73znO7j77rvlNZYtW4ZTTz0Vhx9+OObPn49rr70WAwMDOP3000f8/Hy7gV63uudiHaRZIqUsIjInJa+lCFYl0XSevGYWIU7zc+KUtGWRcg1KksVIshghmjX9kSgxSaBr+UUamN8WSSFN+BwifsNJAstzzBLpWC5L4DigmwRQCJ+S7KVU9kJN+Ewpnn6uLpShcm6eCubnpqC/11nRJ0DU8lmPPTK5MwpdQ5Zxq5JIE0ByjOWva7EsCxZ8wPLhYPvLhFOsfrB4teG3v/2tsiq8GGclVoV//vnn8cwzz8jje++9N+6++2588pOfxJe+9CW88pWvxDe/+U0sWrRI9jnppJPw4osv4qKLLkJfXx8OPfRQrFq1atxvTG1bDnzLAexGzX8i8k8mQrbiLERERa6S+tGkT0hcaEwDdbmr3G/Rj9KdEujDtYuvhZR5ItkzpH3DLfm6SvnYl2MKy8crpZP/eHQ3QgC9obuOGVmWGsu/iZLwkRJu5VhE+qupn7mcrIqm6EMZX/JXTf5oIqjLn54mmuXPhw2Pf3+7kAmxjlc30d/fjxkzZmDz5s2YPn163U+HaYMugbmcxUiyCJFM7EKt9BtpshhXRFHI3XCSQJEsJlnb0UC14rZJ7Twidu1KuiIxLNNDr5IQlreryaFr8R8Qpj2l/JWiF0vhi6rJXUqSO03+zAmgOfUry8dV+esGqvJHRM3WhbCh9NPLvMZ+tq89BpVID9Y4Su879fd73CZeDLOjWJaV/3GHh7qzQCGBIuEbThLYauyfKAdHbY61O1+XQNEe1DwmsBz/R8q6BgEcltRthziyAHY3lmXnf/TRAFDPMI/WyR8VulLg4jRAdeIHlTqR/Oml5Gq5WJZ+oSf59SZ/NkwlXYMM2r5R3nZEALt1pi+L1xjxrfX/H6bPmA7PduFZHjzbhWu5xX0Xru3KY67twi/KO9XjxW3Lg2PZ/KY/QaESWDdplhZJoFn0zGXcsNKHtidpXOmTZDGiVLSFEBNEYtGmlYKlAAK1loIdMnmjKmnVdpEWepZeItbLvdWEUJSZHcuDR1JDLgF3J6r81UOWZcVs3FBKlz45ozrrl6Z/RA
JTPdkzlI9TVRYTbYZhighpGiEaYobhWFHO9PWJ3BnEzvLRHOhMAZDFa4z46Qu/hDfgj+o1LVi5pBXlGq8QObcQNEX
2022-10-14 11:34:46 +02:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"contour_plot(\n",
"    X[:, [0, 2, 3]], y\n",
")  # We pick features [0, 2, 3], since a flat plot cannot show more\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"If the cost function has a shape like the one in the plot above, it is easy to see that finding the minimum with gradient descent must be quite a challenge: the algorithm quickly finds the “trough”, but the “descent” along the trough in search of the minimum is very slow."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of steps: 374575\n",
"Cost: 10324864803.159063\n"
]
}
],
"source": [
"theta_start = np.zeros((n + 1, 1))\n",
"theta_best, history = gradient_descent(J, dJ, theta_start, X, y, alpha=0.0001, eps=0.1)\n",
"print(f\"Number of steps: {len(history)}\")\n",
"print(f\"Cost: {history[-1][0]}\")\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"costchangeplot(history)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"\n",
"How can we remedy this?\n",
"\n",
"We will try to transform the data so that the cost function has a “nice”, regular shape."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Scaling"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We want every feature to take values in a similar range.\n",
"\n",
"To this end, we rescale the values of each feature by dividing them by that feature's maximum value:\n",
"\n",
"$$ \\hat{x_i}^{(j)} := \\frac{x_i^{(j)}}{\\max_j x_i^{(j)}} $$"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle 1.00 \\leq x_0 \\leq 1.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle 0.29 \\leq x_1 \\leq 1.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle 0.00 \\leq x_2 \\leq 1.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle 0.06 \\leq x_3 \\leq 1.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"X_scaled = X / np.amax(X, axis=0)\n",
"\n",
"show_mins_and_maxs(X_scaled)\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"contour_plot(X_scaled[:, [0, 2, 3]], y)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Now we can use a larger step size $\\alpha$, so the algorithm finds the solution faster."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of steps: 82456\n",
"Cost: 10324856880.491594\n"
]
}
],
"source": [
"theta_start = np.zeros((n + 1, 1))\n",
"theta_best, history = gradient_descent(\n",
"    J, dJ, theta_start, X_scaled, y, alpha=0.01, eps=0.1\n",
")\n",
"print(f\"Number of steps: {len(history)}\")\n",
"print(f\"Cost: {history[-1][0]}\")\n"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"costchangeplot(history)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Mean normalization"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"In addition, we want the mean value of each feature to be close to $0$.\n",
"\n",
"To this end, besides rescaling, we subtract the feature's mean $\\mu_i$ from each of its values:\n",
"\n",
"$$ \\hat{x_i}^{(j)} := \\frac{x_i^{(j)} - \\mu_i}{\\max_j x_i^{(j)}} $$"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/latex": [
"$\\displaystyle 0.00 \\leq x_0 \\leq 0.00$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle -0.10 \\leq x_1 \\leq 0.62$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle -0.17 \\leq x_2 \\leq 0.83$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/latex": [
"$\\displaystyle -0.24 \\leq x_3 \\leq 0.70$"
],
"text/plain": [
"<IPython.core.display.Math object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"X_normalized = (X - np.mean(X, axis=0)) / np.amax(X, axis=0)\n",
"# Note: this also maps the constant column x_0 to 0, so the model loses its intercept term\n",
"\n",
"show_mins_and_maxs(X_normalized)\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"contour_plot(X_normalized[:, [0, 2, 3]], y)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Now the plot of the cost function has a very regular shape, so gradient descent applied in this case finds the minimum of the cost function very quickly."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of steps: 9511\n",
"Cost: 80221516127.09409\n"
]
}
],
"source": [
"theta_start = np.zeros((n + 1, 1))\n",
"theta_best, history = gradient_descent(\n",
"    J, dJ, theta_start, X_normalized, y, alpha=0.1, eps=0.1\n",
")\n",
"print(f\"Number of steps: {len(history)}\")\n",
"print(f\"Cost: {history[-1][0]}\")\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"costchangeplot(history)\n"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
},
"livereveal": {
"start_slideshow_at": "selected",
"theme": "white"
},
"vscode": {
"interpreter": {
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}