{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Machine learning: applications\n",
    "# 4. Evaluation methods"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 4.1. Testing methodology"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "In machine learning, evaluating the model being built is essential. It is therefore good practice to split the available data into separate sets: one for training and one for testing. In some cases an additional, so-called validation set also has to be set aside."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Training set vs. test set"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "* Algorithms are trained on the training set, and their correctness is checked on the test set.\n",
    "* The training set should be several times larger than the test set (e.g. 4:1 or 9:1).\n",
    "* The test set is often unknown.\n",
    "* Test and training data must not be mixed: never “contaminate” the training data with test data!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Sometimes model parameters such as $\\alpha$ need to be tuned. Which set should be used for that?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Validation set"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Parameters are best tuned on yet another set, the so-called **validation set**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    " * The validation set should be similar in size to the test set, so the data can be split into these three sets in proportions such as 3:1:1 or 8:1:1."
   ]
  },
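  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The split described above can be sketched as follows. This is only a minimal illustration and not part of the original material: the function `demo_split_train_val_test`, its `ratios` and `seed` parameters and the `demo_*` variables are hypothetical names, and the 8:1:1 ratio is just one of the proportions mentioned above. The indices are shuffled once and then cut into three disjoint parts, so no test example can leak into the training set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def demo_split_train_val_test(num_examples, ratios=(8, 1, 1), seed=0):\n",
    "    # Hypothetical helper: returns three disjoint arrays of example indices\n",
    "    rng = np.random.default_rng(seed)\n",
    "    indices = rng.permutation(num_examples)  # shuffle before splitting\n",
    "    total = sum(ratios)\n",
    "    n_val = num_examples * ratios[1] // total\n",
    "    n_test = num_examples * ratios[2] // total\n",
    "    n_train = num_examples - n_val - n_test  # the remainder goes to training\n",
    "    return (indices[:n_train],\n",
    "            indices[n_train:n_train + n_val],\n",
    "            indices[n_train + n_val:])\n",
    "\n",
    "demo_train_idx, demo_val_idx, demo_test_idx = demo_split_train_val_test(100)\n",
    "print(len(demo_train_idx), len(demo_val_idx), len(demo_test_idx))"
   ]
  },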
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Cross-validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Which part of the data should be set aside as the validation set so that the result is “best”?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    " * Let every part of the data play this role in turn!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "<img width=\"100%\" src=\"https://chrisjmccormick.files.wordpress.com/2013/07/10_fold_cv.png\"/>\n",
    "Source: https://chrisjmccormick.wordpress.com/2013/07/31/k-fold-cross-validation-with-matlab-code/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Cross-validation\n",
    "\n",
    "* Split the data $D = \\left\\{ (x^{(1)}, y^{(1)}), \\ldots, (x^{(m)}, y^{(m)})\\right\\} $ into $N$ disjoint sets $T_1,\\ldots,T_N$.\n",
    "* For $i=1,\\ldots,N$:\n",
    "    * Use $T_i$ for validation and $S_i$ for training, where $S_i = D \\smallsetminus T_i$.\n",
    "    * Save the model $\\theta_i$.\n",
    "* Accumulate the results of the models $\\theta_i$ on the sets $T_i$.\n",
    "* Choose the training parameters based on the accumulated results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Cross-validation: tips\n",
    "\n",
    "* $N$ is usually chosen between $4$ and $10$; this is called $N$-fold cross-validation.\n",
    "* It is worth shuffling the set $D$ before splitting.\n",
    "* How should the results for all the sets $T_i$ be accumulated?\n",
    "* Once the parameters have been chosen using all the sets $T_i$, train the model on the entire training data with those parameters.\n",
    "* Test on the test set (if one is available)."
   ]
  },
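  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The procedure above might look roughly like the sketch below. It is an illustration under assumptions, not the course's reference implementation: `demo_k_fold_indices`, `demo_cross_validate` and the `fit` / `score` callbacks are hypothetical names, and averaging the per-fold scores is only one common answer to the question of how to accumulate the results. The usage example fits ordinary least squares on synthetic data and scores each fold with the mean squared error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def demo_k_fold_indices(num_examples, num_folds, seed=0):\n",
    "    # Shuffle once, then cut the indices into num_folds disjoint parts T_1..T_N\n",
    "    rng = np.random.default_rng(seed)\n",
    "    return np.array_split(rng.permutation(num_examples), num_folds)\n",
    "\n",
    "def demo_cross_validate(fit, score, X, y, num_folds=5):\n",
    "    # Train on S_i = D minus T_i, validate on T_i, and average the scores\n",
    "    folds = demo_k_fold_indices(len(y), num_folds)\n",
    "    scores = []\n",
    "    for i, val_idx in enumerate(folds):\n",
    "        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])\n",
    "        model = fit(X[train_idx], y[train_idx])\n",
    "        scores.append(score(model, X[val_idx], y[val_idx]))\n",
    "    return float(np.mean(scores))\n",
    "\n",
    "# Usage on synthetic data (assumed purely for illustration)\n",
    "demo_rng = np.random.default_rng(1)\n",
    "demo_X = demo_rng.normal(size=(50, 2))\n",
    "demo_y = demo_X @ np.array([2.0, -1.0]) + 0.1 * demo_rng.normal(size=50)\n",
    "demo_fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]\n",
    "demo_score = lambda w, X, y: float(np.mean((X @ w - y) ** 2))\n",
    "print(demo_cross_validate(demo_fit, demo_score, demo_X, demo_y, num_folds=5))"
   ]
  },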
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### _Leave-one-out_\n",
    "\n",
    "This is the special case of cross-validation in which $N = m$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "* What is the size of a single set $T_i$?\n",
    "* What are the advantages and disadvantages of this method?\n",
    "* When can it be useful?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Validation set and optimization algorithms\n",
    "\n",
    "* If the error increases on the training set, the parameter $\\alpha$ is poorly chosen and should be decreased.\n",
    "* If the error decreases on the training set but increases on the validation set, we are dealing with **overfitting**.\n",
    "* Optimization should then be stopped. Automating this process is known as _early stopping_."
   ]
  },
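  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "One possible way to automate the stopping rule above is sketched below. The function `demo_early_stopping` and its `patience` parameter are hypothetical and introduced only for illustration; the sketch assumes that the validation error has already been recorded after every optimization step. It returns the step with the lowest validation error seen before the error failed to improve for `patience` consecutive steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "def demo_early_stopping(val_errors, patience=3):\n",
    "    # Hypothetical helper: scan the recorded validation errors step by step\n",
    "    best_step, best_error = 0, float('inf')\n",
    "    for step, error in enumerate(val_errors):\n",
    "        if error < best_error:\n",
    "            best_step, best_error = step, error  # new best model so far\n",
    "        elif step - best_step >= patience:\n",
    "            break  # no improvement for `patience` steps: stop optimizing\n",
    "    return best_step\n",
    "\n",
    "# The error falls until step 2 and then rises, so step 2 is selected\n",
    "print(demo_early_stopping([0.9, 0.7, 0.6, 0.62, 0.65, 0.7, 0.8]))"
   ]
  },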
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 4.2. Quality measures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "To evaluate a model, we must choose the **measure** (**metric**) to use.\n",
    "\n",
    "Which measure is best?\n",
    " * That depends on the type of task.\n",
    " * Regression uses different metrics than classification."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Metrics for regression tasks\n",
    "\n",
    "For regression tasks we can use, for example:\n",
    " * mean squared error (MSE):\n",
    "   $$ \\mathrm{MSE} \\, = \\, \\frac{1}{m} \\sum_{i=1}^{m} \\left( \\hat{y}^{(i)} - y^{(i)} \\right)^2 $$\n",
    " * root-mean-square error (RMSE):\n",
    "   $$ \\mathrm{RMSE} \\, = \\, \\sqrt{ \\frac{1}{m} \\sum_{i=1}^{m} \\left( \\hat{y}^{(i)} - y^{(i)} \\right)^2 } $$\n",
    " * mean absolute error (MAE):\n",
    "   $$ \\mathrm{MAE} \\, = \\, \\frac{1}{m} \\sum_{i=1}^{m} \\left| \\hat{y}^{(i)} - y^{(i)} \\right| $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "In the formulas above, $y^{(i)}$ denotes the **expected** value of the variable $y$ in the $i$-th example, and $\\hat{y}^{(i)}$ denotes the value of $y$ in the $i$-th example computed (**predicted**) by our model."
   ]
  },
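  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The three regression metrics above can be computed directly with NumPy, as in the sketch below. The helper `demo_regression_metrics` and the sample values are hypothetical and serve only to illustrate the formulas: `y_true` corresponds to the expected values $y^{(i)}$ and `y_pred` to the predictions $\\hat{y}^{(i)}$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def demo_regression_metrics(y_true, y_pred):\n",
    "    # Per-example errors: predicted value minus expected value\n",
    "    errors = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)\n",
    "    mse = float(np.mean(errors ** 2))\n",
    "    return {'MSE': mse,\n",
    "            'RMSE': float(np.sqrt(mse)),\n",
    "            'MAE': float(np.mean(np.abs(errors)))}\n",
    "\n",
    "print(demo_regression_metrics([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))"
   ]
  },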
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Metrics for classification tasks\n",
    "\n",
    "To present some of the most popular metrics used for classification tasks, let us use the following example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "# Useful imports\n",
    "\n",
    "import ipywidgets as widgets\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas\n",
    "import random\n",
    "import seaborn\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "def powerme(x1,x2,n):\n",
    "    \"\"\"Generate the powers of x1 and x2 up to degree n, together with their products\"\"\"\n",
    "    X = []\n",
    "    for m in range(n+1):\n",
    "        for i in range(m+1):\n",
    "            X.append(np.multiply(np.power(x1,i),np.power(x2,(m-i))))\n",
    "    return np.hstack(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "def plot_data_for_classification(X, Y, xlabel=None, ylabel=None, Y_predicted=None, highlight=None):\n",
    "    \"\"\"Plot the data for a classification task\"\"\"\n",
    "    fig = plt.figure(figsize=(16*.6, 9*.6))\n",
    "    ax = fig.add_subplot(111)\n",
    "    fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
    "    X = X.tolist()\n",
    "    Y = Y.tolist()\n",
    "    X1n = [x[1] for x, y in zip(X, Y) if y[0] == 0]\n",
    "    X1p = [x[1] for x, y in zip(X, Y) if y[0] == 1]\n",
    "    X2n = [x[2] for x, y in zip(X, Y) if y[0] == 0]\n",
    "    X2p = [x[2] for x, y in zip(X, Y) if y[0] == 1]\n",
    "\n",
    "    if Y_predicted is not None and len(Y_predicted) > 0:\n",
    "        # Split the points into true/false negatives/positives\n",
    "        Y_predicted = Y_predicted.tolist()\n",
    "        X1tn = [x[1] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 0 and yp[0] == 0]\n",
    "        X1fn = [x[1] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 1 and yp[0] == 0]\n",
    "        X1tp = [x[1] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 1 and yp[0] == 1]\n",
    "        X1fp = [x[1] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 0 and yp[0] == 1]\n",
    "        X2tn = [x[2] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 0 and yp[0] == 0]\n",
    "        X2fn = [x[2] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 1 and yp[0] == 0]\n",
    "        X2tp = [x[2] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 1 and yp[0] == 1]\n",
    "        X2fp = [x[2] for x, y, yp in zip(X, Y, Y_predicted) if y[0] == 0 and yp[0] == 1]\n",
    "\n",
    "        if highlight == 'tn':\n",
    "            ax.scatter(X1tn, X2tn, c='r', marker='x', s=100, label='Data')\n",
    "            ax.scatter(X1fn, X2fn, c='k', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1tp, X2tp, c='k', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1fp, X2fp, c='k', marker='x', s=50, label='Data')\n",
    "        elif highlight == 'fn':\n",
    "            ax.scatter(X1tn, X2tn, c='k', marker='x', s=50, label='Data')\n",
    "            ax.scatter(X1fn, X2fn, c='g', marker='o', s=100, label='Data')\n",
    "            ax.scatter(X1tp, X2tp, c='k', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1fp, X2fp, c='k', marker='x', s=50, label='Data')\n",
    "        elif highlight == 'tp':\n",
    "            ax.scatter(X1tn, X2tn, c='k', marker='x', s=50, label='Data')\n",
    "            ax.scatter(X1fn, X2fn, c='k', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1tp, X2tp, c='g', marker='o', s=100, label='Data')\n",
    "            ax.scatter(X1fp, X2fp, c='k', marker='x', s=50, label='Data')\n",
    "        elif highlight == 'fp':\n",
    "            ax.scatter(X1tn, X2tn, c='k', marker='x', s=50, label='Data')\n",
    "            ax.scatter(X1fn, X2fn, c='k', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1tp, X2tp, c='k', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1fp, X2fp, c='r', marker='x', s=100, label='Data')\n",
    "        else:\n",
    "            ax.scatter(X1tn, X2tn, c='r', marker='x', s=50, label='Data')\n",
    "            ax.scatter(X1fn, X2fn, c='g', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1tp, X2tp, c='g', marker='o', s=50, label='Data')\n",
    "            ax.scatter(X1fp, X2fp, c='r', marker='x', s=50, label='Data')\n",
    "\n",
    "    else:\n",
    "        ax.scatter(X1n, X2n, c='r', marker='x', s=50, label='Data')\n",
    "        ax.scatter(X1p, X2p, c='g', marker='o', s=50, label='Data')\n",
    "\n",
    "    if xlabel:\n",
    "        ax.set_xlabel(xlabel)\n",
    "    if ylabel:\n",
    "        ax.set_ylabel(ylabel)\n",
    "\n",
    "    ax.margins(.05, .05)\n",
    "    return fig"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "# Load the data\n",
    "import pandas\n",
    "import numpy as np\n",
    "\n",
    "alldata = pandas.read_csv('data-metrics.tsv', sep='\\t')\n",
    "data = np.matrix(alldata)\n",
    "\n",
    "m, n_plus_1 = data.shape\n",
    "n = n_plus_1 - 1\n",
    "\n",
    "X2 = powerme(data[:, 1], data[:, 2], n)\n",
    "Y2 = np.matrix(data[:, 0]).reshape(m, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAm8AAAFmCAYAAAA70X3dAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3df3Dc913n8ddbieS5rndo7bjFdWKSYk2H2NCS6kKhmiqFJpeIay2LFiVnIDeXOZOjmXHtwMR33EGHH3PAQHzuEcqlpkPL+JrNTSXFUBU3DZRieoUomSS160ulhpC4yiWqXdq1OCQn+74/vt+v/dVqV/pK2t3v97v7fMzsaPfz/X7Xn/16V/vS9/PL3F0AAADIh660KwAAAIDkCG8AAAA5QngDAADIEcIbAABAjhDeAAAAcoTwBgAAkCNXpl2BNFx11VV+7bXXpl0NAACARZ544olvufuW5fbpyPB27bXXanJyMu1qAAAALGJm/7DSPjSbAgAA5AjhDQAAIEdSD29m9gkze8XMTtXZbmb2UTObNrNnzOyG2LZbzezZcNuh1tUaAAAgHamHN0l/LOnWZbbfJqk3vO2T9DFJMrMrJD0Qbr9e0h1mdn1TawoAAJCy1MObu39J0vlldtkt6VMe+Iqk15vZVkk3Spp29+fcfUHSQ+G+AAAAbSv18JbANkkvxh6fDcvqlQMAALStPIQ3q1Hmy5TXfhKzfWY2aWaTs7OzDascAABAK+UhvJ2VdE3s8dWSZpYpr8ndH3T3Pnfv27Jl2bnvAAAAMisP4e24pJ8LR52+U9J33P0lSY9L6jWz68ysR9Lt4b4AqrlLY2PBzyTlAIDMSj28mdmnJf1vSW81s7NmdpeZ3W1md4e7TEh6TtK0pI9L+gVJcvdXJd0j6YSkM5IedvfTLX8BQB6Mj0vDw9KBA5eDmnvweHg42A4AyIXUl8dy9ztW2O6SPlRn24SCcAdgOUND0v790pEjwePDh4PgduRIUD40lG79AACJpR7eALSAWRDYpCCwRSFu//6g3GqN/wEAZJF5B/Z16evrcxamR0dyl7pivSUqFYIbAGSImT3h7n3L7ZN6nzcALRL1cYuL94EDAOQC4Q3oBFFwi/q4VSqX+8AR4AAgV+jzBnSC8fHLwS3q4xbvAzcwIO3Zk24dAQCJEN6ATjA0JI2OBj+jPm5RgBsYYLQpAOQI4Q3oBGa1r6zVKwcAZBZ93gAAAHKE8AYAAJAjhDcAAIAcIbwBAADkCOENAAAgRwhvAAAAOUJ4AwAAyBHCGwAAQI4wSS8AADWU58sqnS5p6tyUejf3amTniIobimlXCyC8AQBQ7eQLJzV4bFAVr2ju4pwK3QUdPHFQE3sn1L+9P+3qocPRbAoAQEx5vqzBY4MqL5Q1d3FOkjR3cU7lhaD8wsKFlGuITkd4AwAgpnS6pIpXam6reEWlU6UW1whYjPAGAEDM1LmpS1fcqs1dnNP0+ekW1whYjPAGAEBM76YdKtiGmtsKtkE7Nn1/i2sELEZ4AwAgZuTvC+r65/ma27r+eV4jz72uxTUCFiO8AQAQU/ypOzQx/wEV56WCB5MyFPxKFeelifkPaONP3ZFyDdHpmCoEAIA4M/X/7sOaOfAhlf78Y5reJO04/6pG3v0ftPHwA5JZ2jVEhzN3T7sOLdfX1+eTk5NpVwMAkGXuUlesgapSIbih6czsCXfvW24fmk0BAKjmLh04sLjswIGgHEgZ4Q0AgLgouB05Iu3fH1xx278/eEyAQwbQ5w0AgLjx8cvB7fDhoKn08OFg25Ej0sCAtGdPunVER8tEeDOzWyUdkXSFpKPu/ltV239J0t7w4ZWSfkDSFnc/b2bPSypLek3Sqyu1EwMAsKyhIWl0NPgZ9XGLAtzAQFAOpCj1AQtmdoWkr0u6WdJZSY9LusPdv1Zn//dJOuDuPx4+fl5Sn7t/K+m/yYAFAACQRXkZsHCjpG
l3f87dFyQ9JGn3MvvfIenTLakZAABAxmQhvG2T9GLs8dmwbAkze52kWyV9Jlbskj5vZk+Y2b56/4iZ7TOzSTObnJ2dbUC1AXQ8d2lsbGkH9nrlANAAWQhvtSbNqfcb732S/sbdz8fK3uXuN0i6TdKHzOzdtQ509wfdvc/d+7Zs2bK+GgOAFHRsHx5ePAIxGqk4PBxsB4AGy0J4OyvpmtjjqyXN1Nn3dlU1mbr7TPjzFUljCpphAaD5hoaWTiERn2KCju0AmiALo00fl9RrZtdJ+qaCgPZvqncys++RNCDpZ2JlBUld7l4O798i6ddaUmsAqJ5C4siR4H58igkAaLDUr7y5+6uS7pF0QtIZSQ+7+2kzu9vM7o7tukfS5919Llb2JkknzexpSX8n6bPu/uetqjsALApwEYIbgCbKwpU3ufuEpImqsj+sevzHkv64quw5SW9rcvUAoL56yygR4AA0SepX3gAgt1hGCUAKMnHlDQByiWWUAKSA8AYAa8UySgBSQHgDgLUyq31lrV45ADQAfd4AAAByhPAGAACQI4Q3AACAHCG8AQAA5AjhDQAAIEcIbwAAADlCeAMAAMgRwhsAAECOEN4AAAByhPAGAACQI4Q3AACAHCG8AQAA5AjhDQAAIEcIbwAAADlCeAMAAMgRwhsAAECOEN4AAAByhPAGAACQI4Q3AACAHCG8AQAA5AjhDQAAIEcIbwAAADlCeAMAAMiRTIQ3M7vVzJ41s2kzO1Rj+01m9h0zeyq8/UrSYwEAANrJlWlXwMyukPSApJslnZX0uJkdd/evVe361+7+r9d4LAAAQFvIwpW3GyVNu/tz7r4g6SFJu1twLAAAQO5kIbxtk/Ri7PHZsKzaj5rZ02b2OTPbucpjZWb7zGzSzCZnZ2cbUW8AAICWS73ZVJLVKPOqx09K+j53v2Bmg5LGJfUmPDYodH9Q0oOS1NfXV3MfIOvK82WVTpc0dW5KvZt7NbJzRMUNxbSrBQBooSyEt7OSrok9vlrSTHwHd/9u7P6Emf2BmV2V5FigXZx84aQGjw2q4hXNXZxTobuggycOamLvhPq396ddPQBAi2Sh2fRxSb1mdp2Z9Ui6XdLx+A5m9r1mZuH9GxXU+1ySY4F2UJ4va/DYoMoLZc1dnJMkzV2cU3khKL+wcCHlGgIAWiX18Obur0q6R9IJSWckPezup83sbjO7O9ztA5JOmdnTkj4q6XYP1Dy29a8CaK7S6ZIqXqm5reIVlU6VWlwjAEBastBsKnefkDRRVfaHsfu/L+n3kx4LtJupc1OXrrhVm7s4p+nz0y2uEQAgLalfeQOwst7NvSp0F2puK3QXtGPTjhbXCACQFsIbkAMjO0fUZbU/rl3WpZFdIy2uEQAgLYQ3IAeKG4qa2DuhYk/x0hW4QndBxZ6gfGPPxpRrGHKXxsaCn0nKAQCrlok+bwBW1r+9XzP3zqh0qqTp89PasWmHRnaNZCe4SdL4uDQ8LO3fLx0+LJkFge3AAenIEWl0VNqzJ+1aAkCuEd6AHNnYs1F33XBX2tWob2goCG5HjgSPDx++HNz27w+2AwDWhfAGoHHMgsAmBYEtCnHxK3EAgHWhz1u7ou8R0hIPcBGCGwA0DOGtXUV9jw4cuBzUor5Hw8PBdqAZovdZXPx9CABYF8Jbu4r3PYq+OOl7hGarfp9VKkvfhwCAdaHPW7ui7xHSMD5+ObhF77P4+3BggNGmALBO5h34l3BfX59PTk6mXY3WcJe6YhdYKxWCG5rHPQhwQ0OL32f1ygEAi5jZE+7et9w+NJu2M/oeodXMgitr1QGtXjkAYNUIb+2KvkcAALQl+ry1K/oeAQDQlghv7WpoKFiKKN7HKApwAwOMNgUAIKdoNm1X9D3CajGxMwDkAuENQICJnTtWeb6so08e1X2P3qejTx5Veb6cdpUALINmUwAqz5dV2v4tTd33DvU+ekQjBxZUPPwAEzt3gJMvnNTgsUFVvKK5i3MqdBd08MRBTe
ydUP/2/rSr11xMbYOcYp43oMMt+fL2K9W18Komjkn9L4iJndtYeb6sbfdvU3lh6ZW2Yk9RM/fOaGPPxhRq1iJjY8FV5fh7PD5Sf3SUgV1oOeZ5A7Cs8nxZg8cGVV4oa+7inCRpzl5VeYM0uFe60COCWxsrnS6p4pWa2ypeUelUqcU1ajGWEUROEd6ADrbsl7ek0k4xL2Abmzo3dSm0V5u7OKfp89MtrlGLRSPwowDX1bV0iiUggwhvQAdb9st7gzR9Sx8TO7ex3s29KnQXam4rdBe0Y9OOFtcoBfE5MCMEN2Qc4Q3oYCt+ef/0z1++KsFo07YzsnNEXVb7a6DLujSya6TFNUoBywgihwhvQAdb+cv79uAqRDThM9pKcUNRE3snVOwpXgrxhe6Cij1BeVsPVpBYRhC5xWhToMPVmiqiy7o6Y6oISJIuLFxQ6VRJ0+entWPTDo3sGmn/4CYx2hSZlGS0KeENQOd+eaOzMc8bMojwVgfhDZfwyxsAkCG5mefNzG41s2fNbNrMDtXYvtfMnglvXzazt8W2PW9mXzWzp8yMRIbVYUkoAEDOpL48lpldIekBSTdLOivpcTM77u5fi+3295IG3P3bZnabpAcl/Uhs+3vc/VstqzTaR3ySTino98IknQCADEs9vEm6UdK0uz8nSWb2kKTdki6FN3f/cmz/r0i6uqU1RPuKz/F05MjlEMcknQCAjMpCs+k2SS/GHp8Ny+q5S9LnYo9d0ufN7Akz29eE+qHdMUknACBHshDean1D1hxFYWbvURDe7osVv8vdb5B0m6QPmdm76xy7z8wmzWxydnZ2vXVGO2GSTgBAjmQhvJ2VdE3s8dWSZqp3MrMfknRU0m53PxeVu/tM+PMVSWMKmmGXcPcH3b3P3fu2bNnSwOoj15ikEwCQM1kIb49L6jWz68ysR9Ltko7HdzCz7ZJGJf2su389Vl4ws2J0X9Itkk61rOZ55R5MTlkdTOqVt7Px8aULUccXqma0KQAgY1IPb+7+qqR7JJ2QdEbSw+5+2szuNrO7w91+RdJmSX9QNSXImySdNLOnJf2dpM+6+5+3+CXkD9NjXDY0FMyiHu/jFgU4loQCgM6W0YsdTNLbiaqbCqunx6CzPgAAqSyhlmSS3ixMFYJWY3oMAABWltG5QLny1sncpa5Yy3mlQnADEijPl1U6XdLUuSn1bu7VyM4RFTcU064WgGaIX2mLNPFiB2ub1kF4U8vfjEC7OPnCSQ0eG1TFK5q7OKdCd0Fd1qWJvRPq396fdvUANEMLL3bkZm1TtBjTYwBrUp4va/DYoMoLZc1dnJMkzV2cU3khKL+wcCHlGgJouAzOBUp460RMjwGsSel0SRWv1NxW8YpKp0otrhGApsroxQ4GLHSiaHqMoaGl02MMDDA9BlDH1LmpS1fcqs1dnNP0+ekW1whAU9W72CEF5QMDDR9tmgThrROZ1X6z1SsHIEnq3dyrQnehZoArdBe0Y9OOFGoFoGkyerGDZlMASGhk54i6rPavzS7r0siukRbXCEBTRRc1qgcn1CtvEcIbACRU3FDUxN4JFXuKKnQXJAVX3Io9QfnGno0p1xBAJ6DZFABWoX97v2bunVHpVEnT56e1Y9MOjewaIbgBaBnCGwCs0saejbrrhrvSrgaADkWzKQAAQI4Q3gAAAHKE8AYAAJAjhDeky10aG1s6S3W9cgAAOhzhDekaH5eGhxcvMxItRzI8zFJdAABUIbwhXUNDS9eJi68jx1JdAIBGaZPWHsIb0hUtMxIFuK6upevIAQDQCG3S2kN4Q/riC/1GCG4AgEZrk9YewhvSF3144uJ/FQEA0Aht0tpDeEO6qv/qqVSW/lUEAECjtEFrD+EN6RofX/pXT/yvopz0PwAA5EQbtPYQ3pCuoSFpdHTxXz1RgBsdzU3/AwBADrRJaw8L0yNdZtKePcnLAQBYq3qtPVJQPjCQi+8ewhsAAOgMUWvP0NDS1p6Bgdy09hDeAABAZ2iT1h
76vAEAAORIJsKbmd1qZs+a2bSZHaqx3czso+H2Z8zshqTHAgAAtJPUw5uZXSHpAUm3Sbpe0h1mdn3VbrdJ6g1v+yR9bBXHAgAAtI0s9Hm7UdK0uz8nSWb2kKTdkr4W22e3pE+5u0v6ipm93sy2Sro2wbEAgCYpz5dVOl3S1Lkp9W7u1cjOERU3FNOuFtDWEoc3M7tZ0k9LesDdnzKzfe7+YAPqsE3Si7HHZyX9SIJ9tiU8FgDQBCdfOKnBY4OqeEVzF+dU6C7o4ImDmtg7of7t/WlXD2hbq2k2/QVJvyTpZ8zsxyW9vUF1qLUeRfUsefX2SXJs8ARm+8xs0swmZ2dnV1lFAEBceb6swWODKi+UNXdxTpI0d3FO5YWg/MLChZRrCLSv1YS3WXf/R3f/RUm3SPqXDarDWUnXxB5fLWkm4T5JjpUkufuD7t7n7n1btmxZd6UBoJOVTpdU8UrNbRWvqHSq1OIaAZ1jNeHts9Eddz8k6VMNqsPjknrN7Doz65F0u6TjVfscl/Rz4ajTd0r6jru/lPBYAECDTZ2bunTFrdrcxTlNn59ucY2AzrFieDOz/2Zm5u6PxMvd/b83ogLu/qqkeySdkHRG0sPuftrM7jazu8PdJiQ9J2la0scVNOHWPbYR9QIA1Ne7uVeF7kLNbYXugnZs2tHiGgGdw3yFRVjN7DckvU3SiLv/k5ndIulX3f1drahgM/T19fnk5GTa1QCA3CrPl7Xt/m0qL5SXbCv2FDVz74w29mxMoWZAvpnZE+7et9w+K155c/f/LOnTkv7KzE5KulcSk+ECQAcrbihqYu+Eij3FS1fgCt0FFXuCcoIbGsJdGhsLfiYp7xArThViZj8h6d9LmpO0VdJd7v5ssysGAMi2/u39mrl3RqVTJU2fn9aOTTs0smuE4IbGGR+Xhoel/fuDxePNgsB24IB05EiwyHyO1iRtlCTzvP2ypP/i7ifN7AcllczsoLv/RZPrBgDIuI09G3XXDXelXQ20q6GhILgdORI8Pnz4cnDbvz/Y3oFWDG/u/uOx+181s9skfUbSjzWzYgAAoMOZBYFNCgJbFOLiV+I60IoDFmoeZPYv3P3/NaE+LcGABQAAcsRd6op1069U2ja4NWTAQi15Dm4AsB7l+bKOPnlU9z16n44+eVTl+aWjLQE0UNTHLe7AgY4drCBlY2F6AMgF1vIEWiw+OCFqKo0eSx3bdLqmK28A0GlYyxNIwfj44uAW9YGLBjGMj6ddw1QQ3gAgAdbyBFIwNBRMBxK/whYFuNFRRpsCAOpjLU8gBWa153GrV94huPIGAAmwlieArCC8AUACIztH1GW1f2V2WZdGdo20uEYAOhXNpgCQQLSWZ/Vo0y7rYi3PBivPl1U6XdLUuSn1bu7VyM4RFTcU064WkBlrmqQ371o2Sa97MBJmaGjxUOZ65QAy78LCBdbybKJa07FEAZnpWNAJkkzSS3hrprExFtQFgITK82Vtu3+bygtLJz4u9hQ1c+8MQRltr2krLCCh+IK60WzQLKgLADUxHQuQDH3emokFdQEgMaZjAZLhyluzxQNchOAGAEswHQuQDOGt2VhQFwASYToWIBnCWzNV93GrVJb2gQMASLo8HUuxp3jpClyhu6BiT5HpWIAY+rw1U70FdaWgfGCA0abVmF4F6Gj92/s1c+8M07EAy2CqkGYiiKwe06sAADoYU4WkLVo4tzqg1SsH06sAQCdzD/6Ir76wVK+8QxHekC1R03IU4Lq6ljY9AwDa0/h40PoS7xce/RE/PBxsB+ENGcT0KgDQmWh9SYTwhuxhepXsokkDQDPR+pII4Q2tk+SLn+lVso0mDQDNRuvLilINb2a2ycweNbOp8OcbauxzjZn9pZmdMbPTZrY/tu0jZvZNM3sqvA229hVkSB6uiCT54q83vUoU4AgH6aJJA0Cz0fqyMndP7SbpdyQdCu8fkvTbNfbZKumG8H5R0tclXR8+/oikX1ztv/uOd7zD287oaH
Ddav9+90olKKtUgsdSsD1t8fpE9ax+XKkEdY1eQ/zYWuVovfj/W3SLv+8AYK2SfE+0OUmTvlJ+WmmHZt4kPStpq18Oac8mOOYRSTc74W2xvLzh2/WLv9NCZ6Wy+P+w3V4fgHTk4UJEkyUJb2n3eXuTu78kSeHPNy63s5ldK+mHJf1trPgeM3vGzD5Rq9m1Y+Slk2e79mXopL5g0euKo0kDQCMMDQWTsce/F6LvjdFRumaEmh7ezOwLZnaqxm33Kp9no6TPSPqwu383LP6YpO+X9HZJL0n6vWWO32dmk2Y2OTs7u8ZXk3F5CEbt+sXfKX3Bql8XA0oANBKT2yez0qW5Zt6UsNlUUrekE5IOLvNc10o6leTfbctmU/fsN0nmpWl3rbJ+/huBJg0AaCrloNn0uKQ7w/t3KujPtoiZmaQ/knTG3e+v2rY19nCPpFNNqmf25eGKSLuPJM3Dlc/1okkDAFKX6sL0ZrZZ0sOStkt6QdIH3f28mb1Z0lF3HzSzfkl/Lemrkirhof/J3SfM7E8UNJm6pOcl/byHfeiW07KF6VspDwu6uwcBbWhocaCpV5438fMdyVqfQwBApiVZmD7V8JaWtgxv7R6Msq76yufhw0sfc/4BACtIEt6ubFVl0GRRZ86k5Wisek3CUlA+MMD/AwCgIdLu8wa0B/qCAcgaz8HKO1gTwhvQCAxvB5A1nTT/ZIeh2RQAgHYUn39SWtoXlxaB3CK8AQDQjqr73kYhjkFUucdoUwAA2pl7sGRipFIhuGVYktGm9HkDAKBdteuShB2O8AYAQDvKw8o7WBP6vAEA0I6Yf7JtEd4AAGhH0fyT8RV2ogA3MMBo0xwjvKHtlOfLKp0uaerclHo392pk54iKG4ppVwsAWouVd9oW4Q1t5eQLJzV4bFAVr2ju4pwK3QUdPHFQE3sn1L+9P+3qAQCwbgxYQNsoz5c1eGxQ5YWy5i7OSZLmLs6pvBCUX1i4kHINAQBYP8Ib2kbpdEkVr9TcVvGKSqdKLa4RAACNR3hD25g6N3Xpilu1uYtzmj4/3eIaAQDQeIQ3tI3ezb0qdBdqbit0F7Rj044W1wgAgMYjvKFtjOwcUZfVfkt3WZdGdo20uEYAADQe4Q1to7ihqIm9Eyr2FC9dgSt0F1TsCco39mxMuYYAAKwfU4WgrfRv79fMvTMqnSpp+vy0dmzaoZFdIwQ3AB2BeS47g3kHrm3W19fnk5OTaVejtdyDpVLiM20vVw4AyJVa81x2WRfzXOaMmT3h7n3L7UOzaacYH5eGhxcvRhwtWjw8HGwHAOQS81x2FsJbpxgaChYnPnLkcoA7cODyosWscQcAucU8l52FPm+dIlqMWAoC25Ejwf39+4NymkwBILeY57KzcOWtk8QDXITgBgC5xzyXnYXw1kmiptK4eB84AEAuMc9lZyG8dYrqPm6VytI+cACAXGKey85Cn7dOMT5+ObhFTaXxPnADA9KePa2tE9OXAEDDMM9l50h1njcz2ySpJOlaSc9L+ml3/3aN/Z6XVJb0mqRXo/lPkh5fjXneMhKUxsaCaUrigTJ+hXB0tPWBEgCAFOVhnrdDkh5z915Jj4WP63mPu7+96gWt5vjOZhYEoeqAVq+8FZi+BACAVUu72XS3pJvC+5+U9EVJ97XweKSJ6UsAAFi1tJtN/9HdXx97/G13f0ON/f5e0rcluaT/4e4Prub4ah3ZbJpl7lJX7CJwpUJwAwB0pEw0m5rZF8zsVI3b7lU8zbvc/QZJt0n6kJm9ew312Gdmk2Y2OTs7u9rD0SxMXwIAwKo0Pby5+3vdfVeN2yOSXjazrZIU/nylznPMhD9fkTQm6cZwU6Ljw2MfdPc+d+/bsmVL414g1o7pSwAAWLW0Bywcl3RneP9OSY9U72BmBTMrRvcl3SLpVNLjkWH1pi+JAtz4eNo1BAAgc9Lu87ZZ0sOStkt6QdIH3f28mb1Z0lF3HzSztyi42iYFAyz+p7v/5nLHr/Tv0uctI7
I4fQkAAClK0uct1fCWFsIbAACoKeULC5kYsAAAAJAb4+PBBPLxvtdRH+3h4Ux06Ul7njcAGVOeL6t0uqSpc1Pq3dyrkZ0jKm4opl0tAGiN+ATyUtAXO2MTyNNsCuCSky+c1OCxQVW8ormLcyp0F9RlXZrYO6H+7f1pVw8AWiM+G0KkRRPI0+etDsIbsFR5vqxt929TeaG8ZFuxp6iZe2dY4BpA50hpAnn6vAFIrHS6pIpXam6reEWlU6UW1wgAUpLxCeQJbwAkSVPnpjR3ca7mtrmLc5o+P93iGgFACnIwgTzhrVHcpbGxpf+p9cqBjOnd3KtCd6HmtkJ3QTs27WhxjQAgBTmYQJ7w1ig5GFoMLGdk54i6rPavhC7r0siukRbXCABSMDQkjY4uHpwQBbjR0UyMNiW8NUp8aHEU4DI2tBhYTnFDURN7J1TsKV66AlfoLqjYE5QzWAFARzCT9uxZOjihXnkKGG3aSCkOLQYa5cLCBZVOlTR9flo7Nu3QyK4RglsjsSwcgGUwVUgdTZ0qJKWhxQByYmws6EoR/8Mu/off6Gjw1z2yiwCOJmKqkFbL+NBiABlAF4vGSmOwGH2ckTLCW6PkYGgxgAyoHrnW1bV0ZBuSSyNIEcCRMppNG4WmEACrQReLxqgOTtXrUDYrENPHGU1Cn7c6mhLe6AMBICm++BsrrfNJAEcT0OetlXIwtBhABtDFovGipui4VgQ3+jgjJYQ3AGilHMzenjutDlIEcKSM8AYArZSD2dtzJY0gRQBHyujzBgDIrzQGi9HHGU3EgIU6CG8A0CYIUmgzScLbla2qDAAADRcNCktaDrQB+rwBAADkCOENAAC0VhrLmrURwhsAAGgt1oddF/q8AQCA1oqvDystXdaMKXOWRXgDAACtFV8V48iRyyGOZeISYaoQAACQDtaHXSLza5ua2SYze9TMpsKfb6ixz1vN7KnY7btm9uFw20fM7JuxbYOtfxUAAGDVWB92zdIesHBI0mPu3ivpsfDxIu7+rLu/3d3fLukdkv5J0lhsl8PRdnefaEmtAQDA2rE+7Lqk3edtt6SbwvuflPRFSfcts/9PSPqGu/9Dc6sFAACapt76sFJQPjDAJMvLSPvK25vc/SVJCnBYF1gAAAxZSURBVH++cYX9b5f06aqye8zsGTP7RK1mVwAAkDFDQ8G6s/HBCVGAGx1ltOkKmh7ezOwLZnaqxm33Kp+nR9L7Jf2vWPHHJH2/pLdLeknS7y1z/D4zmzSzydnZ2TW8EmAVmIASqI3PBqTLy5dVD06oV45Fmh7e3P297r6rxu0RSS+b2VZJCn++ssxT3SbpSXd/OfbcL7v7a+5ekfRxSTcuU48H3b3P3fu2bNnSmBcH1MMElEBtfDaAdUu72fS4pDvD+3dKemSZfe9QVZNpFPxCeySdamjtgLWKT0AZfUkxASU6WXRlbffuxZ+NSkV63/v4bACrkOo8b2a2WdLDkrZLekHSB939vJm9WdJRdx8M93udpBclvcXdvxM7/k8UNJm6pOcl/XzUh245zPOGlogHtggTUKJTjY0FV9b275fuv186eHDxZ+Mnf1L60z/ls4GOl2SeNybpBZqJCSiBQPXV5/vvl6644vL2115b/FkBOlTmJ+kF2hoTUAKXRSMJoybTeHCTgitxfDaARAhvQDMwASWwlFlwxS3utdf4bACrlPYkvUB7YgJKYCl36f3vX1x28ODlQMdnA0iE8AY0QzQB5dDQ0gkoBwYYUYfOE12N/uxng8EJx48vHrRw//18NoCECG9AM0QTTSYtB9odV6OBhiG8AQCaj6vRQMMQ3gAAzcfVaKBhGG0KAACQI4Q3AACAHCG8AUAzRGt5Vs9bVq8cABIivAFAM4yPB2t5xieejabLGB4OtgPAGjBgAQCaYWjo8soBUjCqMr7qBqMrAawR4Q0AmqF6HrMoxMXnOQOANaDZFACaJR7gInkIbvTXAzKN8AYAzRL1cYvLw+Lr9NcDMo3wBg
DNEIWdqI9bpXK5D1zWA1y8v15UV/rrAZlBnzcAaIY8r+VJfz0g08yz/Ndfk/T19fnk5GTa1QDQztyDABdfy3O58ixyl7piDTSVSvbrjNZoh/d3RpnZE+7et9w+NJsCQDNEa3ZWf4HVK8+avPbXQ2vQLzJVhDcAwGJ57q+H1mhmv0hGO6+I8AYAWKxef73oy5qrKqh+T3R1LX3PrBVX9VZEnzcAwGL0Z0JSzegXWX0Vr3p1kjYfNJOkzxujTQEAi0X98pKWozPV6xe53nDFaOcV0WwKAJ2CvkRolGb3i8zr6iQtQngDgE5BXyI0SrP7RTLaeVmENwDoFKycgEYZGpJGRxdfDYsC3Ojo+kebMtp5WQxYAIBOEv9ijNCXCFkyNhZcCY6/L+Pv29HRtu57mWTAAuENALKglSM8WTkBWdbho50zv8KCmX3QzE6bWcXM6lbUzG41s2fNbNrMDsXKN5nZo2Y2Ff58Q2tqDgAN1qr+aPQlQtblfXWSFki7z9spScOSvlRvBzO7QtIDkm6TdL2kO8zs+nDzIUmPuXuvpMfCxwCQP63oj0ZfIqAtpDrPm7ufkSRbPkXfKGna3Z8L931I0m5JXwt/3hTu90lJX5R0X3NqCwBN1Iq5reqNEIz+zYGBtu5LBLSLtK+8JbFN0ouxx2fDMkl6k7u/JEnhzzfWexIz22dmk2Y2OTs727TKAsCaNXtuq2aOEATQMk0Pb2b2BTM7VeO2O+lT1Chb9bV9d3/Q3fvcvW/Lli2rPRwAmq/Z/dHoSwS0haY3m7r7e9f5FGclXRN7fLWkmfD+y2a21d1fMrOtkl5Z578FAOlYbj1Hiak8AFySh2bTxyX1mtl1ZtYj6XZJx8NtxyXdGd6/U9IjKdQPANav2TPWA2gbaU8VssfMzkr6UUmfNbMTYfmbzWxCktz9VUn3SDoh6Yykh939dPgUvyXpZjObknRz+BgA8of+aAASYpJeAACAjMj8JL0AAABYHcIbAABAjhDeAAAAcoTwBgAAkCOENwAAgBwhvAEAAOQI4Q0AACBHCG8AAAA50pGT9JrZrKR/aNDTXSXpWw16rjzjPHAOIpyHAOeBcxDhPHAOIknOw/e5+5bldujI8NZIZja50kzInYDzwDmIcB4CnAfOQYTzwDmINOo80GwKAACQI4Q3AACAHCG8rd+DaVcgIzgPnIMI5yHAeeAcRDgPnINIQ84Dfd4AAAByhCtvAAAAOUJ4S8DMPmhmp82sYmZ1R4mY2a1m9qyZTZvZoVj5JjN71Mymwp9vaE3NGyvJ6zCzt5rZU7Hbd83sw+G2j5jZN2PbBlv/KtYn6f+lmT1vZl8NX+fkao/PuoTvhWvM7C/N7Ez4+dkf25bb90K9z3lsu5nZR8Ptz5jZDUmPzZME52Fv+PqfMbMvm9nbYttqfj7yJsE5uMnMvhN7n/9K0mPzJMF5+KXYOThlZq+Z2aZwW7u8Fz5hZq+Y2ak62xv7e8Hdua1wk/QDkt4q6YuS+ursc4Wkb0h6i6QeSU9Luj7c9juSDoX3D0n67bRf0xrPw6peR3hO/q+COWsk6SOSfjHt19GKcyDpeUlXrfccZvWW5HVI2irphvB+UdLXY5+JXL4Xlvucx/YZlPQ5SSbpnZL+NumxebklPA8/JukN4f3bovMQPq75+cjTLeE5uEnSn63l2LzcVvtaJL1P0l+003shfB3vlnSDpFN1tjf09wJX3hJw9zPu/uwKu90oadrdn3P3BUkPSdodbtst6ZPh/U9KGmpOTZtuta/jJyR9w90bNSFyFqz3/7Jj3gvu/pK7PxneL0s6I2lby2rYHMt9ziO7JX3KA1+R9Hoz25rw2LxY8bW4+5fd/dvhw69IurrFdWy29fx/dtR7ocodkj7dkpq1kLt/SdL5ZXZp6O8FwlvjbJP0YuzxWV3+onqTu78kBV9okt7Y4ro1ympfx+1a+iG9J7xk/ImcNhkmPQcu6fNm9oSZ7VvD8Vm3qtdhZtdK+mFJfxsrzu
N7YbnP+Ur7JDk2L1b7Wu5ScNUhUu/zkSdJz8GPmtnTZvY5M9u5ymPzIPFrMbPXSbpV0mdixe3wXkiiob8Xrmxo1XLMzL4g6XtrbPpld38kyVPUKMvdUN7lzsMqn6dH0vsl/cdY8cck/bqC8/Lrkn5P0r9bW02bp0Hn4F3uPmNmb5T0qJn9n/Avs9xo4Htho4Jf1h929++Gxbl4L9SQ5HNeb5+2+B0RSvxazOw9CsJbf6w4958PJTsHTyroNnIh7Nc5Lqk34bF5sZrX8j5Jf+Pu8StU7fBeSKKhvxcIbyF3f+86n+KspGtij6+WNBPef9nMtrr7S+Fl0lfW+W81zXLnwcxW8zpuk/Sku78ce+5L983s45L+rBF1brRGnAN3nwl/vmJmYwoujX9JHfZeMLNuBcHtmLuPxp47F++FGpb7nK+0T0+CY/MiyXmQmf2QpKOSbnP3c1H5Mp+PPFnxHMT+WJG7T5jZH5jZVUmOzZHVvJYlrTFt8l5IoqG/F2g2bZzHJfWa2XXhVafbJR0Ptx2XdGd4/05JSa7kZdFqXseSfg3hl3xkj6Sao3IybsVzYGYFMytG9yXdosuvtWPeC2Zmkv5I0hl3v79qW17fC8t9ziPHJf1cOLrsnZK+EzYtJzk2L1Z8LWa2XdKopJ9196/Hypf7fORJknPwveHnQGZ2o4Lv3HNJjs2RRK/FzL5H0oBivyva6L2QRGN/L6Q9QiMPNwVfLmclzUt6WdKJsPzNkiZi+w0qGFH3DQXNrVH5ZkmPSZoKf25K+zWt8TzUfB01zsPrFPyC+p6q4/9E0lclPRO+Obem/ZqacQ4UjBp6Oryd7tT3goJmMg//v58Kb4N5fy/U+pxLulvS3eF9k/RAuP2rio1Qr/c7Io+3BOfhqKRvx/7vJ8Pyup+PvN0SnIN7wtf4tIJBGz/Wie+F8PG/lfRQ1XHt9F74tKSXJF1UkBfuaubvBVZYAAAAyBGaTQEAAHKE8AYAAJAjhDcAAIAcIbwBAADkCOENAAAgRwhvAAAAOUJ4AwAAyBHCGwCsgpn9pZndHN7/DTP7aNp1AtBZWNsUAFbnVyX9WriQ9g9Len/K9QHQYVhhAQBWycz+StJGSTe5e9nM3iLplxUsCfeBdGsHoN3RbAoAq2BmPyhpq6R5dy9Lkrs/5+53pVszAJ2C8AYACZnZVknHJO2WNGdm/yrlKgHoQIQ3AEjAzF4naVTSve5+RtKvS/pIqpUC0JHo8wYA62RmmyX9pqSbJR119/+acpUAtDHCGwAAQI7QbAoAAJAjhDcAAIAcIbwBAADkCOENAAAgRwhvAAAAOUJ4AwAAyBHCGwAAQI4Q3gAAAHKE8AYAAJAj/x84am1cPs4OtwAAAABJRU5ErkJggg==\n",
|
||
"text/plain": [
|
||
"<Figure size 691.2x388.8 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"fig = plot_data_for_classification(X2, Y2, xlabel=r'$x_1$', ylabel=r'$x_2$')"
|
||
]
|
||
},
|
||
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "def safeSigmoid(x, eps=0):\n",
    "    \"\"\"Sigmoid function modified so that its values always\n",
    "    stay at least eps away from the asymptotes\n",
    "    \"\"\"\n",
    "    y = 1.0/(1.0 + np.exp(-x))\n",
    "    if eps > 0:\n",
    "        y[y < eps] = eps\n",
    "        y[y > 1 - eps] = 1 - eps\n",
    "    return y\n",
    "\n",
    "def h(theta, X, eps=0.0):\n",
    "    \"\"\"Hypothesis function (logistic regression)\"\"\"\n",
    "    return safeSigmoid(X*theta, eps)\n",
    "\n",
    "def J(h,theta,X,y, lamb=0):\n",
    "    \"\"\"Cost function for logistic regression\"\"\"\n",
    "    m = len(y)\n",
    "    f = h(theta, X, eps=10**-7)\n",
    "    j = -np.sum(np.multiply(y, np.log(f)) +\n",
    "                np.multiply(1 - y, np.log(1 - f)), axis=0)/m\n",
    "    if lamb > 0:\n",
    "        j += lamb/(2*m) * np.sum(np.power(theta[1:],2))\n",
    "    return j\n",
    "\n",
    "def dJ(h,theta,X,y,lamb=0):\n",
    "    \"\"\"Gradient of the cost function\"\"\"\n",
    "    g = 1.0/y.shape[0]*(X.T*(h(theta,X)-y))\n",
    "    if lamb > 0:\n",
    "        g[1:] += lamb/float(y.shape[0]) * theta[1:]\n",
    "    return g\n",
    "\n",
    "def classifyBi(theta, X):\n",
    "    \"\"\"Prediction function for binary classification (returns the probability of class 1)\"\"\"\n",
    "    prob = h(theta, X)\n",
    "    return prob"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "def GD(h, fJ, fdJ, theta, X, y, alpha=0.01, eps=10**-3, maxSteps=10000):\n",
    "    \"\"\"Gradient descent for logistic regression\"\"\"\n",
    "    errorCurr = fJ(h, theta, X, y)\n",
    "    errors = [[errorCurr, theta]]\n",
    "    while True:\n",
    "        # compute the new theta\n",
    "        theta = theta - alpha * fdJ(h, theta, X, y)\n",
    "        # record the error level\n",
    "        errorCurr, errorPrev = fJ(h, theta, X, y), errorCurr\n",
    "        # stopping criteria\n",
    "        if abs(errorPrev - errorCurr) <= eps:\n",
    "            break\n",
    "        if len(errors) > maxSteps:\n",
    "            break\n",
    "        errors.append([errorCurr, theta])\n",
    "    return theta, errors"
   ]
  },
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"theta = [[ 1.37136167]\n",
|
||
" [ 0.90128948]\n",
|
||
" [ 0.54708112]\n",
|
||
" [-5.9929264 ]\n",
|
||
" [ 2.64435168]\n",
|
||
" [-4.27978238]]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Run gradient descent for logistic regression\n",
|
||
"theta_start = np.matrix(np.zeros(X2.shape[1])).reshape(X2.shape[1],1)\n",
|
||
"theta, errors = GD(h, J, dJ, theta_start, X2, Y2, \n",
|
||
" alpha=0.1, eps=10**-7, maxSteps=10000)\n",
|
||
"print('theta = {}'.format(theta))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
"def plot_decision_boundary(fig, theta, X):\n",
"    \"\"\"Plot the decision boundary between the classes\"\"\"\n",
"    ax = fig.axes[0]\n",
"    xx, yy = np.meshgrid(np.arange(-1.0, 1.0, 0.02),\n",
"                         np.arange(-1.0, 1.0, 0.02))\n",
"    l = len(xx.ravel())\n",
"    C = powerme(xx.reshape(l, 1), yy.reshape(l, 1), n)\n",
"    z = classifyBi(theta, C).reshape(int(np.sqrt(l)), int(np.sqrt(l)))\n",
"\n",
"    # contour takes `linewidths`; the `lw` shorthand is not a contour keyword\n",
"    ax.contour(xx, yy, z, levels=[0.5], linewidths=3)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"Y_expected = Y2.astype(int)\n",
|
||
"Y_predicted = (classifyBi(theta, X2) > 0.5).astype(int)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
"# Prepare the interactive plot\n",
|
||
"\n",
|
||
"dropdown_highlight = widgets.Dropdown(options=['all', 'tp', 'fp', 'tn', 'fn'], value='all', description='highlight')\n",
|
||
"\n",
|
||
"def interactive_classification(highlight):\n",
|
||
" fig = plot_data_for_classification(X2, Y2, xlabel=r'$x_1$', ylabel=r'$x_2$',\n",
|
||
" Y_predicted=Y_predicted, highlight=highlight)\n",
|
||
" plot_decision_boundary(fig, theta, X2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"application/vnd.jupyter.widget-view+json": {
|
||
"model_id": "6325cec10a034a9d96d862dee900013d",
|
||
"version_major": 2,
|
||
"version_minor": 0
|
||
},
|
||
"text/plain": [
|
||
"interactive(children=(Dropdown(description='highlight', options=('all', 'tp', 'fp', 'tn', 'fn'), value='all'),…"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<function __main__.interactive_classification(highlight)>"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"widgets.interact(interactive_classification, highlight=dropdown_highlight)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"source": [
"The classification task in the example above is to assign each point to one of two categories:\n",
" 0. <font color=\"red\">red crosses</font>\n",
" 1. <font color=\"green\">green circles</font>\n",
"\n",
"Logistic regression was used for this purpose.\n",
"\n",
"The resulting model divides the plane into two regions:\n",
" 0. <font color=\"red\">outside the navy-blue curve</font>\n",
" 1. <font color=\"green\">inside the navy-blue curve</font>\n",
"\n",
"The model predicts class <font color=\"red\">0 (“red”)</font> for points outside the curve and class <font color=\"green\">1 (“green”)</font> for points inside the curve."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"source": [
"All observations can therefore be divided into four groups:\n",
" * **true positives (TP)** – correctly classified positive examples (<font color=\"green\">green circles</font> in the <font color=\"green\">inner region</font>)\n",
" * **true negatives (TN)** – correctly classified negative examples (<font color=\"red\">red crosses</font> in the <font color=\"red\">outer region</font>)\n",
" * **false positives (FP)** – negative examples classified as positive (<font color=\"red\">red crosses</font> in the <font color=\"green\">inner region</font>)\n",
" * **false negatives (FN)** – positive examples classified as negative (<font color=\"green\">green circles</font> in the <font color=\"red\">outer region</font>)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"source": [
"In other words:\n",
|
||
"\n",
|
||
"<img width=\"50%\" src=\"https://blog.aimultiple.com/wp-content/uploads/2019/07/positive-negative-true-false-matrix.png\">"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "skip"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"TP = 5\n",
|
||
"TN = 35\n",
|
||
"FP = 3\n",
|
||
"FN = 6\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Compute TP, TN, FP and FN\n",
|
||
"\n",
|
||
"tp = 0\n",
|
||
"tn = 0\n",
|
||
"fp = 0\n",
|
||
"fn = 0\n",
|
||
"\n",
|
||
"for i in range(len(Y_expected)):\n",
|
||
" if Y_expected[i] == 1 and Y_predicted[i] == 1:\n",
|
||
" tp += 1\n",
|
||
" elif Y_expected[i] == 0 and Y_predicted[i] == 0:\n",
|
||
" tn += 1\n",
|
||
" elif Y_expected[i] == 0 and Y_predicted[i] == 1:\n",
|
||
" fp += 1\n",
|
||
" elif Y_expected[i] == 1 and Y_predicted[i] == 0:\n",
|
||
" fn += 1\n",
|
||
" \n",
|
||
"print('TP =', tp)\n",
|
||
"print('TN =', tn)\n",
|
||
"print('FP =', fp)\n",
|
||
"print('FN =', fn)"
|
||
]
|
||
},
|
||
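The explicit loop above can also be written with boolean masks. The following is a minimal sketch, assuming binary 0/1 labels; the function name `confusion_counts` and the toy label vectors are hypothetical and are not part of the notebook.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary 0/1 labels."""
    # Flatten to 1-D boolean arrays so column vectors and lists both work
    y_true = np.asarray(y_true).ravel().astype(bool)
    y_pred = np.asarray(y_pred).ravel().astype(bool)
    tp = int(np.sum(y_true & y_pred))      # positive, predicted positive
    tn = int(np.sum(~y_true & ~y_pred))    # negative, predicted negative
    fp = int(np.sum(~y_true & y_pred))     # negative, predicted positive
    fn = int(np.sum(y_true & ~y_pred))     # positive, predicted negative
    return tp, tn, fp, fn

# Toy labels, made up for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print('TP =', tp, 'TN =', tn, 'FP =', fp, 'FN =', fn)
```

With the notebook's variables, the call would presumably be `confusion_counts(Y_expected, Y_predicted)`.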
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"We can now define the following metrics:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
"#### Accuracy\n",
"$$ \\mbox{accuracy} = \\frac{\\mbox{correctly classified cases}}{\\mbox{all cases}} = \\frac{TP + TN}{TP + TN + FP + FN} $$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"Accuracy is obtained by dividing the number of correctly classified cases by the total number of cases:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Accuracy: 0.8163265306122449\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"accuracy = (tp + tn) / (tp + tn + fp + fn)\n",
|
||
"print('Accuracy:', accuracy)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"**Note:** Accuracy is not always a good measure, especially when the classes are strongly imbalanced!\n",
"\n",
"*Example:* Imagine a coronavirus test that **always** returns a negative result. How useful would such a test be in practice? Not at all. And what would its *accuracy* be? Let us calculate:\n",
"$$ \\mbox{accuracy} \\, = \\, \\frac{\\mbox{estimated number of uninfected people in the world}}{\\mbox{population of the Earth}} \\, \\approx \\, \\frac{7\\,700\\,000\\,000 - 600\\,000}{7\\,700\\,000\\,000} \\, \\approx \\, 0.99992 $$\n",
"(rounded figures as of 27 March 2020)\n",
"\n",
"The result is so high because the vast majority of people in the world are not infected, so for a randomly chosen person we can safely guess that they do not have the coronavirus.\n",
"\n",
"In this case the large difference in the sizes of the two classes (infected/uninfected) makes *accuracy* a poor metric.\n",
"\n",
"That is why other metrics are also available:"
|
||
]
|
||
},
|
||
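The effect described above is easy to reproduce on synthetic labels. The sketch below assumes a hypothetical data set of 100,000 cases with only 10 positives and a "classifier" that always answers negative; the numbers are made up for illustration and are unrelated to the epidemiological figures quoted above.

```python
import numpy as np

n_total = 100_000
n_positive = 10                          # hypothetical, strongly imbalanced data set

y_true = np.zeros(n_total, dtype=int)
y_true[:n_positive] = 1                  # the few actual positives
y_pred = np.zeros(n_total, dtype=int)    # the "test" always answers: negative

tp = int(np.sum((y_true == 1) & (y_pred == 1)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)                  # fraction of positives that were found

print('Accuracy:', accuracy)
print('Recall:', recall)
```

Accuracy is close to 1 even though not a single positive case is detected, which is exactly what recall exposes.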
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
"#### Precision\n",
|
||
"$$ \\mbox{precision} = \\frac{TP}{TP + FP} $$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Precision: 0.625\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"precision = tp / (tp + fp)\n",
|
||
"print('Precision:', precision)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"Precision tells us what fraction of the examples classified as positive are actually positive."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
"#### Recall (sensitivity)\n",
|
||
"$$ \\mbox{recall} = \\frac{TP}{TP + FN} $$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Recall: 0.45454545454545453\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"recall = tp / (tp + fn)\n",
|
||
"print('Recall:', recall)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"Recall tells us what fraction of the positive examples were correctly classified as positive."
|
||
]
|
||
},
|
||
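Precision and recall can be wrapped in a small helper. This is a sketch, not the notebook's code: the function name `precision_recall` is hypothetical, and returning 0.0 when a denominator is zero is one possible convention (assumed here), since the formulas are undefined in that case. The counts reuse the values printed earlier in this section (TP = 5, FP = 3, FN = 6).

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall); 0.0 is used when a denominator is zero (assumed convention)."""
    predicted_positive = tp + fp   # everything the model labelled as positive
    actual_positive = tp + fn      # everything that really is positive
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    return precision, recall

precision, recall = precision_recall(tp=5, fp=3, fn=6)
print('Precision:', precision)
print('Recall:', recall)

# A model that never predicts the positive class has no defined precision;
# with the convention above both values are reported as 0.0
print(precision_recall(tp=0, fp=0, fn=4))
```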
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"#### *$F$-measure* (*$F$-score*)\n",
|
||
"$$ F = \\frac{2 \\cdot \\mbox{precision} \\cdot \\mbox{recall}}{\\mbox{precision} + \\mbox{recall}} $$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"F-score: 0.5263157894736842\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"fscore = (2 * precision * recall) / (precision + recall)\n",
|
||
"print('F-score:', fscore)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"The $F$-_measure_ is a compromise between precision and recall (more precisely, it is the harmonic mean of the two)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"source": [
"The $F$-_measure_ is a special case of a more general measure:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"source": [
|
||
"*$F_\\beta$-measure*:\n",
"$$ F_\\beta = \\frac{(1 + \\beta^2) \\cdot \\mbox{precision} \\cdot \\mbox{recall}}{\\beta^2 \\cdot \\mbox{precision} + \\mbox{recall}} $$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
"For $\\beta = 1$ we get:\n",
"$$ F_1 \\, = \\, \\frac{(1 + 1^2) \\cdot \\mbox{precision} \\cdot \\mbox{recall}}{1^2 \\cdot \\mbox{precision} + \\mbox{recall}} \\, = \\, \\frac{2 \\cdot \\mbox{precision} \\cdot \\mbox{recall}}{\\mbox{precision} + \\mbox{recall}} \\, = \\, F $$"
|
||
]
|
||
},
|
||
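The general measure can be sketched as a small function. It assumes the standard definition with the factor $(1 + \beta^2)$ in the numerator; the function name `f_beta` is hypothetical, and returning 0.0 for a zero denominator is an assumed convention. The precision and recall values are the ones computed earlier in this section.

```python
def f_beta(precision, recall, beta=1.0):
    """F_beta score: weighted harmonic mean of precision and recall."""
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # assumed convention when both precision and recall are 0
    return (1 + b2) * precision * recall / denominator

precision, recall = 0.625, 5 / 11   # values from the example above

f1 = f_beta(precision, recall)                 # beta = 1: the ordinary F-measure
f2 = f_beta(precision, recall, beta=2.0)       # weights recall more heavily
f_half = f_beta(precision, recall, beta=0.5)   # weights precision more heavily

print('F1:', f1)
print('F2:', f2)
print('F0.5:', f_half)
```

Values of $\beta$ greater than 1 give more weight to recall and values below 1 give more weight to precision. Here recall is the lower of the two, so $F_2 < F_1 < F_{0.5}$.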
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
"## 4.3. Outliers"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"**Outliers** are observations with atypical values.\n",
"\n",
"They may result, for example, from a measurement error or from a mistake made when entering data into a database, but these are not the only possible causes.\n",
"\n",
"Outliers can sometimes significantly affect the model parameters, so it is important to discard such observations before building the model."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"The example below shows how outliers affect the fitted model, using apartment price data collected from listings on the Gratka.pl portal. Here an outlier could be, for example, a listing in which the price was given in thousands of PLN instead of in PLN."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
"# Helper functions\n",
"\n",
"def h_linear(Theta, x):\n",
"    \"\"\"Linear regression function\"\"\"\n",
"    return x * Theta\n",
"\n",
"def linear_regression(theta):\n",
"    \"\"\"Return the linear regression function for the given parameter vector theta\"\"\"\n",
"    return lambda x: h_linear(theta, x)\n",
"\n",
"def cost(theta, X, y):\n",
"    \"\"\"Cost function (matrix version)\"\"\"\n",
"    m = len(y)\n",
"    J = 1.0 / (2.0 * m) * ((X * theta - y).T * (X * theta - y))\n",
"    return J.item()\n",
"\n",
"def gradient(theta, X, y):\n",
"    \"\"\"Gradient of the cost function (matrix version)\"\"\"\n",
"    return 1.0 / len(y) * (X.T * (X * theta - y)) \n",
"\n",
"def gradient_descent(fJ, fdJ, theta, X, y, alpha=0.1, eps=10**-5):\n",
"    \"\"\"Gradient descent algorithm (matrix version)\"\"\"\n",
"    current_cost = fJ(theta, X, y)\n",
"    logs = [[current_cost, theta]]\n",
"    while True:\n",
"        theta = theta - alpha * fdJ(theta, X, y)\n",
"        current_cost, prev_cost = fJ(theta, X, y), current_cost\n",
"        if abs(prev_cost - current_cost) > 10**15:\n",
"            print('Algorithm does not converge!')\n",
"            break\n",
"        if abs(prev_cost - current_cost) <= eps:\n",
"            break\n",
"        logs.append([current_cost, theta]) \n",
"    return theta, logs\n",
"\n",
"def plot_data(X, y, xlabel, ylabel):\n",
"    \"\"\"Plot the data (matrix version)\"\"\"\n",
"    fig = plt.figure(figsize=(16*.6, 9*.6))\n",
"    ax = fig.add_subplot(111)\n",
"    fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
"    ax.scatter([X[:, 1]], [y], c='r', s=50, label='Data')\n",
"    \n",
"    ax.set_xlabel(xlabel)\n",
"    ax.set_ylabel(ylabel)\n",
"    ax.margins(.05, .05)\n",
"    plt.ylim(y.min() - 1, y.max() + 1)\n",
"    plt.xlim(np.min(X[:, 1]) - 1, np.max(X[:, 1]) + 1)\n",
"    return fig\n",
"\n",
"def plot_regression(fig, fun, theta, X):\n",
"    \"\"\"Plot the regression line (matrix version)\"\"\"\n",
"    ax = fig.axes[0]\n",
"    x0 = np.min(X[:, 1]) - 1.0\n",
"    x1 = np.max(X[:, 1]) + 1.0\n",
"    L = [x0, x1]\n",
"    LX = np.matrix([1, x0, 1, x1]).reshape(2, 2)\n",
"    ax.plot(L, fun(theta, LX), linewidth=2,\n",
"            label=(r'$y={theta0:.2}{op}{theta1:.2}x$'.format(\n",
"                theta0=float(theta[0][0]),\n",
"                theta1=(float(theta[1][0]) if theta[1][0] >= 0 else float(-theta[1][0])),\n",
"                op='+' if theta[1][0] >= 0 else '-')))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
"# Load the data (flats) using the pandas library\n",
|
||
"\n",
|
||
"alldata = pandas.read_csv('data_flats_with_outliers.tsv', sep='\\t',\n",
|
||
" names=['price', 'isNew', 'rooms', 'floor', 'location', 'sqrMetres'])\n",
"# Feature (area) first, target (price) last, to match the slicing below\n",
"data = np.matrix(alldata[['sqrMetres', 'price']])\n",
|
||
"\n",
|
||
"m, n_plus_1 = data.shape\n",
|
||
"n = n_plus_1 - 1\n",
|
||
"Xn = data[:, 0:n]\n",
|
||
"\n",
|
||
"Xo = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n + 1)\n",
|
||
"yo = np.matrix(data[:, -1]).reshape(m, 1)\n",
|
||
"\n",
|
||
"Xo /= np.amax(Xo, axis=0)\n",
|
||
"yo /= np.amax(yo, axis=0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmwAAAFoCAYAAADq7KeuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAcJUlEQVR4nO3dfbBkZ10n8O9vJkOik9EISQjkBbQyC6IrAa9JkNRuYMWFKZboLu7E3ZKIWQIKFi/KGrVKFP9YastXJJCKvGZXcXwDUsUARgqNKTaGSUyAvOBMAco4kYSXDTcDiZnMs390z2a46Zn0vbdv93N7Pp+qW919znNO/+65p3u+85xznlOttQAA0K8Nsy4AAICjE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMzC2xVdWZVfayq7qiq26rq1SPaVFW9uar2VNUnq+qZs6gVAGCWjpvhex9I8nOttZurakuSm6rq2tba7Ye1eUGSrcOf85K8bfgIAHDMmFkPW2vtrtbazcPni0nuSHL6kmYXJbm6DdyQ5KSqesKUSwUAmKkuzmGrqicneUaSv10y6/QkXzjs9d48MtQBAMy1WR4STZJU1YlJ/izJa1prX1s6e8QiI++lVVWXJbksSTZv3vz9T33qUydaJwDAat10001faq2dstzlZhrYqmpTBmHtD1prfz6iyd4kZx72+owk+0atq7V2VZKrkmRhYaHt2rVrwtUCAKxOVf3DSpab5VWileQdSe5orf3WEZpdk+Qlw6tFz09yb2vtrqkVCQDQgVn2sD07yU8k+VRV3TKc9ktJzkqS1tqVSXYm2ZZkT5KvJ3npDOoEAJipmQW21tr1GX2O2uFtWpJXTqciAIA+dXGVKAAARyawAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6J7ABAHROYAMA6JzABgDQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6J7ABAHROYAMA6JzABgDQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDo3EwDW1W9s6rurqpPH2H+hVV1b1XdMvz5lWnXCAAwa8fN+P3fneQtSa4+Spu/aa29cDrlAAD0Z6Y9bK2165J8ZZY1AAD0bj2cw/asqrq1qj5UVd8z62IAAKZt1odEH83NSZ7UWruvqrYleX+SraMaVtVlSS5LkrPOOmt6FQIArLGue9haa19rrd03fL4zyaaqOvkIba9qrS201hZOOeWUqdYJALCWug5sVXVaVdXw+bkZ1Pvl2VYFADBdMz0kWlXvTXJhkpOram+SNyTZlCSttSuTvDjJT1fVgSTfSHJxa63NqFwAgJmYaWBrrf34o8x/SwbDfgAAHLO6PiQKAIDABgDQPYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6J7ABAHROYAMA6JzABgDQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgczMNbFX1zqq6u6o+fYT5VVVvrqo9VfXJqnrmtGuEri0uJm9/e/ILvzB4XFycdUUArIHjZvz+707yliRXH2H+C5JsHf6cl+Rtw0fg+uuTbduSgweT/fuTzZuT170u2bkzueCCWVcHwATNtIettXZdkq8cpclFSa5uAzckOamqnjCd6qBji4
uDsLa4OAhryeDx0PT77pttfQBMVO/nsJ2e5AuHvd47nAbHth07Bj1roxw8OJgPwNzoPbDViGltZMOqy6pqV1Xtuueee9a4LJix3bsf7llbav/+ZM+e6dYDwJrqPbDtTXLmYa/PSLJvVMPW2lWttYXW2sIpp5wyleJgZrZuHZyzNsrmzcnZZ0+3HgDWVO+B7ZokLxleLXp+kntba3fNuiiYue3bkw1H+Phu2DCYD8DcmOlVolX13iQXJjm5qvYmeUOSTUnSWrsyyc4k25LsSfL1JC+dTaXQmS1bBleDLr1KdMOGwfQTT5x1hQBM0EwDW2vtxx9lfkvyyimVA+vLBRck+/YNLjDYs2dwGHT7dmENYA7Nehw2YDVOPDG59NJZVwHAGuv9HDYAgGOewAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6J7ABAHROYAMA6JzABgDQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA546bdQHAKi0uJjt2JLt3J1u3Jtu3J1u2zLoqACZIYIP17Prrk23bkoMHk/37k82bk9e9Ltm5M7nggllXB8CEOCQK69Xi4iCsLS4OwloyeDw0/b77ZlsfABMjsMF6tWPHoGdtlIMHB/MBmAsCG6xXu3c/3LO21P79yZ49060HgDUjsMF6tXXr4Jy1UTZvTs4+e7r1ALBmBDZYr7ZvTzYc4SO8YcNgPgBzQWCD9WrLlsHVoFu2PNzTtnnzw9NPPHG29QEwMYb1gPXsgguSffsGFxjs2TM4DLp9u7AGMGcENljvTjwxufTSh18vLiZvf7uBdAHmiMAG88RAugBzyTlsMC8MpAswtwQ2mBcG0gWYWwIbzAsD6QLMLYEN5oWBdAHmlsAG88JAugBza+yrRKvqO5JsTXLCoWmttevWoihgBQ4NmLv0KtENGwykC7DOjRXYquq/JXl1kjOS3JLk/CT/J8lz1640YNkMpAswl8btYXt1kh9IckNr7TlV9dQkv7Z2ZQErtnQgXQDWvXHPYbu/tXZ/klTV8a21O5M8Ze3KAgDgkHF72PZW1UlJ3p/k2qr6apJ9a1cWsGKLi4NDom5NBTA3qrW2vAWq/m2Sb0/y4dbav6xJVau0sLDQdu3aNesyYPpG3Zrq0EUHbk0FMHNVdVNrbWG5y409rEdVbayqJyb5XAYXHpy23Dcbsc7nV9VnqmpPVV0+Yv6FVXVvVd0y/PmV1b4nzC23pgKYW+NeJfqzSd6Q5ItJDt37piX5vpW+cVVtTHJFkucl2ZvkE1V1TWvt9iVN/6a19sKVvg/MtcMPf/7zPycPPji63YMPDtq5GAFgXVrOVaJPaa19eYLvfW6SPa21zyZJVf1RkouSLA1swChLD38ed1xy4MDotvffn9zuowWwXo17SPQLSe6d8HufPlzvIXuH05Z6VlXdWlUfqqrvOdLKquqyqtpVVbvuueeeCZcKnRl1+PNIYe2QL0/y/1sATNO4PWyfTfJXVfXBJA8cmtha+61VvHeNmLb0CoibkzyptXZfVW3L4CrVraNW1lq7KslVyeCig1XUBf3bsWPQs7Ycj3vc2tQCwJobt4ftH5Ncm+QxSbYc9rMae5OcedjrM7JkqJDW2tdaa/cNn+9MsqmqTl7l+8L6t3v3wz1r4zjhhORpT1u7egBYU2P1sLXWfi1Jqmpza20Z/0oc1SeSbK2q70zyT0kuTvJfDm9QVacl+WJrrVXVuRkETMd1YOvWZOPG5KGHxmu/aZObvwOsY2P1sFXVs6rq9iR3DF8/vareupo3bq0dSPKqJB8ZrvePW2u3VdUrquoVw2YvTvLpqro1yZuTXNyWO3AczKNt28YLa5s3P3xTePcTBVi3xj2H7XeS/Psk1yRJa+3Wqvo3q33z4WHOnUumXXnY87ckectq3wfmzu/+7qO3qUouuyx54xuFNYB1buyBc1trX1gyacxjMcBELS4mv/M7j9
6uteS3fzv5+MfXviYA1tTYw3pU1Q8maVX1mKr6+QwPjwJTtmPH+OeuJcmLXuQuBwDr3LiB7RVJXpnBOGl7k5yT5GfWqijgKP7u75YX2A4cGIQ8ANatcQPbbyZ5VWvt8a21U5P8bJLfWLuygCP62MeW1/6hh5Ibb1ybWgCYinED2/e11r566MXw+TPWpiTgiBYXkzvvXP5yV12VvHVVF3YDMEPjBrYNVfUdh15U1WMz/hWmwKS85z2DiwlW4pWvHNwgHoB1ZzmHRD9eVb9eVW9M8vEk/3PtygJG+sAHVrf85ZdPpg4ApmrcOx1cXVW7kjw3g3uA/sfW2u1rWhnwSHv3rm75231sAdajsQ9rDgOab3uYpeXe8H2pTZsmUwcAUzX2wLlAB+6/f3XLP/OZk6kDgKkS2GC9WFxc3SHRxzwmOeecydUDwNQIbLAeXH99cvrpqzskevzxyfbtk6sJgKkR2KB3i4vJtm2Dx5U6/vhk5043gQdYp4ylBr3bsWN1PWsnnJB87nPJaadNriYApkoPG/Ru9+5k//6VLbt5c3LttcIawDonsEHvtm4dBK+VuOii5IILJlsPAFMnsEHvtm9PNqzwo/rZz062FgBmQmCD3m3ZMrhgYCVOOmmytQAwEwIbrAdPf/rKlvvrv07uu2+ytQAwdQIbrAevfe3KlnvwwcFVpgCsawIbrAfveMfKljtwILnttkdO37cvueSS5LzzBo/79q2uPgDWlMAG8+5tbxvcKeGQt751cNeEq69Obrxx8Hj66YPpAHSpWmuzrmHiFhYW2q5du2ZdBkxO1eqW37Jl0Iv2ta8NwtmR3HWXMdsA1lBV3dRaW1jucnrY4Fhw8ODgXLZf/MWjt7v88unUA8CyuDUV9G4S55ft35/s2ZPceefR233mM6t/L1iJxcXBfyp27x4MFr19+6BnGEgisEH/Xv3q1a9j48bk7LMH4e/GG4/c7ilPWf17wXJdf32ybdugJ3j//sGdPV73usH4g+7UAUmcwwb9e9zjkq98ZfXrWVx0Dhv9WVwc7JOLi4+cd+jcyxNPnH5dsEacwwbz6oEHVr+O7/3ewT96T3xicsUVo9tccYWwxvTt2DHoWRvl0LmXgMAG3Ttaj9i4Lrzw4ec/8zODnrRLLknOP3/weNddg+kwbbt3Dw6DjnLo3EvAOWzQvZe9LHn961e+fFVyzjnfPO2005J3v3tVZcFEbN06OGdtVGjbvHlw7iWghw269/KXr275jRsHV9xBj7ZvTzYc4Z+iDRvsuzAksEHvRp2MvRx/+qdO2qZfW7YMrgbdsmXQo5YMHg9Nt+9CEodEoX+PNtjt0TzpSclFF02uFlgLF1wwuBp0x47BOWtnnz3oWRPW4P8T2KB3jzbY7dHcd9/k6oC1dOKJaT/1U980qR185LBTowaiGjU81eh2I6aNaDnuaFdL2427rkn/DqMarrSWSW/LMSet+G8z9rZc4d903DqWU8tKzWVg+/yX9ucn3/Xw4KCT3rCr+oOOsey49U7yQzpot7IP6uja1v4LdJJ/12l8SMf/Gy5xwWvSnv7NwauNuLVoyyMntqrkTR9NNjw8b9xtOcokv8zH35YjKxnzPUetb4X7+ST/pquoY9BuZf+Q9fKdASzfXAa2xQcO5K8+c8+sy4DJ2PRtyWO/beXL/9/7J1cLTFmN+M/JiEmpEQ1Htxu1vvHeZJz1jVrXpH+H8WtbWS2r2ZajWo7/+49q9+jrG/d3H2Xke66wjnFr+YexKnukuQxsT37ct+bKn/yBb544ow/9uH/QpRMn/qFfxQdmnA/grLblKCv+0E/8QzqyuuWv7zd+I/n933/kl+qIbosa0b9RL3958vr/Pnixa1dy6U+lHnoo+cY3km/5lsGVeO96V7LwyIG3R3+ZTe7vutLPx6h1LWd9K/0dxv6HvaPvm0mua+K/w7gfapgjtcJRmuYysG05YVOe89RTZ10GTMbWM5KvrvAG8Mcfn9z6t8n7/nBwr8Yfe+
E3X3V67/DxxS90CyCAjhnWA3q3mnGoHngg+fCHk9e8Jvmu70oefHB0O7cAAujaXPawwVzZsmX16zjSrX8On+8WQADd0sMGvVvtwLnjcAsggK4JbNC7aRyqdAsggK45JAq92717cus64YTB48aNg8OgmzcPwppbAAF0baaBraqen+R3k2xM8vbW2puWzK/h/G1Jvp7kJ1trN0+9UJilrVsnt65Nm5K///vkgx90CyCAdWRmga2qNia5IsnzkuxN8omquqa1dvthzV6QZOvw57wkbxs+wrFj+/bkZS9b/nIbNgzGWVvak3baacmll06+TgDWzCx72M5Nsqe19tkkqao/SnJRksMD20VJrm6D+5/cUFUnVdUTWmt3Tb9cmJGVXiV68cXJc5+rJw1gDswysJ2e5AuHvd6bR/aejWpzepJHBLaquizJZUly1llnTbRQWJd+8zcHvWkArHuzvEp01D1Jlt5XZ5w2g4mtXdVaW2itLZxyyimrLg66snHj8tqfd56wBjBHZhnY9iY587DXZyRZev+dcdrA/PuTP1le+/e/f23qAGAmZhnYPpFka1V9Z1U9JsnFSa5Z0uaaJC+pgfOT3Ov8NY5JP/qj418tesUVetcA5szMzmFrrR2oqlcl+UgGw3q8s7V2W1W9Yjj/yiQ7MxjSY08Gw3q8dFb1wswdGo7jhS/85unHHZc89anJ939/8qY3CWsAc6gGF2DOl4WFhbZr165ZlwEA8E2q6qbW2sJyl3NrKgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6J7ABAHROYAMA6JzABgDQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6d9ws3rSqHptkR5InJ/l8kv/cWvvqiHafT7KY5KEkB1prC9OrEgCgD7PqYbs8yUdba1uTfHT4+kie01o7R1gDAI5VswpsFyV5z/D5e5L8yIzqAADo3qwC2+Nba3clyfDx1CO0a0n+oqpuqqrLjrbCqrqsqnZV1a577rlnwuUCAMzOmp3DVlV/meS0EbN+eRmreXZrbV9VnZrk2qq6s7V23aiGrbWrklyVJAsLC23ZBQMAdGrNAltr7YeONK+qvlhVT2it3VVVT0hy9xHWsW/4eHdVvS/JuUlGBjYAgHk1q0Oi1yS5ZPj8kiQfWNqgqjZX1ZZDz5P8cJJPT61CAIBOzCqwvSnJ86pqd5LnDV+nqp5YVTuHbR6f5PqqujXJjUk+2Fr78EyqBQCYoZmMw9Za+3KSfzdi+r4k24bPP5vk6VMuDQCgO+50AADQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6J7ABAHROYAMA6JzABgDQOYENAKBzAhsAQOcENgCAzglsAACdE9gAADonsAEAdE5gAwDonMAGANA5gQ0AoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA6V621WdcwcVV1T5J/mHUdx5CTk3xp1kUcQ2zv6bK9p8v2ni7be/qe0lrbstyFjluLSmattXbKrGs4llTVrtbawqzrOFbY3tNle0+X7T1dtvf0VdWulSznkCgAQOcENgCAzglsTMJVsy7gGGN7T5ftPV2293TZ3tO3om0+lxcdAADMEz1sAACdE9hYtqp6bFVdW1W7h4/fcYR2n6
+qT1XVLSu9KuZYVlXPr6rPVNWeqrp8xPyqqjcP53+yqp45izrnxRjb+8Kqune4P99SVb8yizrnQVW9s6rurqpPH2G+fXuCxtje9u0Jqqozq+pjVXVHVd1WVa8e0WbZ+7jAxkpcnuSjrbWtST46fH0kz2mtneOy8eWpqo1JrkjygiRPS/LjVfW0Jc1ekGTr8OeyJG+bapFzZMztnSR/M9yfz2mtvXGqRc6Xdyd5/lHm27cn6905+vZO7NuTdCDJz7XWvjvJ+UleOYnvb4GNlbgoyXuGz9+T5EdmWMu8OjfJntbaZ1tr/5LkjzLY7oe7KMnVbeCGJCdV1ROmXeicGGd7MyGtteuSfOUoTezbEzTG9maCWmt3tdZuHj5fTHJHktOXNFv2Pi6wsRKPb63dlQx2zCSnHqFdS/IXVXVTVV02termw+lJvnDY67155Ad+nDaMZ9xt+ayqurWqPlRV3zOd0o5J9u3ps2+vgap6cpJnJPnbJbOWvY/P5Z0OWL2q+sskp42Y9cvLWM2zW2v7qurUJNdW1Z3D/+nx6GrEtKWXdI/ThvGMsy1vTvKk1tp9VbUtyfszOJzB5Nm3p8u+vQaq6sQkf5bkNa21ry2dPWKRo+7jetgYqbX2Q6217x3x84EkXzzUdTt8vPsI69g3fLw7yfsyOOzEePYmOfOw12ck2beCNoznUbdla+1rrbX7hs93JtlUVSdPr8Rjin17iuzbk1dVmzIIa3/QWvvzEU2WvY8LbKzENUkuGT6/JMkHljaoqs1VteXQ8yQ/nGTkFUqM9IkkW6vqO6vqMUkuzmC7H+6aJC8ZXm10fpJ7Dx2qZtkedXtX1WlVVcPn52bw/fnlqVd6bLBvT5F9e7KG2/IdSe5orf3WEZotex93SJSVeFOSP66qS5P8Y5IfS5KqemKSt7fWtiV5fJL3Db8Djkvyh621D8+o3nWntXagql6V5CNJNiZ5Z2vttqp6xXD+lUl2JtmWZE+Sryd56azqXe/G3N4vTvLTVXUgyTeSXNyMPL4iVfXeJBcmObmq9iZ5Q5JNiX17LYyxve3bk/XsJD+R5FNVdctw2i8lOStZ+T7uTgcAAJ1zSBQAoHMCGwBA5wQ2AIDOCWwAAJ0T2AAAOiewAYxQVecMR31f6fIfn2Q9wLFNYAMY7ZwMxkl6hKp61DEsW2s/OPGKgGOWcdiAuTW88fKHk1yf5PwktyZ5V5JfS3Jqkv+a5LYkv5fkX2cwyPOvJvlQBgNafkuSf0ryP5J8d5InJnlyki9lMBDm/0qyefh2r2qtfbyq3pjkRcNpj01yU2vtR9fslwSOCQIbMLeGgW1PkmdkEMw+kUFouzSDUPXSJLcnub219r+r6qQkNw7b/1iShdbaq4br+tUk/yHJBa21b1TVtyY52Fq7v6q2Jnlva23hsPfenOS6JK9trV03hV8XmGNuTQXMu8+11j6VJFV1W5KPttZaVX0qg96yM5K8qKp+ftj+hAxvITPCNa21bwyfb0rylqo6J8lDSf7VkrbvSvJuYQ2YBIENmHcPHPb84GGvD2bwHfhQkv/UWvvM4QtV1Xkj1rX/sOevTfLFJE/P4Hzg+w9b9peTfL219nurrh4gLjoA+EiSn62qSpKqesZw+mKSLUdZ7tuT3NVaO5jBjZ43DpfflsHh1lesWcXAMUdgA451v57B4c1PVtWnh6+T5GNJnlZVt1TV9hHLvTXJJVV1QwaHQw/1vr0+yWlJbhgu+9trWz5wLHDRAQBA5/SwAQB0TmADAOicwAYA0DmBDQCgcwIbAEDnBDYAgM4JbAAAnRPYAAA69/8AXoLYKINHTj4AAAAASUVORK5CYII=\n",
|
||
"text/plain": [
|
||
"<Figure size 691.2x388.8 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"fig = plot_data(Xo, yo, xlabel='area', ylabel='price')\n",
|
||
"theta_start = np.matrix([0.0, 0.0]).reshape(2, 1)\n",
|
||
"theta, logs = gradient_descent(cost, gradient, theta_start, Xo, yo, alpha=0.01)\n",
|
||
"plot_regression(fig, h_linear, theta, Xo)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"source": [
"In the example above, the outlier appears as a single point on the right-hand side of the plot. We can see that the fitted regression line, instead of reflecting the overall trend, tries to “fit” this single observation.\n",
"\n",
"Such an observation should therefore be removed from the data set (see below)."
|
||
]
|
||
},
|
||
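The same effect can be reproduced on synthetic data, without the flats data set. The sketch below fits a straight line with `np.polyfit` (ordinary least squares, not the gradient descent used in this notebook) to made-up points that follow y = 5x + 50 plus noise, and then adds one implausible observation; all numbers are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "area vs. price" data following a clear linear trend
x = np.linspace(20, 100, 40)
y = 5.0 * x + 50.0 + rng.normal(scale=10.0, size=x.size)

# Least-squares line fitted to the clean data
slope_clean, intercept_clean = np.polyfit(x, y, deg=1)

# Add a single outlier: an absurdly large area with an ordinary price
x_out = np.append(x, 1000.0)
y_out = np.append(y, 300.0)
slope_out, intercept_out = np.polyfit(x_out, y_out, deg=1)

print('slope without outlier:', slope_clean)
print('slope with outlier:   ', slope_out)
```

A single extreme point is enough to flatten the fitted line almost completely, because squared errors give distant observations a very large influence.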
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
"# Discard the outliers (row-by-row filter; produces a list of (index, row) tuples)\n",
"alldata_no_outliers = [\n",
"    (index, item) for index, item in alldata.iterrows() \n",
"    if item.price > 10000 and item.sqrMetres < 1000]\n",
"\n",
"# Alternatively, filter with a boolean mask; the result stays a DataFrame\n",
"alldata_no_outliers = alldata.loc[(alldata['price'] > 10000) & (alldata['sqrMetres'] < 1000)]"
|
||
]
|
||
},
|
||
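Fixed thresholds such as the ones above require knowing the data in advance. A common generic alternative, which the original notebook does not use, is the interquartile-range (IQR) rule. The sketch below is one possible implementation; the function name `drop_outliers_iqr`, the multiplier `k=1.5`, and the toy data frame are assumptions.

```python
import pandas

def drop_outliers_iqr(df, columns, k=1.5):
    """Keep only rows whose values lie within [Q1 - k*IQR, Q3 + k*IQR] in every given column."""
    mask = pandas.Series(True, index=df.index)
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]

# Toy data: one price given in thousands (450) and one implausible area (4600)
flats = pandas.DataFrame({
    'price': [250000, 310000, 299000, 420000, 385000, 450, 270000],
    'sqrMetres': [42, 55, 50, 78, 70, 60, 4600],
})

cleaned = drop_outliers_iqr(flats, ['price', 'sqrMetres'])
print(cleaned)
```

The IQR rule is only a heuristic: on heavily skewed data such as real-estate prices it may also discard legitimate expensive listings, so the result should be inspected before it is used for training.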
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "notes"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
"# Feature (area) first, target (price) last, to match the slicing below\n",
"data = np.matrix(alldata_no_outliers[['sqrMetres', 'price']])\n",
|
||
"\n",
|
||
"m, n_plus_1 = data.shape\n",
|
||
"n = n_plus_1 - 1\n",
|
||
"Xn = data[:, 0:n]\n",
|
||
"\n",
|
||
"Xo = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n + 1)\n",
|
||
"yo = np.matrix(data[:, -1]).reshape(m, 1)\n",
|
||
"\n",
|
||
"Xo /= np.amax(Xo, axis=0)\n",
|
||
"yo /= np.amax(yo, axis=0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "subslide"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmwAAAFoCAYAAADq7KeuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3de5BcZ3nn8e8zmtFtZmSbsoxsWeayaO2YXWxYxZjgWi4JBKvYOAvEgs3GCsuW17GcNbdUTLIFhCSLUxtMAsgYr+PY3k3AW4GAKwgIIUk5TkJAJjbgu+IkWEjYigHP6K6Rnv3j9Eg9Mz0zPZfu82rm+6nq6tPnnO5+Z1ot/fSe533fyEwkSZJUrp66GyBJkqSpGdgkSZIKZ2CTJEkqnIFNkiSpcAY2SZKkwhnYJEmSCldbYIuIdRHxFxHxUEQ8EBHXtjgnIuIjEbEjIr4ZES+po62SJEl16q3xvUeAd2XmNyJiELg3Ir6cmQ82nXMpsL5xeynw8ca9JEnSolFbD1tm7s7MbzS2h4GHgLXjTrsMuCMrXwVOjYgzu9xUSZKkWhVRwxYRzwVeDPzduENrgSeaHu9kYqiTJEla0Oq8JApARAwAnwbenplD4w+3eErLtbQi4krgSoD+/v5/d955581rOyVJkubq3nvv/ZfMXD3T59Ua2CKijyqs/UFmfqbFKTuBdU2PzwZ2tXqtzLwZuBlgw4YNuX379nlurSRJ0txExD/P5nl1jhIN4PeAhzLzhklOuwu4ojFa9GLgmczc3bVGSpIkFaDOHraXAz8HfCsi7mvs+xXgHIDMvAnYBmwEdgD7gbfW0E5JkqRa1RbYMvMeWteoNZ+TwJbutEiSJKlMRYwSlSRJ0uQMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVLhaA1tE3BoRT0XEtyc5/sqIeCYi7mvc3tvtNkqSJNWtt+b3vw34GHDHFOf8VWa+vjvNkSRJKk+tPWyZeTfw/TrbIEmSVLqToYbtZRFxf0R8ISJeONlJEXFlRGyPiO179uzpZvskSZI6qvTA9g3gOZl5AfBR4LOTnZiZN2fmhszcsHr16q41UJIkqdOKDmyZOZSZexvb24C+iDi95mZJkiR1VdGBLSLWREQ0ti+iau/T9bZKkiSpu2odJRoRnwReCZweETuB9wF9AJl5E/Am4BciYgQ4ALw5M7Om5kqSJNWi1sCWmW+Z5vjHqKb9kCRJWrSKviQqSZIkA5skSVLxDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUu
EMbJIkSYWrNbBFxK0R8VREfHuS4xERH4mIHRHxzYh4SbfbKKlmw8Nwyy3wy79c3Q8P190iSeq63prf/zbgY8Adkxy/FFjfuL0U+HjjXtJicM89sHEjHDsG+/ZBfz+8852wbRtcckndrZOkrqm1hy0z7wa+P8UplwF3ZOWrwKkRcWZ3WiepVsPDVVgbHq7CGlT3o/v37q23fZLURaXXsK0Fnmh6vLOxT9JCd+edVc9aK8eOVcclaZEoPbBFi33Z8sSIKyNie0Rs37NnT4ebJanjHnvsRM/aePv2wY4d3W2PJNWo9MC2E1jX9PhsYFerEzPz5szckJkbVq9e3ZXGSeqg9eurmrVW+vvhBS/obnskqUalB7a7gCsao0UvBp7JzN11N0pSF2zaBD2T/BXV01Mdl6RFotZRohHxSeCVwOkRsRN4H9AHkJk3AduAjcAOYD/w1npaKqnrBger0aDjR4n29FT7BwbqbqEkdU2tgS0z3zLN8QS2dKk5kkpzySWwa1c1wGDHjuoy6KZNhjVJi07d87BJ0tQGBuBtb6u7FZJUq9Jr2CRJkhY9A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYXrrbsBknRSGB6GO++Exx6D9eth0yYYHKy7VZIWCQObJE3nnntg40Y4dgz27YP+fnjnO2HbNrjkkrpbJ2kR8JKoJE1leLgKa8PDVViD6n50/9699bZP0qJgYJOkqdx5Z9Wz1sqxY9VxSeqwti+JRsRpwHpg+ei+zLy7E42SpGI89tiJnrXx9u2DHTu62x5Ji1JbgS0i/itwLXA2cB9wMfC3wKs71zRJKsD69VXNWqvQ1t8PL3hB99skadFp95LotcCPAv+cma8CXgzs6VirJKkUmzZBzyR/Vfb0VMclqcPaDWwHM/MgQEQsy8yHgXM71yxJKsTgYDUadHCw6lGD6n50/8BAve2TtCi0W8O2MyJOBT4LfDkifgDs6lyzJKkgl1wCu3ZVAwx27Kgug27aZFiT1DWRmTN7QsQrgFOAL2bm4Y60ao42bNiQ27dvr7sZkiRJY0TEvZm5YabPm8ko0SXAs4F/bOxaA3xnpm8oSZKkmWl3lOgvAu8DngRGJyRK4EUdapckqRQuyyXVrt0etmuBczPz6U42RpJUGJflkorQ7ijRJ4BnOtkQSVJhXJZLKka7PWyPA38ZEZ8HDo3uzMwbOtIqSVL92lmW621v626bpEWq3cD2ncZtaeMmSVroXJZLKkZbgS0zfw0gIvozc5JvryRpQXFZLqkYbdWwRcTLIuJB4KHG4wsi4saOtkySVC+X5ZKK0e6gg98BfhJ4GiAz7wf+facaJUkqgMtyScVoe+LczHwiIpp3HZ3/5kiSiuKyXFIR2g1sT0TEjwEZEUuB/07j8qgkaYEbGHA0qFSzdi+JXgVsAdYCO4ELgas71ShJkiSd0G4P24eAazLzBwARcVpj33/pVMMkSQVxeSqpVu0GtheNhjWAzPxBRLy4Q22SJJXE5amk2rV7SbSn0asGQEQ8ixkMWJhMRLwuIh6JiB0RcV2L46+MiGci4r7G7b1zfU9J0gy4PJVUhJlcEv2biPgjIIHLgd+cyxtHxBJgK/Aaqrq4r0fEXZn54LhT/yozXz+X95IkzZLLU0lFaHelgzsiYjvwaiCAN7QIVjN1EbAjMx8HiIhPAZcBc31dSYuZtVbzy+WppCLMZB62B5nfMLUWeKLp8U7gpS3Oe1lE3A/sAt6dmQ/MYxskLSSdrLXatQve8x54+GE47zz44AfhrLPmp90lW7du6uNnn92ddkiLXLs1bJ0QLfbluMffAJ6TmRcAHwU+O+mLRVwZEdsjYvuePXvmsZmSTg
qdrLW68UZYuxbuuAO+9rXqfu3aar8kdUGdgW0n0Pxft7OpetGOy8yhzNzb2N4G9EXE6a1eLDNvzswNmblh9erVnWqzpFK1U2s1G7t2wZYtrY9t2QLf+97sXvdk8cQTUx/fubM77ZAWuTmP9JyDrwPrI+J5wHeBNwP/qfmEiFgDPJmZGREXUQXMp7veUkllaq5Xu//+ztRavec9Ux+/7jq47bbZvfbJYP366tJyq99tf3+1VJW0yB05eozhgyMMHThS3R88wtCBIwwdPHJ8/1DjfrZqC2yZORIR1wBfApYAt2bmAxFxVeP4TcCbgF+IiBHgAPDmzBx/2VTSYjS+Xm3p0snPnUuwePjhqY8/8sjsXvdksWlTVQfYSk9PdVw6yR08crQRskYYPngiXI0NXJMf33+488ur19nDNnqZc9u4fTc1bX8M+Fi32yVpHnVi1GZzvdqow4cnP38uweK886q6tcmce+7sXne2uj0KdnCwGrQxfjBHT0+130XgVbPMZP/h8YGr2m4VuIYagWt4dN/BEQ6PTFJO0aaegFUr+hhc3suq5X3VbUUvg03bq5ZXxy//rdm9RyzEDqsNGzbk9u3b626GpFajNkf/oZ/LqM1bboG3v33yS6DLlsGhQ/Pzfrt2VQMMJrN7N6xZM7vXnqmpfp8XXNDZILd3b/X6O3ZUvZWbNhnWNC+OHUuGD7W6nNgIXwdO7Dt+fFxv19Fjc8syfUuCU1b0NQJWL6tW9B0PWNV279hANu54/9IlRLQaSzlRRNybmRtm2sZae9gkLWCtesFGA9YrXlH9g//bvz27qTGmmhsM4NWvrgLMfASLs86CrVtbDzzYurV7YW2q3+dP/mQV3DI7t3TUwIAT5KqlmdRvDR2cGL72Hhphrn1HK/qWNPVotQpcfS2Pj24v6+1pO3DVxcAmqTOmG7X5yU9Wt61b4eqrZ/ba0xXCv/GNcPnlVRt+/ddP9DiNtmumvVBXXw1veEM1wOCRR6rLoNdf372wBlP/PvfvH/t49PfymtfAVVfBC1/oBMKaVAn1W4PLeqcNV+OPj+4bXN7H0t46J73oDgObpPbNpH5qul6wUVu2VGFouvDT/N7r1sFk/xvu6amOr1079tLhtddWPVA9PbPrhVqzpt7RoO3+PpsdPAi/8zsu1r6ANddvTReumnu5hpt6u7pZvzU+cK1a3sfA8l6W9JTdu1UCA5uk9sx0FYF166qRm1MNBhg13dQYrd47E1asGBvAenrgj/4I3vSm1pcOm43u27ixqlMrvR5rql7F6ZxsP+siUkr91om6rM7Wb2n2DGySpjdV/dSP/zh86EOwefOJ3rZ77qnmL2snrAE8OMWqd1O998BAdWly584T9Wqf+tTklw5bOVkWMJ9qeo12dftnXQTLeY3Wb00arrpUvzUxXE0dvlY1ha+ToX5LBjZJ7ZiqfurwYXjXu+Dd765qx172suof6ZksBfX3f1+FvFY9dVO9dyYsX14FgVEzvXR4sixg3jy9xsgIHDhQXRZesqS6HTo0/Wt082e98caxAzVGl/SaTc1iB43Wb40vjp+ueH70eF31W6PHF0v9lgxskppNVqM2XQga7Un7wz+sLkm227M2amSkGtl5ww1je+pg6vduFUBmeunwZJqt/5JL4P3vrwIyVIF1ZKS6LV0KfX1T/9zd+FmHh+ETn4Bf+qXWx9utWWxDKfVbg+NqsqYdmWj9lmbBedgkVVrViUVUvSH33w9/+Zft9eLMxbJlVfBoroubas61/n743d8de4lveLgacNB8CXUqg4MnT13XdHPCfehD8N3vVr1YrT6rTv+so3+G9u+Ho1P0PG3eDLfddrx+a/LLiaMTnzZPeDq2t6sT9Vtja7V6xwayccet39JMzXYeNgObpJmHnE5rDha7dsHznz+zADLZIIXxo0TnYxLfbtq8ubqsONXx226bfsLieVotYUz91tPPMHTZGxk61sPwsn6GlvcztKzpNvp4+QDDq05j6NTTrd
/SouTEuZJmb6o6sTocPQqvfS388IdVqFiyZOzx5cury3+TLY10ySVVkBs/Mz+c3LP1t7uu6WQ//8DAmDB38OBhhk9bzdAH/hdDWz/B0LkvnNDbNaP6rcv+R/s/y8ERAAaW9c5oZKL1W1qsDGySZjfHVyft3w9/+7cnHo+MTDzn0UenroMa7VE7duzE9uBg+aNBp9K0rmkC+/uWj+3NetErGL7vuyfC1Zk/xtBpF1Xh684HGNp3iOEHHmboihsZWt7P4d6lJ177r/fBX0+xZmoLY+q3nn6KVd95nMFD+1l1aC+rDu5j1aF9DB7ad3x71cHq8Sl3/zmr1p1l/ZY0AwY2aaEbHobbb4c/+ZPq8etfP7Gwfy5zfNVhyRL4/OdPhK/xl/jOOacasXrkSHUpddkyeMc74AtfKOby56zqt160meErLzl+afFoz5KJL/yp+6Z+49NO1MD1HT1yIkwdOcCq561j8F89Z3b1W7fcAlv/5/R/hrZuhXOfM8PfliRr2KSF7J57qkuLBw6M3b9yJXzpSyfCS2k1bO143evgRS+qtm+88cQ6mitXTlyqadTKlfDkk/NyGXRkdP3EycLVmHm3RiaMZJyP+q3lRw6eCFxnncHgc9dNXTx/682c8nufOP6cZSOHGdO/dd11Y6dImYmp/gxFwFveUg2K6OZyXlKBrGGTNNbwMFx66cSwBlWgufRS2L27Ci+Dg2Oni+iE3t7WlzZn68//HL74xYn7Jwtro8duvx22bOHgkaPtzyzf4vh8zL810/qt48eHvs/gB97L0kcentm6puvXwMEfTD7idi5TfjTPEzfZYAdJs2YPm7QQPfpoFcgef3zycyJg1So4/3z42Z+Fa67pbJuWLatC2zxcdm1Zv7Wsf+zj5QMMLVvZtN3P8LKVDPWfytCyfg7H3IrVx8+/1TJc3ft3rPqD28fWbx3ax6rfeD8DV/+37tdvTdULNl9Tfuzde3IP7JA6zGk9mhjYtKi9853w4Q/X3YoTli6tbp/+9PE1Po8R7F26ohGuBhha3ghTje0J4WvZQHW8KYy1rN+agb6jI1WQWn0aq04bnFirtazRm3U8kI09Pu38W9PNmbZ7dz2XB6eb8kNSRxnYmhjYtGg9+mh1iazDRqJnQm/W8cC1YoChpSur+76VDK0crALYGWcx1LeCocNH2bt0BTnHHq4x9VuTjEYcPbbq4F4GD+3nlKbRi8frtzo1mWy7c6bVwV4wqTbWsEmCn//5tk47uKRvTOAa35s1XY/X/qUrZt/GZdXdQNP0D9VUEFWwOh6+mgLY2DBWha+lx+apHq5TC6K3O2daHQYGTu7pTaRFyMAmnWQykwNHjrYujl/6HIYuPnvizPKjtVyNADZm/q1Z6Dl2dEzIOr49JoBNnIvrlMb5A4cPsCQLmai3UwuiN82Z1lIXekIlLRwGNqnLjh1L9h5uTO/QYrqHEwtXjzveNFpx0vUTL/7PbbWhef6t6Xqzqn37j2+vOrSP/sMHWDDTnXZqQfQPfnDqS6LXXz//7ylpwTKwSTM0fv6t8eFqqGX4muf5t/p6Wk5kuurQPgZvu6VlbdcpTb1fy0cOLZzANVc9PSeWrZpPZ51VTRK7ZcvEY1u3Oh+ZpBkxsGnROTQyyeXESSY/HX98vuffmjC56RQzy7e1fuI3PwMfvn3ObVywRueDax4d2amC+6uvhje8oZqQ9pFHZjZnmiQ1MbDppDK+fmu6meVbBa7DI3OrnWqef2tw2fThavT4KY35uTq+fuINN1TTaPzWb3XuPZpFMOcuw27p76+mFjnzzO6Njlyzpr7RoJIWDKf1UFeNr98af7lwQi3XoYnha2Sy+q029S2JFgGrxeSnxwPZ2OP9S3vpKX3B6uFhOP10OHx44rH5XnFg8+bq9e68c/rXHe3Vuv76qtdpdC6wFStar8gwlRUrqjVFR5ek6u8/sdj7wYOtn9OpKTwkqU3Ow9bEwNY5JdVvDTYt6TPd5cRTmiY/Xd7XM/
WEpwvFPfdU620ePAhHj1bhZvnyajmnz3xmfibXbQ5Au3fD858/eVi6/PJqXdPRXq3xc4E980zrpbHe8Y5qYfFWE71eeOHE+cQAfuM3qp8volr83clhJRXCwNbEwDa55vqttkYmHjjSsfqtyWu1Ws8sP7i8l8HlvSzrndsM94vKVBOk/sM/wBVXwIMPwg9/eOI5vb1VwDvttKrH6siRKihlnrj8OVkAGj+Lfl9fFRQ/97kqrE3ne99rXe81m4lenRxWUoEMbE0WamCbaf1Wq0B2qMb6rcHlvQws66V3ydxmuFcHtBNu2g1ABiVJmpSBrUmpga2E+q3enjgenk5cTpwYvk7q+i1Jkgrl0lRd0Fy/1epyYevw1Z36rZbF843jp6xo9HYtpvotSZIWkEUV2OZSvzV88Aj7rN+SJEk1WJCB7Tvf388Vt35t3uu3Ihg39cPk4WqV9VuSJGmeLMjA9syBI9z96J4J+9ut35owF9fyqoje+i1JklSHBRnY1p22kv/91h+1fkuSJC0ICzKwnbqyj1ede0bdzZAkSZoXFlRJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVrtbAFhGvi4hHImJHRFzX4nhExEcax78ZES+po52SJEl1qi2wRcQSYCtwKXA+8JaIOH/caZcC6xu3K4GPd7WRkiRJBaizh+0iYEdmPp6Zh4FPAZeNO+cy4I6sfBU4NSLO7HZDJUmS6lRnYFsLPNH0eGdj30zPASAiroyI7RGxfc+ePfPaUEmSpDrVGdiixb6cxTnVzsybM3NDZm5YvXr1nBsnSZJUijoD205gXdPjs4FdszhHkiRpQaszsH0dWB8Rz4uIpcCbgbvGnXMXcEVjtOjFwDOZubvbDZUkSapTb11vnJkjEXEN8CVgCXBrZj4QEVc1jt8EbAM2AjuA/cBb62qvJElSXWoLbACZuY0qlDXvu6lpO4Et3W6XJElSSVzpQJIkqXAGNkmSpMIZ2CRJkgpnYJMkSSqcgU2SJKlwBjZJkqTCGdgkSZIKZ2CTJEkqnIFNkiSpcAY2SZKkwhnYJEmSCmdgkyRJKpyBTZIkqXAGNkmSpMIZ2CRJkgpnYJMkSSqcgU2SJKlwBjZJkqTCGdgkSZIKZ2CTJEkqnIFNkiSpcAY2SZKkwhnYJEmSCmdgkyRJKpyBTZIkqXAGNkmSpMIZ2CRJkgpnYJMkSSqcgU2SJKlwBjZJkqTCGdgkSZIKZ2CTJEkqnIFNkiSpcAY2SZKkwhnYJEmSCmdgkyRJKpyBTZIkqXAGNkmSpMIZ2CRJkgpnYJMkSSqcgU2SJKlwBjZJkqTCGdgkSZIKZ2CTJEkqnIFNkiSpcAY2SZKkwhnYJEmSCtdbx5tGxLOAO4HnAv8EXJ6ZP2hx3j8Bw8BRYCQzN3SvlZIkSWWoq4ftOuArmbke+Erj8WRelZkXGtYkSdJiVVdguwy4vbF9O/DTNbVDkiSpeHUFtmdn5m6Axv0Zk5yXwJ9GxL0RcWXXWidJklSQjtWwRcSfAWtaHPrVGbzMyzNzV0ScAXw5Ih7OzLsneb8rgSsBzjnnnBm3V5IkqVQdC2yZ+ROTHYuIJyPizMzcHRFnAk9N8hq7GvdPRcQfAxcBLQNbZt4M3AywYcOGnGv7JUmSSlHXJdG7gM2N7c3A58afEBH9ETE4ug28Fvh211ooSZJUiLoC2/XAayLiMeA1jcdExFkRsa1xzrOBeyLifuBrwOcz84u1tFaSJKlGtczDlplPAz/eYv8uYGNj+3Hggi43TZIkqTiudCBJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYUzsEmSJBXOwCZJklQ4A5skSVLhDGySJEmFM7BJkiQVzsAmSZJUOAObJElS4Qxski
RJhTOwSZIkFc7AJkmSVDgDmyRJUuEMbJIkSYWLzKy7DfMuIvYA/1x3O2p2OvAvdTdikfMzKIOfQ/38DMrg51CGczNzcKZP6u1ES+qWmavrbkPdImJ7Zm6oux2LmZ9BGfwc6udnUAY/hzJExPbZPM9LopIkSYUzsEmSJBXOwLZw3Vx3A+RnUAg/h/r5GZTBz6EMs/ocFuSgA0mSpIXEHjZJkqTCGdgWiIh4VkR8OSIea9yfNsl5/xQR34qI+2Y7UkVjRcTrIuKRiNgREde1OB4R8ZHG8W9GxEvqaOdC18bn8MqIeKbxZ/++iHhvHe1cyCLi1oh4KiK+Pclxvwsd1sZn4PegwyJiXUT8RUQ8FBEPRMS1Lc6Z8XfBwLZwXAd8JTPXA19pPJ7MqzLzQod3z11ELAG2ApcC5wNviYjzx512KbC+cbsS+HhXG7kItPk5APxV48/+hZn5ga42cnG4DXjdFMf9LnTebUz9GYDfg04bAd6VmT8CXAxsmY9/FwxsC8dlwO2N7duBn66xLYvJRcCOzHw8Mw8Dn6L6LJpdBtyRla8Cp0bEmd1u6ALXzuegDsvMu4HvT3GK34UOa+MzUIdl5u7M/EZjexh4CFg77rQZfxcMbAvHszNzN1R/WIAzJjkvgT+NiHsj4squtW7hWgs80fR4JxO/mO2co7lp93f8soi4PyK+EBEv7E7T1MTvQhn8HnRJRDwXeDHwd+MOzfi7sCBXOlioIuLPgDUtDv3qDF7m5Zm5KyLOAL4cEQ83/kem2YkW+8YPvW7nHM1NO7/jbwDPycy9EbER+CzV5Qh1j9+F+vk96JKIGAA+Dbw9M4fGH27xlCm/C/awnUQy8ycy89+0uH0OeHK0O7Vx/9Qkr7Grcf8U8MdUl5I0ezuBdU2PzwZ2zeIczc20v+PMHMrMvY3tbUBfRJzevSYKvwu183vQHRHRRxXW/iAzP9PilBl/FwxsC8ddwObG9mbgc+NPiIj+iBgc3QZeC7QcSaS2fR1YHxHPi4ilwJupPotmdwFXNEYFXQw8M3r5WvNm2s8hItZERDS2L6L6++/prrd0cfO7UDO/B53X+P3+HvBQZt4wyWkz/i54SXThuB74fxHxNuA7wM8ARMRZwC2ZuRF4NgCr5ogAAAKGSURBVPDHje9qL/CHmfnFmtq7IGTmSERcA3wJWALcmpkPRMRVjeM3AduAjcAOYD/w1rrau1C1+Tm8CfiFiBgBDgBvTmcOn1cR8UnglcDpEbETeB/QB34XuqWNz8DvQee9HPg54FsRcV9j368A58DsvwuudCBJklQ4L4lKkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkktRMSFjZngZ/v8v5nP9kha3AxsktTahVTzJE0QEdPOYZmZPzbvLZK0aDkPm6QFq7Hw8heBe4CLgfuB3wd+DTgD+FngAeCjwL+lmlD6/cAXqCa0XAF8F/gg8CPAWcBzgX+hmgjz/wD9jbe7JjP/JiI+APxUY9+zgHsz8z927IeUtCgY2CQtWI3AtgN4MVUw+zpVaHsbVah6K/Ag8GBm/t+IOBX4WuP8nwE2ZOY1jdd6P/AfgEsy80BErASOZebBiFgPfDIzNzS9dz9wN/COzLy7Cz+upAXMpakkLXT/mJnfAoiIB4CvZGZGxLeoesvOBn4qIt7dOH85jSVkWrgrMw80tvuAj0XEhcBR4F+PO/f3gdsMa5Lmg4FN0kJ3qGn7WNPjY1R/Bx4F3piZjzQ/KSJe2uK19jVtvwN4EriAqh74YNNzfxXYn5kfnXPrJQkHHUjSl4BfjIgAiIgXN/YPA4NTPO8UYHdmHqNa6HlJ4/kbqS63XtWxFktadAxskha7X6e6vPnNiPh24zHAXwDnR8R9EbGpxfNuBDZHxFepLoeO9r79ErAG+GrjuR/ubPMlLQYOOpAkSSqcPWySJEmFM7BJkiQVzsAmSZJUOAObJElS4QxskiRJhTOwSZIkFc7AJkmSVDgDmyRJUu
H+P85m5/l5ILrVAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 691.2x388.8 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data(Xo, yo, xlabel=u'metraż', ylabel=u'cena')\n",
"theta_start = np.matrix([0.0, 0.0]).reshape(2, 1)\n",
"theta, logs = gradient_descent(cost, gradient, theta_start, Xo, yo, alpha=0.01)\n",
"plot_regression(fig, h_linear, theta, Xo)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"The plot above shows that, once the outliers are discarded, we obtain a much more “credible” regression line."
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
},
"livereveal": {
"start_slideshow_at": "selected",
"theme": "white"
}
},
"nbformat": 4,
"nbformat_minor": 4
}