{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Uczenie maszynowe\n",
"# 9. Przegląd metod uczenia nadzorowanego część 1"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 9.1. Naiwny klasyfikator bayesowski"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* Naiwny klasyfikator bayesowski jest algorytmem dla problemu klasyfikacji wieloklasowej.\n",
"* Naszym celem jest znalezienie funkcji uczącej $f \\colon x \\mapsto y$, gdzie $y$ oznacza jedną ze zdefiniowanych wcześniej klas.\n",
"* Klasyfikacja probabilistyczna polega na wskazaniu klasy o najwyższym prawdopodobieństwie:\n",
"$$ \\hat{y} = \\mathop{\\arg \\max}_y P( y \\,|\\, x ) $$\n",
"* Naiwny klasyfikator bayesowski należy do rodziny klasyfikatorów probabilistycznych"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img style=\"float: right;\" src=\"https://upload.wikimedia.org/wikipedia/commons/d/d4/Thomas_Bayes.gif\">\n",
"\n",
"**Thomas Bayes** (wymowa: /beɪz/) (17021761) angielski matematyk i duchowny"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Twierdzenie Bayesa wzór ogólny\n",
"\n",
"$$ P( Y \\,|\\, X ) = \\frac{ P( X \\,|\\, Y ) \\cdot P( Y ) }{ P ( X ) } $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Twierdzenie Bayesa opisuje związek między prawdopodobieństwami warunkowymi dwóch zdarzeń warunkujących się nawzajem."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Twierdzenie Bayesa\n",
"(po zastosowaniu wzoru na prawdopodobieństwo całkowite)\n",
"\n",
"$$ \\underbrace{P( y_k \\,|\\, x )}_\\textrm{ prawd. a posteriori } = \\frac{ \\overbrace{ P( x \\,|\\, y_k )}^\\textrm{ model klasy } \\cdot \\overbrace{P( y_k )}^\\textrm{ prawd. a priori } }{ \\underbrace{\\sum_{i} P( x \\,|\\, y_i ) \\, P( y_i )}_\\textrm{wyrażenie normalizacyjne} } $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
" * W tym przypadku „zdarzenie $x$” oznacza, że cechy wejściowe danej obserwacji przyjmują wartości opisane wektorem $x$.\n",
" * „Zdarzenie $y_k$” oznacza, że dana obserwacja należy do klasy $y_k$.\n",
" * **Model klasy** $y_k$ opisuje rozkład prawdopodobieństwa cech obserwacji należących do tej klasy.\n",
" * **Prawdopodobieństwo *a priori*** to prawdopodobienstwo, że losowa obserwacja należy do klasy $y_k$.\n",
" * **Prawdopodobieństwo *a posteriori*** to prawdopodobieństwo, którego szukamy: że obserwacja opisana wektorem cech $x$ należy do klasy $y_k$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Rola wyrażenia normalizacyjnego w twierdzeniu Bayesa\n",
"\n",
" * Wartość wyrażenia normalizacyjnego nie wpływa na wynik klasyfikacji.\n",
"\n",
"**Przykład**: obserwacja nietypowa ma małe prawdopodobieństwo względem dowolnej klasy, wyrażenie normalizacyjne sprawia, że to prawdopodobieństwo staje się porównywalne z prawdopodobieństwami typowych obserwacji, ale nie wpływa na klasyfikację!"
]
},
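{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"A short worked illustration of why the normalizing term can be ignored during classification (a sketch only; the numbers are made up and do not come from the iris data). The denominator is the same positive number for every class, so\n",
"$$ \\mathop{\\arg \\max}_k P( y_k \\,|\\, x ) = \\mathop{\\arg \\max}_k P( x \\,|\\, y_k ) \\, P( y_k ) $$\n",
"Suppose that $P( x \\,|\\, y_0 ) \\, P( y_0 ) = 0.002$ and $P( x \\,|\\, y_1 ) \\, P( y_1 ) = 0.006$. The normalizing term is then $0.002 + 0.006 = 0.008$, and the posterior probabilities are $0.002 / 0.008 = 0.25$ and $0.006 / 0.008 = 0.75$. Class $y_1$ wins both before and after normalization; only the scale of the scores changes."
]
},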
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Klasyfikatory dyskryminatywne a generatywne\n",
"\n",
"* Klasyfikatory generatywne tworzą model rozkładu prawdopodobieństwa dla każdej z klas.\n",
"* Klasyfikatory dyskryminatywne wyznaczają granicę klas (*decision boundary*) bezpośrednio.\n",
"* Naiwny klasyfikator bayesowski jest klasyfikatorem generatywnym (ponieważ wyznacza $P( x \\,|\\, y )$).\n",
"* Wszystkie klasyfikatory generatywne są probabilistyczne, ale nie na odwrót.\n",
"* Regresja logistyczna jest przykładem klasyfikatora dyskryminatywnego."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Założenie niezależności dla naiwnego klasyfikatora bayesowskiego\n",
"\n",
"* Naiwny klasyfikator bayesowski jest *naiwny*, ponieważ zakłada, że poszczególne cechy są niezależne od siebie:\n",
"$$ P( x_1, \\ldots, x_n \\,|\\, y ) \\,=\\, \\prod_{i=1}^n P( x_i \\,|\\, x_1, \\ldots, x_{i-1}, y ) \\,=\\, \\prod_{i=1}^n P( x_i \\,|\\, y ) $$\n",
"* To założenie jest bardzo przydatne ze względów obliczeniowych, ponieważ bardzo często mamy do czynienia z ogromną liczbą cech (bitmapy, słowniki itp.)"
]
},
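{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"A rough parameter count illustrates the computational benefit (a sketch under the simplifying assumption, not made in the lecture, that all $n$ features are binary). A full table for $P( x_1, \\ldots, x_n \\,|\\, y )$ needs $2^n - 1$ free parameters per class, whereas the naive factorization needs only $n$ of them, one value $P( x_i = 1 \\,|\\, y )$ per feature:\n",
"$$ n = 30: \\qquad 2^{30} - 1 = 1\\,073\\,741\\,823 \\quad \\textrm{vs.} \\quad 30 $$"
]
},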
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Naiwny klasyfikator bayesowski przykład"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Przydtne importy\n",
"\n",
"import ipywidgets as widgets\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Wczytanie danych (gatunki kosaćców)\n",
"\n",
"data_iris = pandas.read_csv('iris.csv')\n",
"data_iris_setosa = pandas.DataFrame()\n",
"data_iris_setosa['dł. płatka'] = data_iris['pl'] # \"pl\" oznacza \"petal length\"\n",
"data_iris_setosa['szer. płatka'] = data_iris['pw'] # \"pw\" oznacza \"petal width\"\n",
"data_iris_setosa['Iris setosa?'] = data_iris['Gatunek'].apply(lambda x: 1 if x=='Iris-setosa' else 0)\n",
"\n",
"m, n_plus_1 = data_iris_setosa.values.shape\n",
"n = n_plus_1 - 1\n",
"Xn = data_iris_setosa.values[:, 0:n].reshape(m, n)\n",
"\n",
"X = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n_plus_1)\n",
"Y = np.matrix(data_iris_setosa.values[:, 2]).reshape(m, 1)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"liczba przykładów: {0: 100, 1: 50}\n",
"prior probability: {0: 0.6666666666666666, 1: 0.3333333333333333}\n"
]
}
],
"source": [
"classes = [0, 1]\n",
"count = [sum(1 if y == c else 0 for y in Y.T.tolist()[0]) for c in classes]\n",
"prior_prob = [float(count[c]) / float(Y.shape[0]) for c in classes]\n",
"\n",
"print('liczba przykładów: ', {c: count[c] for c in classes})\n",
"print('prior probability:', {c: prior_prob[c] for c in classes})"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Wykres danych (wersja macierzowa)\n",
"def plot_data_for_classification(X, Y, xlabel, ylabel): \n",
" fig = plt.figure(figsize=(16*.6, 9*.6))\n",
" ax = fig.add_subplot(111)\n",
" fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
" X = X.tolist()\n",
" Y = Y.tolist()\n",
" X1n = [x[1] for x, y in zip(X, Y) if y[0] == 0]\n",
" X1p = [x[1] for x, y in zip(X, Y) if y[0] == 1]\n",
" X2n = [x[2] for x, y in zip(X, Y) if y[0] == 0]\n",
" X2p = [x[2] for x, y in zip(X, Y) if y[0] == 1]\n",
" ax.scatter(X1n, X2n, c='r', marker='x', s=50, label='Dane')\n",
" ax.scatter(X1p, X2p, c='g', marker='o', s=50, label='Dane')\n",
" \n",
" ax.set_xlabel(xlabel)\n",
" ax.set_ylabel(ylabel)\n",
" ax.margins(.05, .05)\n",
" return fig"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'dł. płatka', ylabel=u'szer. płatka')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"średnia: [matrix([[1. , 4.906, 1.676]]), matrix([[1. , 1.464, 0.244]])]\n",
"odchylenie standardowe: [matrix([[0. , 0.8214402 , 0.42263933]]), matrix([[0. , 0.17176728, 0.10613199]])]\n",
"(1, 3)\n"
]
}
],
"source": [
"XY = np.column_stack((X, Y))\n",
"XY_split = [XY[np.where(XY[:,3] == c)[0]] for c in classes]\n",
"X_split = [XY_split[c][:,0:3] for c in classes]\n",
"Y_split = [XY_split[c][:,3] for c in classes]\n",
"\n",
"X_mean = [np.mean(X_split[c], axis=0) for c in classes]\n",
"X_std = [np.std(X_split[c], axis=0) for c in classes]\n",
"print('średnia: ', X_mean) \n",
"print('odchylenie standardowe: ', X_std)\n",
"\n",
"print(X_std[0].shape)"
]
},
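{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"The estimates computed in the cell above can be written out as follows (a sketch in notation introduced here for illustration: $m_c$ is the number of training examples in class $c$, and $x_j^{(i)}$ is the $j$-th feature of the $i$-th example):\n",
"$$ \\mu_{c,j} = \\frac{1}{m_c} \\sum_{i \\colon y^{(i)} = c} x_j^{(i)}, \\qquad \\sigma_{c,j} = \\sqrt{ \\frac{1}{m_c} \\sum_{i \\colon y^{(i)} = c} \\left( x_j^{(i)} - \\mu_{c,j} \\right)^2 } $$\n",
"Note that `np.std` divides by $m_c$ by default (`ddof=0`), not by $m_c - 1$. Column 0 of $X$ is the constant 1, so its mean is 1 and its standard deviation is 0 in both classes, as the printed output shows."
]
},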
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Rysowanie średnich\n",
"def draw_means(fig, means, xmin=0.0, xmax=7.0, ymin=0.0, ymax=7.0):\n",
" class_color = {0: 'r', 1: 'g'}\n",
" classes = range(len(means))\n",
" ax = fig.axes[0]\n",
" mean_x1 = [means[c].item(0, 1) for c in classes]\n",
" mean_x2 = [means[c].item(0, 2) for c in classes]\n",
" for c in classes:\n",
" ax.plot([mean_x1[c], mean_x1[c]], [xmin, xmax],\n",
" color=class_color.get(c, 'c'), linestyle='dashed')\n",
" ax.plot([ymin, ymax], [mean_x2[c], mean_x2[c]],\n",
" color=class_color.get(c, 'c'), linestyle='dashed') "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"from scipy.stats import norm\n",
"\n",
"# Prawdopodobieństwo klasy dla pojedynczej cechy\n",
"# Uwaga: jeżeli odchylenie standardowe dla danej cechy jest równe 0, \n",
"# to nie można określić prawdopodbieństwa klasy!\n",
"def prob(x, c, feature, mean, std):\n",
" sd = std[c].item(0, feature)\n",
" if sd == 0:\n",
" print('Nie można określić prawdopodobieństwa klasy dla cechy {}.!'.format(feature))\n",
" return norm(mean[c].item(0, feature), sd).pdf(x)\n",
"\n",
"# Prawdopodobieństwo klasy\n",
"# Uwaga: tu bierzemy iloczyn dwóch cech (1. i 2.), w ogólności może być ich więcej\n",
"def class_prob(x, c, mean, std, features=[1, 2]):\n",
" result = 1\n",
" for feature in features:\n",
" result *= prob(x[feature], c, feature, mean, std)\n",
" return result"
]
},
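{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"In formulas, the two functions above compute the following (a sketch; $\\mu_{c,j}$ and $\\sigma_{c,j}$ are our notation for the entries of `mean[c]` and `std[c]`). `prob` evaluates a Gaussian density for one feature via `scipy.stats.norm`, and `class_prob` multiplies these densities over the selected features:\n",
"$$ p( x_j \\,|\\, y = c ) = \\frac{1}{\\sqrt{2 \\pi} \\, \\sigma_{c,j}} \\exp \\left( - \\frac{ ( x_j - \\mu_{c,j} )^2 }{ 2 \\sigma_{c,j}^2 } \\right), \\qquad p( x \\,|\\, y = c ) = \\prod_{j \\in \\{1, 2\\}} p( x_j \\,|\\, y = c ) $$\n",
"These values are densities rather than probabilities, so they may exceed 1; this does not matter for comparing classes, because the same normalizing term divides every class score."
]
},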
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 3)\n",
"[matrix([[0. , 0.8214402 , 0.42263933]]), matrix([[0. , 0.17176728, 0.10613199]])]\n",
"[matrix([[1. , 4.906, 1.676]]), matrix([[1. , 1.464, 0.244]])]\n",
"[[1.57003335e-06 1.61965173e-23 3.09005273e-08]]\n"
]
}
],
"source": [
"print(X_std[0].shape)\n",
"print(X_std)\n",
"print(X_mean)\n",
"\n",
"X_prob_0=class_prob(X, 0, X_mean, X_std)\n",
"print(X_prob_0)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Wykres prawdopodobieństw klas\n",
"def plot_prob(fig, X_mean, X_std, classes, xmin=0.0, xmax=7.0, ymin=0.0, ymax=7.0):\n",
" class_color = {0: 'r', 1: 'g'}\n",
" ax = fig.axes[0]\n",
" x1, x2 = np.meshgrid(np.arange(xmin, xmax, 0.02),\n",
" np.arange(xmin, xmax, 0.02))\n",
" for c in classes:\n",
" fun1 = lambda x: prob(x, c, 1, X_mean, X_std)\n",
" fun2 = lambda x: prob(x, c, 2, X_mean, X_std)\n",
" p = fun1(x1) * fun2(x2)\n",
" plt.contour(x1, x2, p, levels=np.arange(0.0, 1.0, 0.1),\n",
" colors=class_color.get(c, 'c'), lw=3)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"draw_means(fig, X_mean)\n",
"plot_prob(fig, X_mean, X_std, classes)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Prawdopodobieństwo a posteriori\n",
"def posterior_prob(x, c):\n",
" normalizer = sum(class_prob(x, c, X_mean, X_std)\n",
" * prior_prob[c]\n",
" for c in classes)\n",
" return (class_prob(x, c, X_mean, X_std) \n",
" * prior_prob[c]\n",
" / normalizer)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Aby teraz przewidzieć klasę $y$ dla dowolnego zestawu cech $x$, wystarczy sprawdzić, dla której klasy prawdopodobieństwo *a posteriori* jest większe:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"# Funkcja klasyfikująca (funkcja predykcji)\n",
"def predict_class(x):\n",
" p = [posterior_prob(x, c) for c in classes]\n",
" if p[1] > p[0]:\n",
" return 1\n",
" else:\n",
" return 0"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"0\n"
]
}
],
"source": [
"x = [1, 2.0, 0.5] # długość płatka: 2.0, szerokość płatka: 0.5\n",
"y = predict_class(x)\n",
"print(y) # 1 To prawdopodobnie jest Iris setosa\n",
"\n",
"x = [1, 2.5, 1.0] # długość płatka: 2.5, szerokość płatka: 1.0\n",
"y = predict_class(x)\n",
"print(y) # 0 To prawdopodobnie nie jest Iris setosa"
]
},
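{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"A hand check of the first prediction above, for petal length $2.0$ and petal width $0.5$, using the means, standard deviations and priors printed earlier (a sketch; all values are rounded, so treat them as approximate):\n",
"\n",
" * class 1 (Iris setosa): $p( 2.0 \\,|\\, 1 ) \\approx 0.0178$ and $p( 0.5 \\,|\\, 1 ) \\approx 0.205$, so $p( x \\,|\\, 1 ) \\, P( 1 ) \\approx 0.0178 \\cdot 0.205 \\cdot \\frac{1}{3} \\approx 1.2 \\cdot 10^{-3}$\n",
" * class 0: $p( 2.0 \\,|\\, 0 ) \\approx 9.3 \\cdot 10^{-4}$ and $p( 0.5 \\,|\\, 0 ) \\approx 0.0197$, so $p( x \\,|\\, 0 ) \\, P( 0 ) \\approx 9.3 \\cdot 10^{-4} \\cdot 0.0197 \\cdot \\frac{2}{3} \\approx 1.2 \\cdot 10^{-5}$\n",
" * posterior: $P( 1 \\,|\\, x ) \\approx \\frac{ 1.2 \\cdot 10^{-3} }{ 1.2 \\cdot 10^{-3} + 1.2 \\cdot 10^{-5} } \\approx 0.99$, which agrees with the predicted class 1."
]
},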
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"Zobaczmy, jak to wygląda na wykresie. Narysujemy w tym celu granicę między klasą 1 a 0:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"# Wykres granicy klas dla naiwnego Bayesa\n",
"def plot_decision_boundary_bayes(fig, X_mean, X_std, xmin=0.0, xmax=7.0, ymin=0.0, ymax=7.0):\n",
" ax = fig.axes[0]\n",
" x1, x2 = np.meshgrid(np.arange(xmin, xmax, 0.02),\n",
" np.arange(ymin, ymax, 0.02))\n",
" p = [posterior_prob([1, x1, x2], c) for c in classes]\n",
" p_diff = p[1] - p[0]\n",
" plt.contour(x1, x2, p_diff, levels=[0.0], colors='c', lw=3);"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"plot_decision_boundary_bayes(fig, X_mean, X_std)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Dla porównania: regresja logistyczna na tych samych danych"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def powerme(x1,x2,n):\n",
" X = []\n",
" for m in range(n+1):\n",
" for i in range(m+1):\n",
" X.append(np.multiply(np.power(x1,i),np.power(x2,(m-i))))\n",
" return np.hstack(X)\n",
"\n",
"# Funkcja logistyczna\n",
"def safeSigmoid(x, eps=0):\n",
" y = 1.0/(1.0 + np.exp(-x))\n",
" if eps > 0:\n",
" y[y < eps] = eps\n",
" y[y > 1 - eps] = 1 - eps\n",
" return y\n",
"\n",
"# Funkcja hipotezy dla regresji logistycznej\n",
"def h(theta, X, eps=0.0):\n",
" return safeSigmoid(X*theta, eps)\n",
"\n",
"# Funkcja kosztu dla regresji logistycznej\n",
"def J(h,theta,X,y, lamb=0):\n",
" m = len(y)\n",
" f = h(theta, X, eps=10**-7)\n",
" j = -np.sum(np.multiply(y, np.log(f)) + \n",
" np.multiply(1 - y, np.log(1 - f)), axis=0)/m\n",
" if lamb > 0:\n",
" j += lamb/(2*m) * np.sum(np.power(theta[1:],2))\n",
" return j\n",
"\n",
"# Gradient funkcji kosztu\n",
"def dJ(h,theta,X,y,lamb=0):\n",
" g = 1.0/y.shape[0]*(X.T*(h(theta,X)-y))\n",
" if lamb > 0:\n",
" g[1:] += lamb/float(y.shape[0]) * theta[1:] \n",
" return g\n",
"\n",
"# Funkcja klasyfikująca\n",
"def classifyBi(theta, X):\n",
" prob = h(theta, X)\n",
" return prob"
]
},
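{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"For reference, the functions above correspond to the following formulas (a sketch in standard notation; $m$ is the number of examples, $\\lambda$ is the `lamb` parameter, and $\\tilde{\\theta}$ denotes $\\theta$ with its first component replaced by 0, which is our shorthand for the fact that the bias term is not regularized):\n",
"$$ h_\\theta( x ) = \\frac{1}{1 + e^{-\\theta^T x}} $$\n",
"$$ J( \\theta ) = - \\frac{1}{m} \\sum_{i=1}^{m} \\left[ y^{(i)} \\log h_\\theta( x^{(i)} ) + ( 1 - y^{(i)} ) \\log \\left( 1 - h_\\theta( x^{(i)} ) \\right) \\right] + \\frac{\\lambda}{2m} \\sum_{j \\geq 1} \\theta_j^2 $$\n",
"$$ \\nabla J( \\theta ) = \\frac{1}{m} X^T \\left( h_\\theta( X ) - y \\right) + \\frac{\\lambda}{m} \\tilde{\\theta} $$\n",
"Inside `J`, the hypothesis is clipped to the interval $[10^{-7}, 1 - 10^{-7}]$ so that the logarithms never receive 0."
]
},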
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Przygotowanie danych dla wielomianowej regresji logistycznej\n",
"\n",
"data = np.matrix(data_iris_setosa)\n",
"\n",
"Xpl = powerme(data[:, 1], data[:, 0], n)\n",
"Ypl = np.matrix(data[:, 2]).reshape(m, 1)"
]
},
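{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"To see what `powerme` produces here (a sketch; $x_1$ stands for the first argument, petal width, and $x_2$ for the second, petal length): with $n = 2$ the loops generate all monomials $x_1^i x_2^{m-i}$ for $m = 0, 1, 2$ and $i = 0, \\ldots, m$, so each row of `Xpl` has six columns:\n",
"$$ \\left[ \\; 1, \\;\\; x_2, \\;\\; x_1, \\;\\; x_2^2, \\;\\; x_1 x_2, \\;\\; x_1^2 \\; \\right] $$\n",
"This is why the fitted parameter vector `theta` has six components."
]
},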
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Metoda gradientu prostego dla regresji logistycznej\n",
"def GD(h, fJ, fdJ, theta, X, y, alpha=0.01, eps=10**-3, maxSteps=10000):\n",
" errorCurr = fJ(h, theta, X, y)\n",
" errors = [[errorCurr, theta]]\n",
" while True:\n",
" # oblicz nowe theta\n",
" theta = theta - alpha * fdJ(h, theta, X, y)\n",
" # raportuj poziom błędu\n",
" errorCurr, errorPrev = fJ(h, theta, X, y), errorCurr\n",
" # kryteria stopu\n",
" if abs(errorPrev - errorCurr) <= eps:\n",
" break\n",
" if len(errors) > maxSteps:\n",
" break\n",
" errors.append([errorCurr, theta]) \n",
" return theta, errors"
]
},
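{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"source": [
"The loop above implements the usual update rule (a sketch in formula form; $\\alpha$ is `alpha` and $\\varepsilon$ is `eps`):\n",
"$$ \\theta := \\theta - \\alpha \\, \\nabla J( \\theta ) $$\n",
"It stops when $| J_{\\textrm{prev}} - J_{\\textrm{curr}} | \\leq \\varepsilon$ or when more than `maxSteps` steps have been recorded. For example, starting from $\\theta = 0$ the hypothesis equals $0.5$ for every example, so the first gradient is $\\frac{1}{m} X^T ( 0.5 \\cdot \\mathbf{1} - y )$."
]
},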
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"theta = [[ 4.01960795]\n",
" [ 3.89499137]\n",
" [ 0.18747599]\n",
" [-1.3524039 ]\n",
" [-2.00123783]\n",
" [-0.87625505]]\n"
]
}
],
"source": [
"# Uruchomienie metody gradientu prostego dla regresji logistycznej\n",
"theta_start = np.matrix(np.zeros(Xpl.shape[1])).reshape(Xpl.shape[1], 1)\n",
"theta, errors = GD(h, J, dJ, theta_start, Xpl, Ypl, \n",
" alpha=0.1, eps=10**-7, maxSteps=100000)\n",
"print(r'theta = {}'.format(theta))"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Plot the decision boundary between the classes\n",
"def plot_decision_boundary(fig, theta, Xpl, xmin=0.0, xmax=7.0):\n",
"    ax = fig.axes[0]\n",
"    xx, yy = np.meshgrid(np.arange(xmin, xmax, 0.02),\n",
"                         np.arange(xmin, xmax, 0.02))\n",
"    l = len(xx.ravel())\n",
"    C = powerme(yy.reshape(l, 1), xx.reshape(l, 1), n)\n",
"    z = classifyBi(theta, C).reshape(int(np.sqrt(l)), int(np.sqrt(l)))\n",
"\n",
"    # draw the 0.5 contour of the model output, i.e. the decision boundary\n",
"    plt.contour(xx, yy, z, levels=[0.5], colors='m', linewidths=3);"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/2795780436.py:10: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(xx, yy, z, levels=[0.5], colors='m', lw=3);\n"
]
},
{
"data": {
"image/png": "(base64-encoded PNG plot data omitted)",
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xpl, Ypl, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"plot_decision_boundary(fig, theta, Xpl)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/2795780436.py:10: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(xx, yy, z, levels=[0.5], colors='m', lw=3);\n",
"/tmp/ipykernel_22218/4022158666.py:8: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p_diff, levels=[0.0], colors='c', lw=3);\n"
]
},
{
"data": {
"image/png": "(base64-encoded PNG plot data omitted)",
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xpl, Ypl, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"plot_decision_boundary(fig, theta, Xpl)\n",
"plot_decision_boundary_bayes(fig, X_mean, X_std)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Another example"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Load the data (iris species)\n",
"\n",
"data_iris = pandas.read_csv('iris.csv')\n",
"data_iris_versicolor = pandas.DataFrame()\n",
"data_iris_versicolor['dł. płatka'] = data_iris['pl'] # \"pl\" stands for \"petal length\"\n",
"data_iris_versicolor['szer. płatka'] = data_iris['pw'] # \"pw\" stands for \"petal width\"\n",
"data_iris_versicolor['Iris versicolor?'] = data_iris['Gatunek'].apply(lambda x: 1 if x=='Iris-versicolor' else 0)\n",
"\n",
"m, n_plus_1 = data_iris_versicolor.values.shape\n",
"n = n_plus_1 - 1\n",
"Xn = data_iris_versicolor.values[:, 0:n].reshape(m, n)\n",
"\n",
"X = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n_plus_1)\n",
"Y = np.matrix(data_iris_versicolor.values[:, 2]).reshape(m, 1)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"liczba przykładów: {0: 100, 1: 50}\n",
"prior probability: {0: 0.6666666666666666, 1: 0.3333333333333333}\n"
]
}
],
"source": [
"classes = [0, 1]\n",
"count = [sum(1 if y == c else 0 for y in Y.T.tolist()[0]) for c in classes]\n",
"prior_prob = [float(count[c]) / float(Y.shape[0]) for c in classes]\n",
"\n",
"print('liczba przykładów: ', {c: count[c] for c in classes})\n",
"print('prior probability:', {c: prior_prob[c] for c in classes})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"image/png": "(base64-encoded PNG plot data omitted)",
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'dł. płatka', ylabel=u'szer. płatka')"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"średnia: [matrix([[1. , 4.906, 1.676]]), matrix([[1. , 1.464, 0.244]])]\n",
"odchylenie standardowe: [matrix([[0. , 0.8214402 , 0.42263933]]), matrix([[0. , 0.17176728, 0.10613199]])]\n"
]
}
],
"source": [
"XY = np.column_stack((X, Y))\n",
"XY_split = [XY[np.where(XY[:,3] == c)[0]] for c in classes]\n",
"X_split = [XY_split[c][:,0:3] for c in classes]\n",
"Y_split = [XY_split[c][:,3] for c in classes]\n",
"\n",
"X_mean = [np.mean(X_split[c], axis=0) for c in classes]\n",
"X_std = [np.std(X_split[c], axis=0) for c in classes]\n",
"print('średnia: ', X_mean) \n",
"print('odchylenie standardowe: ', X_std)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/1042079336.py:11: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p, levels=np.arange(0.0, 1.0, 0.1),\n"
]
},
{
"data": {
"image/png": "(base64-encoded PNG plot data omitted)",
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"draw_means(fig, X_mean)\n",
"plot_prob(fig, X_mean, X_std, classes)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/4022158666.py:8: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p_diff, levels=[0.0], colors='c', lw=3);\n"
]
},
{
"data": {
"image/png": "(base64-encoded PNG plot data omitted)",
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"plot_decision_boundary_bayes(fig, X_mean, X_std)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Prepare the data for polynomial logistic regression\n",
"\n",
"data = np.matrix(data_iris_versicolor)\n",
"\n",
"Xpl = powerme(data[:, 1], data[:, 0], n)\n",
"Ypl = np.matrix(data[:, 2]).reshape(m, 1)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"theta = [[-10.68893855]\n",
" [ 5.52671372]\n",
" [ 5.83188857]\n",
" [ -0.60754604]\n",
" [ -0.46068339]\n",
" [ -2.83369636]]\n"
]
}
],
"source": [
"# Run gradient descent for logistic regression\n",
"theta_start = np.matrix(np.zeros(Xpl.shape[1])).reshape(Xpl.shape[1], 1)\n",
"theta, errors = GD(h, J, dJ, theta_start, Xpl, Ypl,\n",
"                   alpha=0.05, eps=10**-7, maxSteps=100000)\n",
"print(r'theta = {}'.format(theta))"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/2795780436.py:10: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(xx, yy, z, levels=[0.5], colors='m', lw=3);\n"
]
},
{
"data": {
2023-01-13 14:18:12 +01:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAzoAAAHvCAYAAACc3aoBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABjPUlEQVR4nO3deXhU5d3/8c8kM9kXkpBAAmETEJVFBIyIVFTUqlUU2/pDq9baVhH3pa22dasVH61L+1RBq49ba6HVIqgVXEFAg2wRUJGdBMKWfWWyzPn9ccgyWeeETGbm8H5d11wh59w5+eaMwvnkPvf3OAzDMAQAAAAANhIW6AIAAAAAoLsRdAAAAADYDkEHAAAAgO0QdAAAAADYDkEHAAAAgO0QdAAAAADYDkEHAAAAgO0QdAAAAADYjjPQBRwNj8ej/Px8xcfHy+FwBLocAAAAAH5kGIbKy8uVkZGhsLCO52xCOujk5+crMzMz0GUAAAAA6EF5eXnq379/h2NCOujEx8dLMn/QhISEAFcDAAAAwJ/KysqUmZnZmAM6EtJBp+F2tYSEBIIOAAAAcIzwZdkKzQgAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2E7Ag87evXv1k5/8RCkpKYqOjtaoUaO0Zs2aQJcFAAAAIIQ5A/nNi4uLNWnSJJ111ll6//33lZqaqq1btyopKSmQZQEAAAAIcQENOv/zP/+jzMxMvfzyy43bBg8eHMCKAAAAANhBQG9dW7RokcaPH68f/ehHSktL09ixY/W3v/2t3fFut1tlZWVeLwAAAABoKaBBZ8eOHZozZ46GDRumJUuWaObMmbr11lv16quvtjl+9uzZSkxMbHxlZmb2cMUAAAAAQoHDMAwjUN88IiJC48eP1+eff9647dZbb9Xq1av1xRdftBrvdrvldrsbPy8rK1NmZqZKS0uVkJDQIzUDAAAACIyysjIlJib6dP0f0Bmd9PR0nXjiiV7bTjjhBOXm5rY5PjIyUgkJCV4vAAAAAGgpoEFn0qRJ+u6777y2bdmyRQMHDgxQRQAAAADsIKBB54477lB2drYeffRRbdu2TW+88YZeeOEFzZo1K5BlAQAAAAhxAQ06EyZM0IIFC/TPf/5TI0eO1B/+8Ac988wzuuqqqwJZFgAAAIAQF9BmBEfLymIkAAAAAKEtZJoRAAAAAIA/EHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEHQAAAAA2A5BBwAAAIDtEH
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xpl, Ypl, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"plot_decision_boundary(fig, theta, Xpl)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/2795780436.py:10: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(xx, yy, z, levels=[0.5], colors='m', lw=3);\n",
"/tmp/ipykernel_22218/4022158666.py:8: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p_diff, levels=[0.0], colors='c', lw=3);\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xpl, Ypl, xlabel=u'dł. płatka', ylabel=u'szer. płatka')\n",
"plot_decision_boundary(fig, theta, Xpl)\n",
"plot_decision_boundary_bayes(fig, X_mean, X_std)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### When does naive Bayes fail?"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
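"# Load the data: column 0 is the class label, the remaining columns are features\n",
"import pandas\n",
"import numpy as np\n",
"\n",
"alldata = pandas.read_csv('bayes_nasty.tsv', sep='\\t')\n",
"data = np.matrix(alldata)\n",
"\n",
"# m examples; n features (the remaining column is the label)\n",
"m, n_plus_1 = data.shape\n",
"n = n_plus_1 - 1\n",
"Xn = data[:, 1:]\n",
"\n",
"# Feature matrix with a leading column of ones (bias term)\n",
"Xbn = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n_plus_1)\n",
"# Expanded features for logistic regression (powerme is defined outside this section)\n",
"Xbnp = powerme(data[:, 1], data[:, 2], n)\n",
"Ybn = np.matrix(data[:, 0]).reshape(m, 1)"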
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xbn, Ybn, xlabel=r'$x_1$', ylabel=r'$x_2$')"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"number of examples: {0: 69, 1: 30}\n",
"prior probability: {0: 0.696969696969697, 1: 0.30303030303030304}\n"
]
}
],
"source": [
"# Class counts and prior probabilities P(y)\n",
"classes = [0, 1]\n",
"count = [sum(1 if y == c else 0 for y in Ybn.T.tolist()[0]) for c in classes]\n",
"prior_prob = [float(count[c]) / float(Ybn.shape[0]) for c in classes]\n",
"\n",
"print('number of examples: ', {c: count[c] for c in classes})\n",
"print('prior probability:', {c: prior_prob[c] for c in classes})"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mean: [matrix([[1. , 0.03949835, 0.02825019]]), matrix([[1. , 0.09929617, 0.06206227]])]\n",
"standard deviation: [matrix([[0. , 0.52318432, 0.60106092]]), matrix([[0. , 0.61370281, 0.6081128 ]])]\n"
]
}
],
"source": [
"# Split the examples by class (column 3 of XY holds the label)\n",
"XY = np.column_stack((Xbn, Ybn))\n",
"XY_split = [XY[np.where(XY[:,3] == c)[0]] for c in classes]\n",
"X_split = [XY_split[c][:,0:3] for c in classes]\n",
"Y_split = [XY_split[c][:,3] for c in classes]\n",
"\n",
"# Per-class mean and standard deviation of each column\n",
"X_mean = [np.mean(X_split[c], axis=0) for c in classes]\n",
"X_std = [np.std(X_split[c], axis=0) for c in classes]\n",
"print('mean: ', X_mean)\n",
"print('standard deviation: ', X_std)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/1042079336.py:11: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p, levels=np.arange(0.0, 1.0, 0.1),\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xbn, Ybn, xlabel=r'$x_1$', ylabel=r'$x_2$')\n",
"draw_means(fig, X_mean, xmin=-1.0, xmax=1.0, ymin=-1.0, ymax=1.0)\n",
"plot_prob(fig, X_mean, X_std, classes, xmin=-1.0, xmax=1.0, ymin=-1.0, ymax=1.0)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/4022158666.py:8: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p_diff, levels=[0.0], colors='c', lw=3);\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xbn, Ybn, xlabel=r'$x_1$', ylabel=r'$x_2$')\n",
"plot_decision_boundary_bayes(fig, X_mean, X_std, xmin=-4.0, xmax=4.0, ymin=-4.0, ymax=4.0)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"theta = [[-0.31582268]\n",
" [ 0.43496774]\n",
" [-0.21840373]\n",
" [-7.88802319]\n",
" [22.73897346]\n",
" [-4.43682364]]\n"
]
}
],
"source": [
"# Run gradient descent for logistic regression\n",
"theta_start = np.matrix(np.zeros(Xbnp.shape[1])).reshape(Xbnp.shape[1], 1)\n",
"theta, errors = GD(h, J, dJ, theta_start, Xbnp, Ybn, \n",
" alpha=0.05, eps=10**-7, maxSteps=100000)\n",
"print(r'theta = {}'.format(theta))"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/2795780436.py:10: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(xx, yy, z, levels=[0.5], colors='m', lw=3);\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xbnp, Ybn, xlabel=r'$x_1$', ylabel=r'$x_2$')\n",
"plot_decision_boundary(fig, theta, Xbnp, xmin=-4.0, xmax=4.0)"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_22218/4022158666.py:8: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(x1, x2, p_diff, levels=[0.0], colors='c', lw=3);\n",
"/tmp/ipykernel_22218/2795780436.py:10: UserWarning: The following kwargs were not used by contour: 'lw'\n",
" plt.contour(xx, yy, z, levels=[0.5], colors='m', lw=3);\n"
]
},
{
"data": {
2023-01-13 14:18:12 +01:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA0YAAAHvCAYAAABwqM8XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAACdT0lEQVR4nOzdeXxcZb0/8M+ZNZNZsm9tkzYtXSgtpTtdgCIFWdQCCkjZRRSpVwX0Cl4F5ae3ekW4G6D3XtkKhSJQsCIIiJTuK6WldF9I2jRb02SWTGY75/fHmTOZZJZMkpk5s3zevPKiyWxPJu3J+Zzv83wfQZIkCURERERERHlMo/YAiIiIiIiI1MZgREREREREeY/BiIiIiIiI8h6DERERERER5T0GIyIiIiIiynsMRkRERERElPcYjIiIiIiIKO/p1B5AOomiiKamJlitVgiCoPZwiIiIiIgohSRJgsPhwIgRI6DRxK8J5VUwampqQm1trdrDICIiIiKiNGpsbMSoUaPi3ievgpHVagUgvzE2m03l0RARpc6Wri5cu3cvnIEAZlmteO2cc1Cs16s9LEqQJEn48dGj+ENTEwBgeX097hngF3o6fHLlJ7BvsKPuJ3UY/ePRag+HiGhAdrsdtbW1oRwQT14FI2X6nM1mYzAiopy1vrMT1xw7BldBARYVF2PNlCmw6PLqcJ8TnjrvPJQUFeHXDQ14sLUVOosFP6yrU208rn0uBDYEYNaYMf4741FgK1BtLEREg5XIMho2XyAiyiEbu7pwxZ49cIkiFpeU4K2pUxmKspQgCPjX+no8NFquzPzo6FE81tio2nian24GAJR9qQwFoxiKiCj3MBgREeWIrXY7Lt+9G85AAJcUF+PPU6agUKtVe1g0DIIg4Bdh4ej+I0fw3ydOpH0ckiih9eVWAED17dVpf30ionRgMCIiygG7nU5cvns3HIEAFhUX489Tp8LEUJQzfj5mDP4lOI3unw4fxjOnTqX19bs2dMFzwgOtTYvSK0rT+tpEROnCYERElOUOd3fj0k8+wRm/H/NsNqxhpSjnCIKA/1dfj3uDDRi+eeAAXmtrS9vrtzzfAgCouLYC2gL+3SKi3MRgRESUxU55PLhs9260+nyYZjbjr1xTlLMEQcDvxo3DXTU1EAEs/ewzfHDmTMpf19/lR8tKORhV38FpdESUuxiMiIiyVJffj8t378axnh6MKyjA36ZNY0vuHCcIAp6aMAFfLS+HV5Kw5NNP8bHDkdLXbH25FWK3iMLJhSi6oCilr0VEpCYGIyKiLOQVRXz100+x2+VClV6Pd6dNQ5XBoPawKA20goAXJ0/GxcXFcAYCuHLPHhx3u1P2ei0v9laLEml3S0SUrRiMiIiyjCRJ+OaBA/h7ZycsWi3+eu65GGsyqT0sSiOjRoPVU6ZgqtmMZq8XV+7ZgzM+X9Jfp6exB13rugAAlTdUJv35iYgyCYMREVGW+eXnn2NFSwu0AF495xzMSGA3b8o9RTod/jp1KkYZjdjX3Y3r9u6FTxST+hpKtajowiIU1HLvIiLKbQxGRERZZFVrKx46fhwA8MSECfhiKVsn57NRBQX4y9SpsGi1+HtnJ5YdOgRJkpLy3JIo4dT/yW3Bq29j0wUiyn0MRkREWWK73Y7b9+8HANw3ahS+PWKEyiOiTDDNYsHLkydDA+B/T53Cf508mZTn7fxHJ3qO9EBbpOU0OiLKCwxGRERZ4JTHg6s//RQ9ooirSkvxb+PGqT0kyiBXlZWF/k7ce/gw3uvoGPZztrwgT6Or/HoltGbuXUREuY/BiIgow3lFEV/buxcnvV6cXViIFydPhpbdwaif+0aNwm1VVRAB3PDZZzg6jE51gZ4A2l6XN5CtWlqVpBESEWU2BiMiogx37+HD2Gi3o0irxZtTpqCIG7hSFIIg4PcTJmCO1Yozfj+u/fRTdAcCQ3qu0385jYA9AOMoI4oWcu8iIsoPDEZERBnsueZmPN
nUBAHAi5MnY3xhodpDogxWoNXitXPOQaVej09cLnz74MEhNWNQmi5U3VIFQcPqJBHlBwYjIqIMtdvpxHcOHgQAPDxmDK4qK1N5RJQNRhUU4JVzzoEWwAstLfifU6cG9Xj3MTfO/O0MIAA136xJzSCJiDIQgxERUQay+/342t69cIsiLi8txc9Gj1Z7SJRFLiouxvKxYwEA3zt0CDsdjoQfqzRdKP5CMUxjuXEwEeUPBiMiogwjSRK+ffAgDrndGGU0YsWkSdCw2QIN0g9ra/GVsjJ4JQnX7d2LLr9/wMdIkoTWl1sBAFU3s+kCEeUXBiMiogzzdHMzXm5thU4Q8MrkySg3GNQeEmUhQRDw7KRJGG004mhPD7594MCA642cnzjR/Vk3BKOAimsq0jRSIqLMwGBERJRB9rlc+KdDhwAAv6yvx7widgSjoSvR6/Hy5MnQAljV1oY/DrDeSGm6UP6VcuiK2P2QiPILgxERUYbwiCJu/OwzuEURl5WU4Ee1tWoPiXLA+UVF+FVwvdH3Dx/Gwe7uqPcLdAdC64tq7mLTBSLKPwxGREQZ4l+OHsUnLhfK9Xo8x3VFlEQ/qq3FF4qL0S2KWPrZZ/CKYsR92t9sR6ArgIL6ApRcUqLCKImI1MVgRESUAT44cwa/O3ECAPD0xImoNhpVHhHlEo0g4Pmzz0apTocdTid+fvx4xH1aVwabLtzEvYuIKD8xGBERqazT58Nt+/cDAL5dU4Mvl5erPCLKRSONRvxhwgQAwG8aGrCpqyt0m7fdi453OgAAlTdWqjI+IiK1MRgREansB4cP44THg3EFBfjdWWepPRzKYV+rrMTNVVUQAdy2fz9cgQAAoPXFVkh+CZbpFpgnm9UdJBGRShiMiIhU9Of2djzX0gINgOfPPhtmrVbtIVGO+6+zzsIooxGH3G48ePQoJElC0/82AQBqvsmmC0SUvxiMiIhU0uHz4dsHDwKQN+Ocz9bclAbFej2enjgRAPBfJ09i7QdN6N7bDY1Jg6qbuKkrEeUvBiMiIpXce/gwmr1eTCosxC/GjFF7OJRHLi0txV01cnXovf85CgAov5p7FxFRfmMwIiJSwdunT+P5lhYIAJ6ZOBEFnEJHafbouHGo0xow4315nVHlUjZdIKL8xmBERJRmTr8fdwen0P1g1Ciczyl0pAKbTof/aRqBsg6gywYcPZ/VIiLKbwxGRERp9tNjx9Dg8WBMQQH+X3292sOhPDbyZScA4L1LgW8dOwRflI1fiYjyBYMREVEabbfb8Z8nTwIAfj9hArvQkWq8rV60r2kHAKxbosUnLhceD24yTESUjxiMiIjSxC+K+PbBg5AA3FRZiS+Wlqo9JMpjra+0AgHAOtuKexfL+2f9/PhxHHe7VR4ZEZE6GIyIiNLkyaYm7HQ6UazTcSNXUl3ri60A5KYLt1VX46KiIrhFEd87fFjlkRERqYPBiIgoDU55PPjpsWMAgOX19agyGFQeEeWz7oPdsG+2Axqg8oZKCIKAJydMgE4QsOb0afy5vV3tIRIRpR2DERFRGvzz0aNwBAKYY7XiWyNGqD0cynOn/u8UAKD08lIYa4wAgMlmM35YWwsA+P7hw3AHAqqNj4hIDQxGREQptr6zEy8E9yx6Yvx4aARB7SFRHhN9IpqfawYAjPhW35D+09GjMcpoxPGeHvymoUGN4RERqYbBiIgohQKShO8eOgQA+GZNDWbZbCqPiPLdmffPwNfqg75Cj9Ir+zYAMWu1eGzcOADAbxob2YiBiPIKgxERUQr9b1MTPnG5UKzT4V+5ZxFlgJYXWwAAFddXQKOPPA34WkUFvlBcjB5RxI+OHk338IiIVMNgRESUIp0+H352/DgA4JExY1DOhgukMr/Dj/Y35MYKVTdVRb2PIAj497POggbAq21tWNvZmb4BEhGpiMGIiChF/t/nn6Pd58PZhYW4mw0XKAO0rmqF6BJhmmCC7fzY0zqnWiz4dvDv7PcPHUJAktI1RCIi1T
AYERGlwOHubvzXyZMAgMfGjYNew8MtqU/pRldzZw2EAZqAPDJmDIp1OnzicuG55uZ0DI+ISFX8TU1ElAIPHD0KnyT
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(Xbn, Ybn, xlabel=r'$x_1$', ylabel=r'$x_2$')\n",
"plot_decision_boundary_bayes(fig, X_mean, X_std, xmin=-4.0, xmax=4.0, ymin=-4.0, ymax=4.0)\n",
"plot_decision_boundary(fig, theta, Xbnp, xmin=-4.0, xmax=4.0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The (Gaussian) naive Bayes classifier fails when the classes differ in neither the mean nor the standard deviation of their features."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 9.2. The $k$-nearest neighbors algorithm"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### KNN: intuition"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Which category should the point marked with a star belong to?"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Useful imports\n",
"\n",
"import ipywidgets as widgets\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Load the data (iris species)\n",
"\n",
"data_iris = pandas.read_csv('iris.csv')\n",
"data_iris_setosa = pandas.DataFrame()\n",
"data_iris_setosa['petal length'] = data_iris['pl']  # \"pl\" stands for \"petal length\"\n",
"data_iris_setosa['petal width'] = data_iris['pw']  # \"pw\" stands for \"petal width\"\n",
"# 'Gatunek' (species) is the column name used in the CSV file\n",
"data_iris_setosa['Iris setosa?'] = data_iris['Gatunek'].apply(lambda x: 1 if x == 'Iris-setosa' else 0)\n",
"\n",
"m, n_plus_1 = data_iris_setosa.values.shape\n",
"n = n_plus_1 - 1\n",
"Xn = data_iris_setosa.values[:, 0:n].reshape(m, n)\n",
"\n",
"X = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n_plus_1)\n",
"Y = np.matrix(data_iris_setosa.values[:, 2]).reshape(m, 1)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Plot the data (matrix version)\n",
"def plot_data_for_classification(X, Y, xlabel, ylabel):\n",
"    fig = plt.figure(figsize=(16*.6, 9*.6))\n",
"    ax = fig.add_subplot(111)\n",
"    fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n",
"    X = X.tolist()\n",
"    Y = Y.tolist()\n",
"    # Column 0 of X is the bias term; columns 1 and 2 are the two features\n",
"    X1n = [x[1] for x, y in zip(X, Y) if y[0] == 0]\n",
"    X1p = [x[1] for x, y in zip(X, Y) if y[0] == 1]\n",
"    X2n = [x[2] for x, y in zip(X, Y) if y[0] == 0]\n",
"    X2p = [x[2] for x, y in zip(X, Y) if y[0] == 1]\n",
"    ax.scatter(X1n, X2n, c='r', marker='x', s=50, label='Data')\n",
"    ax.scatter(X1p, X2p, c='g', marker='o', s=50, label='Data')\n",
"\n",
"    ax.set_xlabel(xlabel)\n",
"    ax.set_ylabel(ylabel)\n",
"    ax.margins(.05, .05)\n",
"    return fig"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"def plot_new_example(fig, x, y):\n",
"    ax = fig.axes[0]\n",
"    ax.scatter([x], [y], c='k', marker='*', s=100, label='?')"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'petal length', ylabel=u'petal width')\n",
"plot_new_example(fig, 2.8, 0.9)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* It seems reasonable to assume that the point marked with a star should be red, because its neighbors are red: the nearest red points lie closer to it than the nearest green ones."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* An algorithm based on this intuition is called the **$k$-nearest neighbors** algorithm (KNN)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* Idea (KNN for $k = 1$):\n",
"    1. For a new example $x'$, find the nearest example $x$ in the training set.\n",
"    1. Its class $y$ is the predicted class $y'$."
]
},
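{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* The two steps above can be sketched in a few lines. This is only an illustrative sketch: it assumes plain NumPy arrays and Euclidean distance, and the name `nearest_neighbor_1` and the toy data are hypothetical, not part of the lecture code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def nearest_neighbor_1(X_train, y_train, x_new):\n",
"    # Euclidean distance from x_new to every training example\n",
"    distances = np.linalg.norm(X_train - x_new, axis=1)\n",
"    # The label of the closest training example is the prediction\n",
"    return y_train[np.argmin(distances)]\n",
"\n",
"# Hypothetical toy data: two features, classes 0 and 1\n",
"X_toy = np.array([[1.0, 0.2], [1.5, 0.3], [4.5, 1.5], [5.0, 1.8]])\n",
"y_toy = np.array([1, 1, 0, 0])\n",
"nearest_neighbor_1(X_toy, y_toy, np.array([1.4, 0.3]))"
]
},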
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"from scipy.spatial import Voronoi, voronoi_plot_2d\n",
"\n",
"def plot_voronoi(fig, points):\n",
"    ax = fig.axes[0]\n",
"    vor = Voronoi(points)\n",
"    ax.scatter(vor.vertices[:, 0], vor.vertices[:, 1], s=1)\n",
"\n",
"    # Draw only finite ridges (an index of -1 marks a vertex at infinity)\n",
"    for simplex in vor.ridge_vertices:\n",
"        simplex = np.asarray(simplex)\n",
"        if np.all(simplex >= 0):\n",
"            ax.plot(vor.vertices[simplex, 0], vor.vertices[simplex, 1],\n",
"                    color='orange', linewidth=1)\n",
"\n",
"    # Restrict the view to the data range plus a small margin\n",
"    xmin, ymin = points.min(axis=0).tolist()[0]\n",
"    xmax, ymax = points.max(axis=0).tolist()[0]\n",
"    pad = 0.1\n",
"    ax.set_xlim(xmin - pad, xmax + pad)\n",
"    ax.set_ylim(ymin - pad, ymax + pad)"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X, Y, xlabel=u'petal length', ylabel=u'petal width')\n",
"plot_new_example(fig, 2.8, 0.9)\n",
"plot_voronoi(fig, X[:, 1:])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A partition of the plane like the one in the plot above is called a **Voronoi diagram**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* Such an algorithm produces quite impressive class boundaries, especially for a method this simple."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* Unfortunately, it is very sensitive to outliers:"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"X_outliers = np.vstack((X, np.matrix([[1.0, 3.9, 1.7]])))\n",
"Y_outliers = np.vstack((Y, np.matrix([[1]])))"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X_outliers, Y_outliers, xlabel=u'petal length', ylabel=u'petal width')\n",
"plot_new_example(fig, 2.8, 0.9)\n",
"plot_voronoi(fig, X_outliers[:, 1:])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* A single outlier dramatically changes the class boundaries."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* To remedy this, we use more than one nearest neighbor ($k > 1$)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### The $k$-nearest neighbors algorithm for classification\n",
"\n",
"1. Given: a training set of examples $(x_i, y_i)$, where $x_i$ is a feature vector and $y_i$ is a class.\n",
"1. Given: a test example $x'$ whose class we want to determine.\n",
"1. Compute the distance $d(x', x_i)$ for every example $x_i$ in the training set.\n",
"1. Select the $k$ examples $x_{i_1}, \\ldots, x_{i_k}$ with the smallest computed distances.\n",
"1. Return as $y'$ the class that occurs most often among $y_{i_1}, \\ldots, y_{i_k}$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### The $k$-nearest neighbors algorithm for classification: an example"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Euclidean distance\n",
"def euclidean_distance(x1, x2):\n",
"    return np.linalg.norm(x1 - x2)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
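"# The k-nearest neighbors algorithm\n",
"def knn(X, Y, x_new, k, distance=euclidean_distance):\n",
"    # Each row of data holds the features followed by the class label\n",
"    data = np.concatenate((X, Y), axis=1)\n",
"    # Sort the rows by distance to x_new and keep the k closest\n",
"    nearest = sorted(\n",
"        data, key=lambda xy: distance(xy[0, :-1], x_new))[:k]\n",
"    y_nearest = [xy[0, -1] for xy in nearest]\n",
"    # Majority vote among the k nearest labels\n",
"    return max(y_nearest, key=lambda y: y_nearest.count(y))"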
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Plot the KNN class boundary\n",
"def plot_knn(fig, X, Y, k, distance=euclidean_distance):\n",
"    ax = fig.axes[0]\n",
"    x1min, x2min = X.min(axis=0).tolist()[0]\n",
"    x1max, x2max = X.max(axis=0).tolist()[0]\n",
"    pad1 = (x1max - x1min) / 10\n",
"    pad2 = (x2max - x2min) / 10\n",
"    step1 = (x1max - x1min) / 50\n",
"    step2 = (x2max - x2min) / 50\n",
"    x1grid, x2grid = np.meshgrid(\n",
"        np.arange(x1min - pad1, x1max + pad1, step1),\n",
"        np.arange(x2min - pad2, x2max + pad2, step2))\n",
"    # Classify every grid point; the 0.5 contour separates classes 0 and 1\n",
"    z = np.matrix([[knn(X, Y, [x1, x2], k, distance)\n",
"                    for x1, x2 in zip(x1row, x2row)]\n",
"                   for x1row, x2row in zip(x1grid, x2grid)])\n",
"    ax.contour(x1grid, x2grid, z, levels=[0.5])"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Prepare the interactive plot\n",
"\n",
"slider_k = widgets.IntSlider(min=1, max=10, step=1, value=1, description=r'$k$', width=300)\n",
"\n",
"def interactive_knn_1(k):\n",
"    fig = plot_data_for_classification(X_outliers, Y_outliers, xlabel=u'petal length', ylabel=u'petal width')\n",
"    plot_voronoi(fig, X_outliers[:, 1:])\n",
"    plot_knn(fig, X_outliers[:, 1:], Y_outliers, k)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "34daa3fb4fee4514afac780d8cbcd791",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"interactive(children=(IntSlider(value=1, description='$k$', max=10, min=1), Button(description='Run Interact',…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<function __main__.interactive_knn_1(k)>"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"widgets.interact_manual(interactive_knn_1, k=slider_k)"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Load the data (another example)\n",
"\n",
"alldata = pandas.read_csv('classification.tsv', sep='\\t')\n",
"data = np.matrix(alldata)\n",
"\n",
"m, n_plus_1 = data.shape\n",
"n = n_plus_1 - 1\n",
"Xn = data[:, 1:].reshape(m, n)\n",
"\n",
"X2 = np.matrix(np.concatenate((np.ones((m, 1)), Xn), axis=1)).reshape(m, n_plus_1)\n",
"Y2 = np.matrix(data[:, 0]).reshape(m, 1)"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 960x540 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"fig = plot_data_for_classification(X2, Y2, xlabel=r'$x_1$', ylabel=r'$x_2$')"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"slideshow": {
"slide_type": "notes"
}
},
"outputs": [],
"source": [
"# Prepare the interactive plot\n",
"\n",
"slider_k = widgets.IntSlider(min=1, max=10, step=1, value=1, description=r'$k$', width=300)\n",
"\n",
"def interactive_knn_2(k):\n",
"    fig = plot_data_for_classification(X2, Y2, xlabel=r'$x_1$', ylabel=r'$x_2$')\n",
"    plot_voronoi(fig, X2[:, 1:])\n",
"    plot_knn(fig, X2[:, 1:], Y2, k)"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "2bc697f1f57244a9b1c3864e98b21538",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"interactive(children=(IntSlider(value=1, description='$k$', max=10, min=1), Button(description='Run Interact',…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<function __main__.interactive_knn_2(k)>"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"widgets.interact_manual(interactive_knn_2, k=slider_k)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### The $k$-nearest neighbors algorithm for regression\n",
"\n",
"1. Given: a training set of examples $(x_i, y_i)$, where $x_i$ is a feature vector and $y_i$ is a real number.\n",
"1. Given: a test example $x'$ for which we want to predict a value.\n",
"1. Compute the distance $d(x', x_i)$ for every example $x_i$ in the training set.\n",
"1. Select the $k$ examples $x_{i_1}, \\ldots, x_{i_k}$ with the smallest computed distances.\n",
"1. Return as $y'$ the mean of the numbers $y_{i_1}, \\ldots, y_{i_k}$:\n",
" $$ y' = \\frac{1}{k} \\sum_{j=1}^{k} y_{i_j} $$"
]
},
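{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The regression variant differs from classification only in the last step. Below is a minimal sketch, assuming plain NumPy arrays and Euclidean distance; the name `knn_regression` and the toy data are illustrative assumptions, not part of the lecture code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def knn_regression(X_train, y_train, x_new, k):\n",
"    # Distances from x_new to all training examples\n",
"    distances = np.linalg.norm(X_train - x_new, axis=1)\n",
"    # Indices of the k closest examples\n",
"    nearest = np.argsort(distances)[:k]\n",
"    # Prediction: arithmetic mean of the neighbors' target values\n",
"    return y_train[nearest].mean()\n",
"\n",
"# Hypothetical toy data: one feature, targets y = x^2\n",
"X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])\n",
"y_toy = np.array([0.0, 1.0, 4.0, 9.0])\n",
"knn_regression(X_toy, y_toy, np.array([1.2]), k=2)"
]
},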
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Choosing $k$\n",
"\n",
"* The value of $k$ strongly affects the result of the KNN algorithm:\n",
" * If $k$ is too large, new examples tend to be assigned to the majority class (in the extreme case where $k$ equals the size of the training set, every example is).\n",
" * If $k$ is too small, class boundaries are unstable and the algorithm is very sensitive to outliers.\n",
"* The best way to choose a good value of $k$ is to use a validation set."
]
},
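{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Selecting $k$ on a validation set could look roughly like the sketch below. It assumes Euclidean distance and plain majority voting; the helper names `knn_classify` and `choose_k`, the candidate list and the toy data are hypothetical and only serve the illustration.\n",
"\n",
"```python\n",
"import math\n",
"from collections import Counter\n",
"\n",
"def knn_classify(X_train, y_train, x_new, k):\n",
"    # Order training indices by Euclidean distance to x_new.\n",
"    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x_new))\n",
"    # Majority vote among the k nearest neighbours.\n",
"    votes = Counter(y_train[i] for i in order[:k])\n",
"    return votes.most_common(1)[0][0]\n",
"\n",
"def choose_k(X_train, y_train, X_val, y_val, candidates):\n",
"    best_k, best_acc = None, -1.0\n",
"    for k in candidates:\n",
"        hits = sum(knn_classify(X_train, y_train, x, k) == y\n",
"                   for x, y in zip(X_val, y_val))\n",
"        acc = hits / len(X_val)\n",
"        if acc > best_acc:  # on equal accuracy the earlier candidate is kept\n",
"            best_k, best_acc = k, acc\n",
"    return best_k, best_acc\n",
"\n",
"# Toy 1-D data: class 0 around 0-2, class 1 around 8-11, plus one outlier of class 1 at 1.5.\n",
"X_train = [[0.0], [1.0], [2.0], [1.5], [8.0], [9.0], [10.0], [11.0]]\n",
"y_train = [0, 0, 0, 1, 1, 1, 1, 1]\n",
"X_val = [[1.4], [1.6], [9.5], [0.5]]\n",
"y_val = [0, 0, 1, 0]\n",
"\n",
"best_k, best_acc = choose_k(X_train, y_train, X_val, y_val, candidates=[1, 3, 8])\n",
"print(best_k, best_acc)\n",
"```\n",
"\n",
"On this toy data $k=1$ is misled by the outlier at 1.5 and $k=8$ (the whole training set) always predicts the majority class, so the intermediate value $k=3$ is selected."
]
},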
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Similarity measures"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Euclidean distance\n",
"$$ d(x, x') = \\sqrt{ \\sum_{i=1}^n \\left( x_i - x'_i \\right) ^2 } $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A good choice for numerical features.\n",
"* Symmetric; treats all dimensions equally.\n",
"* Sensitive to large fluctuations in a single feature: a feature with a much larger scale dominates the distance."
]
},
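{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A short sketch of the Euclidean distance and of its sensitivity to feature scale. The function name `euclidean` and the two features (height in metres, income in dollars) are hypothetical and chosen only to make the effect visible.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def euclidean(x, y):\n",
"    # Square root of the sum of squared coordinate differences.\n",
"    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))\n",
"\n",
"# Hypothetical examples: (height in metres, income in dollars).\n",
"a = (1.60, 30000.0)\n",
"b = (1.95, 30010.0)   # very different height, almost the same income\n",
"c = (1.61, 30500.0)   # almost the same height, somewhat different income\n",
"\n",
"d_ab = euclidean(a, b)\n",
"d_ac = euclidean(a, c)\n",
"print(d_ab, d_ac)\n",
"```\n",
"\n",
"Here `d_ab` is much smaller than `d_ac`, because the income differences are numerically far larger than the height differences. This is why features are often rescaled to comparable ranges before KNN is applied."
]
},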
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Hamming distance\n",
"$$ d(x, x') = \\sum_{i=1}^n \\mathbf{1}_{x_i \\neq x'_i} $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* A good choice for binary (0/1) features.\n",
"* Equals the number of features on which the two examples differ."
]
},
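{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The Hamming distance can be sketched in a few lines of plain Python. The function name `hamming` and the example vectors are illustrative assumptions.\n",
"\n",
"```python\n",
"def hamming(x, y):\n",
"    # Both examples must describe the same set of features.\n",
"    if len(x) != len(y):\n",
"        raise ValueError('both examples must have the same number of features')\n",
"    # Count the positions at which the two examples differ.\n",
"    return sum(a != b for a, b in zip(x, y))\n",
"\n",
"d = hamming([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])\n",
"print(d)\n",
"```\n",
"\n",
"The two example vectors differ at the second and fourth positions, so the distance counts exactly those two features."
]
},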
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Minkowski distance ($p$-norm)\n",
"$$ d(x, x') = \\sqrt[p]{ \\sum_{i=1}^n \\left| x_i - x'_i \\right| ^p } $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
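"* For $p = 2$ it is the Euclidean distance.\n",
"* For $p = 1$ it is the taxicab (Manhattan) distance.\n",
"* As $p \\to \\infty$, the $p$-norm tends to $\\max_i \\left| x_i - x'_i \\right|$; for 0/1 features this behaves like a logical OR of the per-feature differences.\n",
"* As $p \\to 0$, the normalized variant (the power mean of the values $\\left| x_i - x'_i \\right|$) tends to their geometric mean, which for 0/1 features behaves like a logical AND; for $p < 1$ the formula above no longer satisfies the triangle inequality."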
]
},
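{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The special cases of the Minkowski distance can be checked numerically with the sketch below. The function names `minkowski` and `chebyshev` are illustrative; the maximum coordinate difference (Chebyshev distance) is not named in the slides and is used here only as the reference value for large $p$.\n",
"\n",
"```python\n",
"def minkowski(x, y, p):\n",
"    # p-th root of the sum of absolute coordinate differences raised to the power p.\n",
"    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)\n",
"\n",
"def chebyshev(x, y):\n",
"    # Largest absolute coordinate difference.\n",
"    return max(abs(a - b) for a, b in zip(x, y))\n",
"\n",
"x, y = (0.0, 0.0), (3.0, 4.0)\n",
"\n",
"d1 = minkowski(x, y, 1)        # taxicab distance\n",
"d2 = minkowski(x, y, 2)        # Euclidean distance\n",
"d_large = minkowski(x, y, 50)  # already very close to the maximum difference\n",
"d_max = chebyshev(x, y)\n",
"print(d1, d2, d_large, d_max)\n",
"```\n",
"\n",
"As $p$ grows, the largest coordinate difference dominates the sum, so `d_large` is nearly equal to `d_max`."
]
},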
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### KNN: practical tips"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* What to do about ties?\n",
" * Pick a class at random.\n",
" * Pick the class with the higher prior (_a priori_) probability.\n",
" * Pick the class indicated by the 1NN algorithm."
]
},
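{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The three tie-breaking options could be sketched as follows. This is only an illustration under several assumptions: Euclidean distance, priors estimated as class frequencies in the training set, and a hypothetical `tie_break` parameter whose values `'random'`, `'prior'` and `'1nn'` are invented for this example.\n",
"\n",
"```python\n",
"import math\n",
"import random\n",
"from collections import Counter\n",
"\n",
"def knn_classify(X_train, y_train, x_new, k, tie_break='prior', rng=None):\n",
"    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x_new))\n",
"    votes = Counter(y_train[i] for i in order[:k])\n",
"    top = max(votes.values())\n",
"    tied = sorted(c for c, n in votes.items() if n == top)\n",
"    if len(tied) == 1:\n",
"        return tied[0]  # no tie, plain majority vote\n",
"    if tie_break == 'random':\n",
"        return (rng or random).choice(tied)\n",
"    if tie_break == 'prior':\n",
"        # Assumption: the prior is estimated as class frequency in the training set.\n",
"        priors = Counter(y_train)\n",
"        return max(tied, key=lambda c: priors[c])\n",
"    if tie_break == '1nn':\n",
"        # Class of the nearest neighbour among the tied classes\n",
"        # (with two classes this is exactly the 1NN prediction).\n",
"        return next(y_train[i] for i in order if y_train[i] in tied)\n",
"    raise ValueError('unknown tie_break: ' + tie_break)\n",
"\n",
"# Toy 1-D data (hypothetical): two examples of class 'a', three of class 'b'.\n",
"X_train = [[0.0], [1.0], [3.0], [4.0], [5.0]]\n",
"y_train = ['a', 'a', 'b', 'b', 'b']\n",
"\n",
"# With k=2 the neighbours of 1.9 are 1.0 ('a') and 3.0 ('b'): a tie.\n",
"pred_prior = knn_classify(X_train, y_train, [1.9], 2, tie_break='prior')\n",
"pred_1nn = knn_classify(X_train, y_train, [1.9], 2, tie_break='1nn')\n",
"print(pred_prior, pred_1nn)\n",
"```\n",
"\n",
"In this example the prior-based rule picks the more frequent class, while the 1NN rule picks the class of the single closest training point, so the two strategies can disagree."
]
},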
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"* KNN handles missing feature values poorly (distances cannot then be computed in a meaningful way)."
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"livereveal": {
"start_slideshow_at": "selected",
"theme": "white"
}
},
"nbformat": 4,
"nbformat_minor": 4
}