{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Regresja wielomianowa" ] }, { "source": [ "![polynomial_regression](regression.png)\n", "\n", "Celem regresji wielomianowej jest zamodelowanie relacji między zmienną zależną od zmiennych niezależnych jako funkcję wielomianu n-tego stopnia.\n", "\n", "Postać ogólna regresji wielomianowej:\n", "\n", "$$ h_{\\theta}(x) = \\sum_{i=0}^{n} \\theta_i x^i $$\n", "\n", "Gdzie:\n", "\n", "$$ \\theta - \\text{wektor parametrów modelu} $$ " ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Funkcja kosztu\n", "![MSE](mse.webp)\n", "\n", "W celu odpowiedniego dobrania parametrów modelu, trzeba znaleźć minimum funkcji kosztu zdefiniowanej poniższym wzorem:\n", "\n", "$$ J = \\frac{1}{2m} (X \\theta - y)^T (X \\theta - y) $$\n", "\n", "Gdzie:\n", "\n", "$$ m - \\text{ilość przykładów} $$ \n", "\n", "Za funkcję kosztu przyjmuje się zmodyfikowaną wersję błędu średniokwadratowego. Dodatkowo dodaje się dzielenie przez 2*m zamiast m, aby gradient z funkcji był lepszej postaci." ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Metoda gradientu prostego\n", "![gradient_descent](gradient.png)" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "Do znalezienia minimum funckji kosztu można użyć metody gradientu prostego. W tym celu iteracyjnie liczy się gradient funkcji kosztu i aktualizuje się za jego pomocą wektor parametrów modelu aż do uzyskania zbieżności (Różnica między obliczoną funkcją kosztu a funkcją kosztu w poprzedniej iteracji będzie mniejsza od ustalonej wcześniej wartości $\\varepsilon$).\n", "\n", "Gradient funkcji kosztu:\n", "\n", "$$ \\dfrac{\\partial J(\\theta)}{\\partial \\theta} = \\frac{1}{m}X^T(X \\theta - y)$$\n", "\n", "Modyfikacja parametrów modelu:\n", "\n", "$$ \\theta_{new} = \\theta - \\alpha * \\dfrac{\\partial J(\\theta)}{\\partial \\theta}$$\n", "\n", "Gdzie:\n", "\n", "$$ \\alpha - \\text{współczynnik uczenia} $$" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 46, "metadata": { "slideshow": { "slide_type": "notes" } }, "outputs": [], "source": [ "import ipywidgets as widgets\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "slideshow": { "slide_type": "notes" } }, "outputs": [], "source": [ "# Przydatne funkcje\n", "cost_functions = dict()\n", "\n", "def cost(theta, X, y):\n", " \"\"\"Wersja macierzowa funkcji kosztu\"\"\"\n", " m = len(y)\n", " J = 1.0 / (2.0 * m) * ((X * theta - y).T * (X * theta - y))\n", " return J.item()\n", "\n", "def gradient(theta, X, y):\n", " \"\"\"Wersja macierzowa gradientu funkcji kosztu\"\"\"\n", " return 1.0 / len(y) * (X.T * (X * theta - y)) \n", "\n", "def gradient_descent(fJ, fdJ, theta, X, y, alpha=0.1, eps=10**-7):\n", " \"\"\"Algorytm gradientu prostego\"\"\"\n", " current_cost = fJ(theta, X, y)\n", " logs = [[current_cost, theta]]\n", " while True:\n", " theta = theta - alpha * fdJ(theta, X, y)\n", " current_cost, prev_cost = fJ(theta, X, y), current_cost\n", " # print(current_cost)\n", " if abs(prev_cost - current_cost) > 10**15:\n", " print('Algorytm nie jest zbieżny!')\n", " break\n", " if abs(prev_cost - current_cost) <= eps:\n", " break\n", " logs.append([current_cost, theta]) \n", " return theta, logs\n", "\n", "def plot_data(X, y, xlabel, ylabel):\n", " \"\"\"Wykres danych\"\"\"\n", " fig = plt.figure(figsize=(16*.6, 9*.6))\n", " ax = fig.add_subplot(111)\n", " fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n", " ax.scatter([X[:, 1]], [y], c='r', s=50, label='Dane')\n", " \n", " ax.set_xlabel(xlabel)\n", " ax.set_ylabel(ylabel)\n", " ax.margins(.05, .05)\n", " plt.ylim(y.min() - 1, y.max() + 1)\n", " plt.xlim(np.min(X[:, 1]) - 1, np.max(X[:, 1]) + 1)\n", " return fig\n", "\n", "def plot_data_cost(X, y, xlabel, ylabel):\n", " \"\"\"Wykres funkcji kosztu\"\"\"\n", " fig = plt.figure(figsize=(16 * .6, 9 * .6))\n", " ax = fig.add_subplot(111)\n", " fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)\n", " ax.scatter([X], [y], c='r', s=50, label='Dane')\n", "\n", " ax.set_xlabel(xlabel)\n", " ax.set_ylabel(ylabel)\n", " ax.margins(.05, .05)\n", " plt.ylim(min(y) - 1, max(y) + 1)\n", " plt.xlim(np.min(X) - 1, np.max(X) + 1)\n", " return fig\n", "\n", "def plot_fun(fig, fun, X):\n", " \"\"\"Wykres funkcji `fun`\"\"\"\n", " ax = fig.axes[0]\n", " x0 = np.min(X[:, 1]) - 1.0\n", " x1 = np.max(X[:, 1]) + 1.0\n", " Arg = np.arange(x0, x1, 0.1)\n", " Val = fun(Arg)\n", " return ax.plot(Arg, Val, linewidth='2')" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "def MSE(Y_true, Y_pred):\n", " \"\"\"Błąd średniokwadratowy - Mean Squared Error\"\"\"\n", " return np.square(np.subtract(Y_true,Y_pred)).mean()" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Funkcja regresji wielomianowej\n", "\n", "def h_poly(Theta, x):\n", " \"\"\"Funkcja wielomianowa\"\"\"\n", " return sum(theta * np.power(x, i) for i, theta in enumerate(Theta.tolist()))\n", "\n", "def get_poly_data(data, deg):\n", " \"\"\"Przygotowanie danych do regresji wielomianowej\"\"\"\n", " m, n_plus_1 = data.shape\n", " n = n_plus_1 - 1\n", "\n", " X1 = data[:, 0:n]\n", " X1 /= np.amax(X1, axis=0)\n", "\n", " Xs = [np.ones((m, 1)), X1]\n", "\n", " for i in range(2, deg+1):\n", " Xn = np.power(X1, i)\n", " Xn /= np.amax(Xn, axis=0)\n", " Xs.append(Xn)\n", "\n", " X = np.matrix(np.concatenate(Xs, axis=1)).reshape(m, deg * n + 1)\n", "\n", " y = np.matrix(data[:, -1]).reshape(m, 1)\n", "\n", " return X, y\n", "\n", "\n", "def polynomial_regression(X, y, n):\n", " \"\"\"Funkcja regresji wielomianowej\"\"\"\n", " theta_start = np.matrix([0] * (n+1)).reshape(n+1, 1)\n", " theta, logs = gradient_descent(cost, gradient, theta_start, X, y)\n", " return lambda x: h_poly(theta, x), logs" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "def predict_values(model, data, n):\n", " \"\"\"Funkcja predykcji\"\"\"\n", " x, y = get_poly_data(np.array(data), n)\n", " preprocessed_x = []\n", " for i in x:\n", " preprocessed_x.append(i.item(1))\n", " return y, model(preprocessed_x), MSE(y, model(preprocessed_x))\n", "\n", "def plot_and_mse(data, data_test, n):\n", " \"\"\"Wykres wraz z MSE\"\"\"\n", " x, y = get_poly_data(np.array(data), n)\n", " model, logs = polynomial_regression(x, y, n)\n", " cost_function = [[element[0], i] for i, element in enumerate(logs)]\n", " cost_functions[n] = cost_function\n", " \n", " fig = plot_data(x, y, xlabel='x', ylabel='y')\n", " plot_fun(fig, model, x)\n", "\n", " y_true, Y_pred, mse = predict_values(model, data_test, n)\n", " print(f'Wielomian {n} stopnia, MSE = {mse}')" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "notes" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " sqrMetres price\n", "1102 73 550000.0\n", "1375 78 496573.0\n", "1644 40 310000.0\n", "1434 22 377652.0\n", "1444 30 345180.0\n", "... ... ...\n", "384 76 269984.0\n", "214 42 249000.0\n", "24 47 299000.0\n", "1004 57 293848.0\n", "1632 112 329000.0\n", "\n", "[1674 rows x 2 columns]" ], "text/html": "
\n | sqrMetres | \nprice | \n
---|---|---|
1102 | \n73 | \n550000.0 | \n
1375 | \n78 | \n496573.0 | \n
1644 | \n40 | \n310000.0 | \n
1434 | \n22 | \n377652.0 | \n
1444 | \n30 | \n345180.0 | \n
... | \n... | \n... | \n
384 | \n76 | \n269984.0 | \n
214 | \n42 | \n249000.0 | \n
24 | \n47 | \n299000.0 | \n
1004 | \n57 | \n293848.0 | \n
1632 | \n112 | \n329000.0 | \n
1674 rows × 2 columns
\n