{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Data Analysis in Python: sklearn\n", "\n", "### Tomasz Dwojak\n", "\n", "### June 3, 2018" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ " * Part one: pandas\n", " * Part two: sklearn" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Recap from UMZ\n", " * data preparation and cleaning\n", " * model selection and training\n", " * tuning\n", " * evaluation" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "import sklearn\n", "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "data = pd.read_csv(\"./gapminder.csv\", index_col=0)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
|  | female_BMI | male_BMI | gdp | population | under5mortality | life_expectancy | fertility |\n", "
| --- | --- | --- | --- | --- | --- | --- | --- |\n", "
| Afghanistan | 21.07402 | 20.62058 | 1311.0 | 26528741.0 | 110.4 | 52.8 | 6.20 |\n", "
| Albania | 25.65726 | 26.44657 | 8644.0 | 2968026.0 | 17.9 | 76.8 | 1.76 |\n", "
| Algeria | 26.36841 | 24.59620 | 12314.0 | 34811059.0 | 29.5 | 75.5 | 2.73 |\n", "
| Angola | 23.48431 | 22.25083 | 7103.0 | 19842251.0 | 192.0 | 56.7 | 6.43 |\n", "
| Antigua and Barbuda | 27.50545 | 25.76602 | 25736.0 | 85350.0 | 10.9 | 75.5 | 2.16 |\n", "