forked from AITech/aitech-ium
{
|
|||
|
"cells": [
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"# Plan for today\n",
"1. Motivation\n",
"2. Splitting the data\n",
"3. Where to get data?\n",
"4. Data preparation\n",
"5. Assignment"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"# Motivation\n",
"- The \"garbage in, garbage out\" principle\n",
"- The better the data quality, the better the model\n",
"- The best architecture, the most powerful compute, and the most sophisticated methods will not help if the data used to develop the model does not match the data it will see in use, or if the data contains no dependencies to learn\n",
"- If the data is poorly chosen, we can waste a lot of time, energy, and resources optimizing the model in the wrong direction"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"# Data sources\n",
"- Ready-made datasets:\n",
"  - Open challenges (shared tasks)\n",
"  - Repositories of open datasets\n",
"  - Data released by companies\n",
"  - Repositories of commercial datasets\n",
"  - Internal data (e.g. a company's own)"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"# Data sources\n",
"- Creating data:\n",
"  - Synthetic generation\n",
"  - Crowdsourcing\n",
"  - Data scraping\n",
"  - Extraction\n"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"## Open challenges (shared tasks)\n",
"- Kaggle: https://www.kaggle.com/datasets\n",
"- Gonito: https://gonito.net/list-challenges - a Polish Kaggle-style platform (from Poznań, from UAM)\n",
"- SemEval: https://semeval.github.io/ - semantics tasks\n",
"- PolEval: http://poleval.pl/ - Polish language processing\n",
"- WMT: http://www.statmt.org/wmt20/ (machine translation)\n",
"- IWSLT: https://iwslt.org/2021/#shared-tasks (speech translation)"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"## Repositories and search engines for open datasets\n",
"- Papers with Code: https://paperswithcode.com/datasets\n",
"- UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/\n",
"- Google Dataset Search: https://datasetsearch.research.google.com/\n",
"- Google datasets: https://research.google/tools/datasets/\n",
"- Registry of Open Data on AWS: https://registry.opendata.aws/\n",
" "
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"## Open datasets\n",
"- Speech recognition:\n",
"  - OpenSLR: https://www.openslr.org/ - LibriSpeech, TED-LIUM\n",
"  - Mozilla Common Voice: https://commonvoice.mozilla.org/\n",
"- NLP:\n",
"  - LINDAT/CLARIN: https://lindat.cz/repository/xmlui/\n",
"  - CLARIN-PL: https://clarin-pl.eu/index.php/zasoby/\n",
" "
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
|
"## Crowdsourcing\n",
|
|||
|
"- Amazon Mechanical Turk: https://www.mturk.com/\n",
|
|||
|
"- Yandex Toloka\n",
|
|||
|
"- reCAPTCHA\n",
|
|||
|
"<img src=\"https://upload.wikimedia.org/wikipedia/commons/8/8b/Tuerkischer_schachspieler_windisch4.jpg\">\n"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"## Licenses\n",
"- Before deciding to use a dataset, always check its license!\n",
"- Many datasets available online are released under open licenses\n",
"- Usually, however, using them requires meeting certain conditions, e.g. attributing the source\n",
"- Many publicly available datasets nevertheless cannot be used free of charge for commercial purposes!\n",
"- Some licenses may even require that a derivative work created with them be released under the same license (e.g. GPL). This is a \"danger\" when a commercial company uses such resources!\n",
"- How CC licenses work: https://creativecommons.pl/\n",
"- Most popular licenses:\n",
"  - Also friendly to commercial use: MIT, BSD, Apache, CC (without the NC clause)\n",
"  - GPL (GNU General Public License) - a \"viral\" (copyleft) open-source license"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"### Example\n",
"- Using standard bash tools, we will do a preliminary inspection and split of the data\n",
"- As an example we will use the classic Iris dataset: https://archive.ics.uci.edu/ml/datasets/Iris\n",
"- The dataset contains sepal and petal length and width measurements for three iris species:\n",
"  - Iris Setosa\n",
"  - Iris Versicolour\n",
"  - Iris Virginica\n",
" \n",
"<img src=IUM_02/iris.png>\n",
"https://www.kaggle.com/vinayshaw/iris-species-100-accuracy-using-naive-bayes"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"## Inspection\n",
"- Before we start training a model on the data, we should get to know its characteristics\n",
"- This lets us:\n",
"  - remove or fix invalid examples\n",
"  - select the features we will use in our model\n",
"  - choose an appropriate learning algorithm\n",
"  - decide how to split the dataset and whether to normalize it\n"
|
|||
|
]
|
|||
|
},
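The first inspection goal listed above, finding invalid examples, can be sketched with a few Pandas calls. This is only a minimal illustration: the toy frame below is hypothetical (column names mirror Iris.csv, the values are made up), and the cleaning rules are assumptions rather than the course's prescribed procedure.

```python
import pandas as pd

# Hypothetical toy frame standing in for a freshly loaded dataset;
# the column names mirror Iris.csv, the values are invented for illustration.
df = pd.DataFrame(
    {
        "PetalLengthCm": [1.4, 1.4, None, 5.1, -1.0],
        "PetalWidthCm": [0.2, 0.2, 1.3, 1.8, 0.3],
        "Species": [
            "Iris-setosa",
            "Iris-setosa",
            "Iris-versicolor",
            "Iris-virginica",
            "Iris-setosa",
        ],
    }
)

# Count missing values in each column.
missing_per_column = df.isna().sum()
# Count rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())
# Count physically impossible (negative) lengths; NaN compares as False here.
negative_lengths = int((df["PetalLengthCm"] < 0).sum())

# Assumed cleaning policy: drop exact duplicates, rows with missing values,
# and rows with negative lengths.
cleaned = df.drop_duplicates().dropna()
cleaned = cleaned[cleaned["PetalLengthCm"] >= 0]

print(missing_per_column)
print(duplicate_rows, negative_lengths, len(cleaned))
```

Whether to drop or repair such rows depends on the dataset; dropping is simply the shortest policy to show.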
|
|||
|
{
|
|||
|
"cell_type": "markdown",
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"source": [
|
|||
"## Inspection\n",
"- To inspect the data we will use Pandas, a popular Python library: https://pandas.pydata.org/\n",
"- Pandas is designed for analyzing and manipulating tabular data as well as time series\n",
"- For visualization we will use the Seaborn library: https://seaborn.pydata.org/index.html"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 12,
|
|||
|
"metadata": {
|
|||
|
"scrolled": true,
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"name": "stdout",
|
|||
|
"output_type": "stream",
|
|||
|
"text": [
|
|||
|
"Requirement already satisfied: kaggle in /home/tomek/.local/lib/python3.8/site-packages (1.5.12)\n",
|
|||
|
"Requirement already satisfied: python-dateutil in /home/tomek/anaconda3/lib/python3.8/site-packages (from kaggle) (2.8.1)\n",
|
|||
|
"Requirement already satisfied: six>=1.10 in /home/tomek/anaconda3/lib/python3.8/site-packages (from kaggle) (1.15.0)\n",
|
|||
|
"Requirement already satisfied: urllib3 in /home/tomek/anaconda3/lib/python3.8/site-packages (from kaggle) (1.25.11)\n",
|
|||
|
"Requirement already satisfied: python-slugify in /home/tomek/.local/lib/python3.8/site-packages (from kaggle) (4.0.1)\n",
|
|||
|
"Requirement already satisfied: certifi in /home/tomek/anaconda3/lib/python3.8/site-packages (from kaggle) (2020.6.20)\n",
|
|||
|
"Requirement already satisfied: tqdm in /home/tomek/anaconda3/lib/python3.8/site-packages (from kaggle) (4.50.2)\n",
|
|||
|
"Requirement already satisfied: requests in /home/tomek/anaconda3/lib/python3.8/site-packages (from kaggle) (2.24.0)\n",
|
|||
|
"Requirement already satisfied: text-unidecode>=1.3 in /home/tomek/.local/lib/python3.8/site-packages (from python-slugify->kaggle) (1.3)\n",
|
|||
|
"Requirement already satisfied: chardet<4,>=3.0.2 in /home/tomek/anaconda3/lib/python3.8/site-packages (from requests->kaggle) (3.0.4)\n",
|
|||
|
"Requirement already satisfied: idna<3,>=2.5 in /home/tomek/anaconda3/lib/python3.8/site-packages (from requests->kaggle) (2.10)\n",
|
|||
|
"Requirement already satisfied: pandas in /home/tomek/anaconda3/lib/python3.8/site-packages (1.1.3)\n",
|
|||
|
"Requirement already satisfied: python-dateutil>=2.7.3 in /home/tomek/anaconda3/lib/python3.8/site-packages (from pandas) (2.8.1)\n",
|
|||
|
"Requirement already satisfied: numpy>=1.15.4 in /home/tomek/anaconda3/lib/python3.8/site-packages (from pandas) (1.19.2)\n",
|
|||
|
"Requirement already satisfied: pytz>=2017.2 in /home/tomek/anaconda3/lib/python3.8/site-packages (from pandas) (2020.1)\n",
|
|||
|
"Requirement already satisfied: six>=1.5 in /home/tomek/anaconda3/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)\n"
|
|||
|
]
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
"# Install the required libraries\n",
"!pip install --user kaggle  # Kaggle API, used to download the dataset\n",
"!pip install --user pandas"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 13,
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"name": "stdout",
|
|||
|
"output_type": "stream",
|
|||
|
"text": [
|
|||
|
"Warning: Your Kaggle API key is readable by other users on this system! To fix this, you can run 'chmod 600 /home/tomek/.kaggle/kaggle.json'\n",
|
|||
|
"iris.zip: Skipping, found more recently modified local copy (use --force to force download)\n"
|
|||
|
]
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
"# For the command below to work, you need a ~/.kaggle/kaggle.json file containing your Kaggle API token.\n",
"# Instructions: https://www.kaggle.com/docs/api\n",
"!kaggle datasets download -d uciml/iris"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 14,
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"name": "stdout",
|
|||
|
"output_type": "stream",
|
|||
|
"text": [
|
|||
|
"Archive: iris.zip\r\n",
|
|||
|
" inflating: Iris.csv \r\n",
|
|||
|
" inflating: database.sqlite \r\n"
|
|||
|
]
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
|
"!unzip -o iris.zip"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 15,
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"name": "stdout",
|
|||
|
"output_type": "stream",
|
|||
|
"text": [
|
|||
|
"Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species\r\n",
|
|||
|
"1,5.1,3.5,1.4,0.2,Iris-setosa\r\n",
|
|||
|
"2,4.9,3.0,1.4,0.2,Iris-setosa\r\n",
|
|||
|
"3,4.7,3.2,1.3,0.2,Iris-setosa\r\n",
|
|||
|
"4,4.6,3.1,1.5,0.2,Iris-setosa\r\n"
|
|||
|
]
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
|
"!head -n 5 Iris.csv"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 18,
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/html": [
|
|||
|
"<div>\n",
|
|||
|
"<style scoped>\n",
|
|||
|
" .dataframe tbody tr th:only-of-type {\n",
|
|||
|
" vertical-align: middle;\n",
|
|||
|
" }\n",
|
|||
|
"\n",
|
|||
|
" .dataframe tbody tr th {\n",
|
|||
|
" vertical-align: top;\n",
|
|||
|
" }\n",
|
|||
|
"\n",
|
|||
|
" .dataframe thead th {\n",
|
|||
|
" text-align: right;\n",
|
|||
|
" }\n",
|
|||
|
"</style>\n",
|
|||
|
"<table border=\"1\" class=\"dataframe\">\n",
|
|||
|
" <thead>\n",
|
|||
|
" <tr style=\"text-align: right;\">\n",
|
|||
|
" <th></th>\n",
|
|||
|
" <th>Id</th>\n",
|
|||
|
" <th>SepalLengthCm</th>\n",
|
|||
|
" <th>SepalWidthCm</th>\n",
|
|||
|
" <th>PetalLengthCm</th>\n",
|
|||
|
" <th>PetalWidthCm</th>\n",
|
|||
|
" <th>Species</th>\n",
|
|||
|
" </tr>\n",
|
|||
|
" </thead>\n",
|
|||
|
" <tbody>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>0</th>\n",
|
|||
|
" <td>1</td>\n",
|
|||
|
" <td>5.1</td>\n",
|
|||
|
" <td>3.5</td>\n",
|
|||
|
" <td>1.4</td>\n",
|
|||
|
" <td>0.2</td>\n",
|
|||
|
" <td>Iris-setosa</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>1</th>\n",
|
|||
|
" <td>2</td>\n",
|
|||
|
" <td>4.9</td>\n",
|
|||
|
" <td>3.0</td>\n",
|
|||
|
" <td>1.4</td>\n",
|
|||
|
" <td>0.2</td>\n",
|
|||
|
" <td>Iris-setosa</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>2</th>\n",
|
|||
|
" <td>3</td>\n",
|
|||
|
" <td>4.7</td>\n",
|
|||
|
" <td>3.2</td>\n",
|
|||
|
" <td>1.3</td>\n",
|
|||
|
" <td>0.2</td>\n",
|
|||
|
" <td>Iris-setosa</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>3</th>\n",
|
|||
|
" <td>4</td>\n",
|
|||
|
" <td>4.6</td>\n",
|
|||
|
" <td>3.1</td>\n",
|
|||
|
" <td>1.5</td>\n",
|
|||
|
" <td>0.2</td>\n",
|
|||
|
" <td>Iris-setosa</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>4</th>\n",
|
|||
|
" <td>5</td>\n",
|
|||
|
" <td>5.0</td>\n",
|
|||
|
" <td>3.6</td>\n",
|
|||
|
" <td>1.4</td>\n",
|
|||
|
" <td>0.2</td>\n",
|
|||
|
" <td>Iris-setosa</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>...</th>\n",
|
|||
|
" <td>...</td>\n",
|
|||
|
" <td>...</td>\n",
|
|||
|
" <td>...</td>\n",
|
|||
|
" <td>...</td>\n",
|
|||
|
" <td>...</td>\n",
|
|||
|
" <td>...</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>145</th>\n",
|
|||
|
" <td>146</td>\n",
|
|||
|
" <td>6.7</td>\n",
|
|||
|
" <td>3.0</td>\n",
|
|||
|
" <td>5.2</td>\n",
|
|||
|
" <td>2.3</td>\n",
|
|||
|
" <td>Iris-virginica</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>146</th>\n",
|
|||
|
" <td>147</td>\n",
|
|||
|
" <td>6.3</td>\n",
|
|||
|
" <td>2.5</td>\n",
|
|||
|
" <td>5.0</td>\n",
|
|||
|
" <td>1.9</td>\n",
|
|||
|
" <td>Iris-virginica</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>147</th>\n",
|
|||
|
" <td>148</td>\n",
|
|||
|
" <td>6.5</td>\n",
|
|||
|
" <td>3.0</td>\n",
|
|||
|
" <td>5.2</td>\n",
|
|||
|
" <td>2.0</td>\n",
|
|||
|
" <td>Iris-virginica</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>148</th>\n",
|
|||
|
" <td>149</td>\n",
|
|||
|
" <td>6.2</td>\n",
|
|||
|
" <td>3.4</td>\n",
|
|||
|
" <td>5.4</td>\n",
|
|||
|
" <td>2.3</td>\n",
|
|||
|
" <td>Iris-virginica</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>149</th>\n",
|
|||
|
" <td>150</td>\n",
|
|||
|
" <td>5.9</td>\n",
|
|||
|
" <td>3.0</td>\n",
|
|||
|
" <td>5.1</td>\n",
|
|||
|
" <td>1.8</td>\n",
|
|||
|
" <td>Iris-virginica</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" </tbody>\n",
|
|||
|
"</table>\n",
|
|||
|
"<p>150 rows × 6 columns</p>\n",
|
|||
|
"</div>"
|
|||
|
],
|
|||
|
"text/plain": [
|
|||
|
" Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm \\\n",
|
|||
|
"0 1 5.1 3.5 1.4 0.2 \n",
|
|||
|
"1 2 4.9 3.0 1.4 0.2 \n",
|
|||
|
"2 3 4.7 3.2 1.3 0.2 \n",
|
|||
|
"3 4 4.6 3.1 1.5 0.2 \n",
|
|||
|
"4 5 5.0 3.6 1.4 0.2 \n",
|
|||
|
".. ... ... ... ... ... \n",
|
|||
|
"145 146 6.7 3.0 5.2 2.3 \n",
|
|||
|
"146 147 6.3 2.5 5.0 1.9 \n",
|
|||
|
"147 148 6.5 3.0 5.2 2.0 \n",
|
|||
|
"148 149 6.2 3.4 5.4 2.3 \n",
|
|||
|
"149 150 5.9 3.0 5.1 1.8 \n",
|
|||
|
"\n",
|
|||
|
" Species \n",
|
|||
|
"0 Iris-setosa \n",
|
|||
|
"1 Iris-setosa \n",
|
|||
|
"2 Iris-setosa \n",
|
|||
|
"3 Iris-setosa \n",
|
|||
|
"4 Iris-setosa \n",
|
|||
|
".. ... \n",
|
|||
|
"145 Iris-virginica \n",
|
|||
|
"146 Iris-virginica \n",
|
|||
|
"147 Iris-virginica \n",
|
|||
|
"148 Iris-virginica \n",
|
|||
|
"149 Iris-virginica \n",
|
|||
|
"\n",
|
|||
|
"[150 rows x 6 columns]"
|
|||
|
]
|
|||
|
},
|
|||
|
"execution_count": 18,
|
|||
|
"metadata": {},
|
|||
|
"output_type": "execute_result"
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
"import pandas as pd\n",
"iris = pd.read_csv('Iris.csv')\n",
"iris"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 19,
|
|||
|
"metadata": {
|
|||
|
"scrolled": true,
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/html": [
|
|||
|
"<div>\n",
|
|||
|
"<style scoped>\n",
|
|||
|
" .dataframe tbody tr th:only-of-type {\n",
|
|||
|
" vertical-align: middle;\n",
|
|||
|
" }\n",
|
|||
|
"\n",
|
|||
|
" .dataframe tbody tr th {\n",
|
|||
|
" vertical-align: top;\n",
|
|||
|
" }\n",
|
|||
|
"\n",
|
|||
|
" .dataframe thead th {\n",
|
|||
|
" text-align: right;\n",
|
|||
|
" }\n",
|
|||
|
"</style>\n",
|
|||
|
"<table border=\"1\" class=\"dataframe\">\n",
|
|||
|
" <thead>\n",
|
|||
|
" <tr style=\"text-align: right;\">\n",
|
|||
|
" <th></th>\n",
|
|||
|
" <th>Id</th>\n",
|
|||
|
" <th>SepalLengthCm</th>\n",
|
|||
|
" <th>SepalWidthCm</th>\n",
|
|||
|
" <th>PetalLengthCm</th>\n",
|
|||
|
" <th>PetalWidthCm</th>\n",
|
|||
|
" <th>Species</th>\n",
|
|||
|
" </tr>\n",
|
|||
|
" </thead>\n",
|
|||
|
" <tbody>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>count</th>\n",
|
|||
|
" <td>150.000000</td>\n",
|
|||
|
" <td>150.000000</td>\n",
|
|||
|
" <td>150.000000</td>\n",
|
|||
|
" <td>150.000000</td>\n",
|
|||
|
" <td>150.000000</td>\n",
|
|||
|
" <td>150</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>unique</th>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>3</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>top</th>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>Iris-virginica</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>freq</th>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" <td>50</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>mean</th>\n",
|
|||
|
" <td>75.500000</td>\n",
|
|||
|
" <td>5.843333</td>\n",
|
|||
|
" <td>3.054000</td>\n",
|
|||
|
" <td>3.758667</td>\n",
|
|||
|
" <td>1.198667</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>std</th>\n",
|
|||
|
" <td>43.445368</td>\n",
|
|||
|
" <td>0.828066</td>\n",
|
|||
|
" <td>0.433594</td>\n",
|
|||
|
" <td>1.764420</td>\n",
|
|||
|
" <td>0.763161</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>min</th>\n",
|
|||
|
" <td>1.000000</td>\n",
|
|||
|
" <td>4.300000</td>\n",
|
|||
|
" <td>2.000000</td>\n",
|
|||
|
" <td>1.000000</td>\n",
|
|||
|
" <td>0.100000</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>25%</th>\n",
|
|||
|
" <td>38.250000</td>\n",
|
|||
|
" <td>5.100000</td>\n",
|
|||
|
" <td>2.800000</td>\n",
|
|||
|
" <td>1.600000</td>\n",
|
|||
|
" <td>0.300000</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>50%</th>\n",
|
|||
|
" <td>75.500000</td>\n",
|
|||
|
" <td>5.800000</td>\n",
|
|||
|
" <td>3.000000</td>\n",
|
|||
|
" <td>4.350000</td>\n",
|
|||
|
" <td>1.300000</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>75%</th>\n",
|
|||
|
" <td>112.750000</td>\n",
|
|||
|
" <td>6.400000</td>\n",
|
|||
|
" <td>3.300000</td>\n",
|
|||
|
" <td>5.100000</td>\n",
|
|||
|
" <td>1.800000</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>max</th>\n",
|
|||
|
" <td>150.000000</td>\n",
|
|||
|
" <td>7.900000</td>\n",
|
|||
|
" <td>4.400000</td>\n",
|
|||
|
" <td>6.900000</td>\n",
|
|||
|
" <td>2.500000</td>\n",
|
|||
|
" <td>NaN</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" </tbody>\n",
|
|||
|
"</table>\n",
|
|||
|
"</div>"
|
|||
|
],
|
|||
|
"text/plain": [
|
|||
|
" Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm \\\n",
|
|||
|
"count 150.000000 150.000000 150.000000 150.000000 150.000000 \n",
|
|||
|
"unique NaN NaN NaN NaN NaN \n",
|
|||
|
"top NaN NaN NaN NaN NaN \n",
|
|||
|
"freq NaN NaN NaN NaN NaN \n",
|
|||
|
"mean 75.500000 5.843333 3.054000 3.758667 1.198667 \n",
|
|||
|
"std 43.445368 0.828066 0.433594 1.764420 0.763161 \n",
|
|||
|
"min 1.000000 4.300000 2.000000 1.000000 0.100000 \n",
|
|||
|
"25% 38.250000 5.100000 2.800000 1.600000 0.300000 \n",
|
|||
|
"50% 75.500000 5.800000 3.000000 4.350000 1.300000 \n",
|
|||
|
"75% 112.750000 6.400000 3.300000 5.100000 1.800000 \n",
|
|||
|
"max 150.000000 7.900000 4.400000 6.900000 2.500000 \n",
|
|||
|
"\n",
|
|||
|
" Species \n",
|
|||
|
"count 150 \n",
|
|||
|
"unique 3 \n",
|
|||
|
"top Iris-virginica \n",
|
|||
|
"freq 50 \n",
|
|||
|
"mean NaN \n",
|
|||
|
"std NaN \n",
|
|||
|
"min NaN \n",
|
|||
|
"25% NaN \n",
|
|||
|
"50% NaN \n",
|
|||
|
"75% NaN \n",
|
|||
|
"max NaN "
|
|||
|
]
|
|||
|
},
|
|||
|
"execution_count": 19,
|
|||
|
"metadata": {},
|
|||
|
"output_type": "execute_result"
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
|
"iris.describe(include='all')"
|
|||
|
]
|
|||
|
},
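The summary statistics above show that the measurement columns span different ranges, which is what the earlier remark about possible normalization refers to. A minimal min-max scaling sketch follows; the three rows are a hypothetical stand-in built from the min, median, and max values reported by `describe()`, not the full dataset.

```python
import pandas as pd

# Hypothetical three-row frame using the min, median, and max values
# that describe() reports for two of the Iris columns.
df = pd.DataFrame(
    {
        "SepalLengthCm": [4.3, 5.8, 7.9],
        "PetalWidthCm": [0.1, 1.3, 2.5],
    }
)

# Min-max scaling: map each column linearly onto the [0, 1] interval.
col_min = df.min()
col_max = df.max()
normalized = (df - col_min) / (col_max - col_min)

print(normalized)
```

In a real pipeline the minimum and maximum would usually be computed on the training split only and then reused for the other splits, so that no information leaks from the evaluation data.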
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 20,
|
|||
|
"metadata": {
|
|||
|
"scrolled": true,
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/plain": [
|
|||
|
"Iris-virginica 50\n",
|
|||
|
"Iris-setosa 50\n",
|
|||
|
"Iris-versicolor 50\n",
|
|||
|
"Name: Species, dtype: int64"
|
|||
|
]
|
|||
|
},
|
|||
|
"execution_count": 20,
|
|||
|
"metadata": {},
|
|||
|
"output_type": "execute_result"
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
|
"iris[\"Species\"].value_counts()"
|
|||
|
]
|
|||
|
},
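Because the three species are perfectly balanced (50 rows each), a split that keeps the same proportions in every subset is a natural choice. The sketch below shows one way to do a stratified split with Pandas alone. It assumes pandas 1.1 or newer (for `groupby(...).sample`), and the toy frame, the 20% test fraction, and the seed are illustrative choices, not values taken from the lecture.

```python
import pandas as pd

# Hypothetical balanced toy frame: 10 rows per species.
species = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
df = pd.DataFrame(
    {
        "Id": range(1, 31),
        "Species": [name for name in species for _ in range(10)],
    }
)

# Draw 20% of the rows from each species for the test set (stratified sampling).
test = df.groupby("Species").sample(frac=0.2, random_state=42)
# Everything that was not drawn becomes the training set.
train = df.drop(test.index)

print(train["Species"].value_counts())
print(test["Species"].value_counts())
```

Sampling within each group keeps the class distribution identical in both subsets, which matters more as datasets get smaller or less balanced.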
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 21,
|
|||
|
"metadata": {
|
|||
|
"scrolled": true,
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/plain": [
|
|||
|
"<AxesSubplot:>"
|
|||
|
]
|
|||
|
},
|
|||
|
"execution_count": 21,
|
|||
|
"metadata": {},
|
|||
|
"output_type": "execute_result"
|
|||
|
},
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/plain": [
|
|||
|
"<Figure size 432x288 with 1 Axes>"
|
|||
|
]
|
|||
|
},
|
|||
|
"metadata": {
|
|||
|
"needs_background": "light"
|
|||
|
},
|
|||
|
"output_type": "display_data"
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
|
"iris[\"Species\"].value_counts().plot(kind=\"bar\")"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 22,
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/html": [
|
|||
|
"<div>\n",
|
|||
|
"<style scoped>\n",
|
|||
|
" .dataframe tbody tr th:only-of-type {\n",
|
|||
|
" vertical-align: middle;\n",
|
|||
|
" }\n",
|
|||
|
"\n",
|
|||
|
" .dataframe tbody tr th {\n",
|
|||
|
" vertical-align: top;\n",
|
|||
|
" }\n",
|
|||
|
"\n",
|
|||
|
" .dataframe thead th {\n",
|
|||
|
" text-align: right;\n",
|
|||
|
" }\n",
|
|||
|
"</style>\n",
|
|||
|
"<table border=\"1\" class=\"dataframe\">\n",
|
|||
|
" <thead>\n",
|
|||
|
" <tr style=\"text-align: right;\">\n",
|
|||
|
" <th></th>\n",
|
|||
|
" <th>PetalLengthCm</th>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>Species</th>\n",
|
|||
|
" <th></th>\n",
|
|||
|
" </tr>\n",
|
|||
|
" </thead>\n",
|
|||
|
" <tbody>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>Iris-setosa</th>\n",
|
|||
|
" <td>1.464</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>Iris-versicolor</th>\n",
|
|||
|
" <td>4.260</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" <tr>\n",
|
|||
|
" <th>Iris-virginica</th>\n",
|
|||
|
" <td>5.552</td>\n",
|
|||
|
" </tr>\n",
|
|||
|
" </tbody>\n",
|
|||
|
"</table>\n",
|
|||
|
"</div>"
|
|||
|
],
|
|||
|
"text/plain": [
|
|||
|
" PetalLengthCm\n",
|
|||
|
"Species \n",
|
|||
|
"Iris-setosa 1.464\n",
|
|||
|
"Iris-versicolor 4.260\n",
|
|||
|
"Iris-virginica 5.552"
|
|||
|
]
|
|||
|
},
|
|||
|
"execution_count": 22,
|
|||
|
"metadata": {},
|
|||
|
"output_type": "execute_result"
|
|||
|
}
|
|||
|
],
|
|||
|
"source": [
|
|||
|
"iris[[\"Species\",\"PetalLengthCm\"]].groupby(\"Species\").mean()"
|
|||
|
]
|
|||
|
},
|
|||
|
{
|
|||
|
"cell_type": "code",
|
|||
|
"execution_count": 23,
|
|||
|
"metadata": {
|
|||
|
"slideshow": {
|
|||
|
"slide_type": "slide"
|
|||
|
}
|
|||
|
},
|
|||
|
"outputs": [
|
|||
|
{
|
|||
|
"data": {
|
|||
|
"text/plain": [
|
|||
|
"<AxesSubplot:xlabel='Species'>"
|
|||
|
]
|
|||
|
},
|
|||
|
"execution_count": 23,
|
|||
|
"metadata": {},
|
|||
|
"output_type": "execute_result"
|
|||
|
},
|
|||
|
{
|
|||
|
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"iris[[\"Species\",\"PetalLengthCm\"]].groupby(\"Species\").mean().plot(kind=\"bar\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x7f97eed545b0>"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 474.35x360 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 474.35x360 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"sns.set_theme()\n",
"sns.relplot(data=iris, x=\"PetalLengthCm\", y=\"PetalWidthCm\", hue=\"Species\")\n",
"sns.relplot(data=iris, x=\"SepalLengthCm\", y=\"SepalWidthCm\", hue=\"Species\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x7f97ef942eb0>"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 474.35x360 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"irisv = iris[iris[\"Species\"] != \"Iris-setosa\"]\n",
"sns.relplot(data=irisv, x=\"SepalLengthCm\", y=\"SepalWidthCm\", hue=\"Species\")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"scrolled": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.PairGrid at 0x7f97f2ad3550>"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 834.35x720 with 20 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.pairplot(data=iris.drop(columns=[\"Id\"]), hue=\"Species\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Data split\n",
" - ### Training set\n",
"   - Used to fit the model's parameters (e.g. the weights of a neural network).\n",
"   - During training, the algorithm minimizes a cost function computed on the training set.\n",
" - ### Validation set (also known as the \"dev set\")\n",
"   - Used to compare models built with different hyperparameters (e.g. network architecture, number of training iterations).\n",
"   - Helps avoid overfitting the model to the training set, e.g. through early stopping.\n",
" - ### Test set\n",
"   - Used to evaluate the final model selected and trained with the help of the training and validation sets."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Data split\n",
"- The training, validation and test sets should be independent, but drawn from the same distribution.\n",
"- For classification, the class distribution should be similar across the sets.\n",
"- It is very important that the validation and test sets reflect our business goals and the real data the model will operate on.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Splitting methods:\n",
"- Use a ready-made data split :)\n",
"- If we split the dataset ourselves:\n",
"  - The \"classic\" approach: a Train:Dev:Test ratio of 6:2:2 or 8:1:1\n",
"  - Deep learning:\n",
"    - \"deep\" methods are very data-hungry, with datasets on the order of > 1,000,000 examples\n",
"    - Suppose the whole dataset has 1,000,000 examples\n",
"    - the sizes of the dev and test sets are fixed as absolute numbers, e.g. 1,000 or 10,000 examples\n",
"    - 10,000 examples is (sufficiently) many, even though it is only 1% of the whole dataset\n",
"    - it would be a pity to \"waste\" an additional 180,000 examples on the test and validation sets; a larger training set is better\n"
]
},
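{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the 8:1:1 Train:Dev:Test split described above, assuming pandas and scikit-learn are installed. The toy DataFrame `df` and its columns are hypothetical stand-ins for a real dataset such as `iris`; the `stratify` argument is one way to keep the class proportions similar in all three sets."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Hypothetical toy data: 100 examples, two balanced classes\n",
"df = pd.DataFrame({\"x\": range(100), \"Species\": [\"a\"] * 50 + [\"b\"] * 50})\n",
"\n",
"# Hold out 20% of the data, preserving class proportions\n",
"train, rest = train_test_split(df, test_size=0.2, stratify=df[\"Species\"], random_state=42)\n",
"# Split the held-out part in half: 10% dev, 10% test\n",
"dev, test = train_test_split(rest, test_size=0.5, stratify=rest[\"Species\"], random_state=42)\n",
"\n",
"print(len(train), len(dev), len(test))"
]
},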
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example split using standard Bash tools"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2021-03-15 11:16:36-- https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data\n",
"Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252\n",
"Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.\n",
"HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable\n",
"\n",
" The file is already fully retrieved; nothing to do.\n",
"\n"
]
}
],
"source": [
"# Download the dataset file from the repository\n",
"!cd IUM_02; wget -c https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"151 IUM_02/iris.data\r\n"
]
}
],
"source": [
"# Check the size of the dataset\n",
"!wc -l IUM_02/iris.data"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5.1,3.5,1.4,0.2,Iris-setosa\r\n",
"4.9,3.0,1.4,0.2,Iris-setosa\r\n",
"4.7,3.2,1.3,0.2,Iris-setosa\r\n",
"4.6,3.1,1.5,0.2,Iris-setosa\r\n",
"5.0,3.6,1.4,0.2,Iris-setosa\r\n"
]
}
],
"source": [
"# Check the structure\n",
"!head -n 5 IUM_02/iris.data"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 1 \r\n",
" 50 Iris-setosa\r\n",
" 50 Iris-versicolor\r\n",
" 50 Iris-virginica\r\n"
]
}
],
"source": [
"# Check which classes there are and how many examples each has:\n",
"!cut -f 5 -d \",\" IUM_02/iris.data | sort | uniq -c"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"151:\r\n"
]
}
],
"source": [
"# Find the empty line:\n",
"! grep -P \"^$\" -n IUM_02/iris.data"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text& |