{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"![Logo 1](https://git.wmi.amu.edu.pl/AITech/Szablon/raw/branch/master/Logotyp_AITech1.jpg)\n",
"<div class=\"alert alert-block alert-info\">\n",
"<h1> Machine Learning Engineering </h1>\n",
"<h2> 2. <i>Data</i> [labs]</h2> \n",
"<h3> Tomasz Ziętkiewicz (2023)</h3>\n",
"</div>\n",
"\n",
"![Logo 2](https://git.wmi.amu.edu.pl/AITech/Szablon/raw/branch/master/Logotyp_AITech2.jpg)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Today's plan\n",
"1. Motivation\n",
"2. Data splitting\n",
"3. Where to get data?\n",
"4. Data preparation\n",
"5. Assignment"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Motivation\n",
"- The \"garbage in, garbage out\" principle\n",
"- The better the data quality, the better the model\n",
"- The best architecture, the most powerful compute resources, and the most sophisticated methods will not help if the data used to develop the model do not match the data it will be used on, or if the data contain no dependencies to learn\n",
"- If the data are poorly chosen, we can waste a lot of time, energy, and resources optimizing the model in the wrong direction"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Data sources\n",
"- Ready-made datasets:\n",
" - Open challenges\n",
" - Repositories of open datasets\n",
" - Data published by companies\n",
" - Repositories of commercial datasets\n",
" - Internal data (e.g. a company's own)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Data sources\n",
"- Creating data:\n",
" - Synthetic generation\n",
" - e.g. generating speech corpora with TTS (speech synthesis)\n",
" - Crowdsourcing\n",
" - Data scraping"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Open challenges (shared task / challenge)\n",
"- Kaggle: https://www.kaggle.com/datasets\n",
"- Gonito: https://gonito.net/list-challenges - a Polish counterpart of Kaggle (from Poznań, developed at UAM)\n",
"- SemEval: https://semeval.github.io/ - semantics tasks\n",
"- PolEval: http://poleval.pl/ - Polish language processing\n",
"- WMT: http://www.statmt.org/wmt20/ (machine translation)\n",
"- IWSLT: https://iwslt.org/2021/#shared-tasks (speech translation)\n",
"- CNLPS - Challenges for Natural Language Processing - https://fedcsis.org/sessions/aaia/cnlps"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Repositories and search engines for open datasets\n",
"- Hugging Face Datasets: https://huggingface.co/datasets\n",
"- Papers with Code: https://paperswithcode.com/datasets\n",
"- UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/ (University of California, Irvine)\n",
"- Google Dataset Search: https://datasetsearch.research.google.com/\n",
"- Google datasets: https://research.google/tools/datasets/\n",
"- Open datasets on Amazon AWS: https://registry.opendata.aws/\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Open datasets\n",
"- Speech recognition:\n",
" - https://www.openslr.org/ - LibriSpeech, TED-LIUM\n",
" - Mozilla Common Voice: https://commonvoice.mozilla.org/\n",
"- NLP:\n",
" - CLARIN-PL: https://clarin-pl.eu/index.php/zasoby/\n",
" - NKJP (National Corpus of Polish): http://nkjp.pl/\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Crowdsourcing\n",
"- reCAPTCHA\n",
"<img src=\"img/ReCAPTCHA_idea.jpg\">\n",
"<img src=\"img/cat_captcha.png\">\n",
"\n",
"<sub>Source: https://pl.wikipedia.org/wiki/ReCAPTCHA#/media/Plik:ReCAPTCHA_idea.jpg</sub>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"- Amazon Mechanical Turk: https://www.mturk.com/\n",
"<img src=\"img/Tuerkischer_schachspieler_windisch4.jpg\">\n",
"\n",
"<sub>Source: https://en.wikipedia.org/wiki/Mechanical_Turk#/media/File:Tuerkischer_schachspieler_windisch4.jpg</sub>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Licenses\n",
"- Before deciding to use a dataset, always check its license!\n",
"- Many datasets available on the internet are released under open licenses\n",
"- Usually, however, using them requires meeting certain conditions, e.g. attributing the source\n",
"- Many publicly available datasets cannot be used commercially free of charge!\n",
"- Some licenses (e.g. GPL) even require that a derivative work created with the dataset be released under the same license. This is a \"danger\" when a commercial company uses such resources!\n",
"- How CC licenses work: https://creativecommons.pl/\n",
"- Most popular licenses:\n",
" - Also friendly to commercial use: MIT, BSD, Apache, CC (without the NC clause)\n",
" - GPL (GNU General Public License) - a \"viral\" open-source license"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example \n",
"- Using standard bash tools, we will do a preliminary inspection and split of the data\n",
"- As an example we will use the classic Iris dataset: https://archive.ics.uci.edu/ml/datasets/Iris\n",
"- The dataset contains sepal and petal lengths and widths for three iris species:\n",
" - Iris Setosa\n",
" - Iris Versicolor\n",
" - Iris Virginica\n",
" \n",
"<img src=\"IUM_02/iris.png\"/>\n",
"\n",
"<sub>Source: https://www.kaggle.com/vinayshaw/iris-species-100-accuracy-using-naive-bayes<br>\n",
"License: [Apache 2.0](http://www.apache.org/licenses/LICENSE-2.0)</sub>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Downloading the data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting kaggle\n",
" Using cached kaggle-1.5.12.tar.gz (58 kB)\n",
"Requirement already satisfied: six>=1.10 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (1.15.0)\n",
"Requirement already satisfied: certifi in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (2021.5.30)\n",
"Requirement already satisfied: python-dateutil in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (2.8.1)\n",
"Requirement already satisfied: requests in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (2.25.1)\n",
"Requirement already satisfied: tqdm in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (4.59.0)\n",
"Requirement already satisfied: python-slugify in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (5.0.2)\n",
"Requirement already satisfied: urllib3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from kaggle) (1.26.4)\n",
"Requirement already satisfied: text-unidecode>=1.3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from python-slugify->kaggle) (1.3)\n",
"Requirement already satisfied: idna<3,>=2.5 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from requests->kaggle) (2.10)\n",
"Requirement already satisfied: chardet<5,>=3.0.2 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from requests->kaggle) (4.0.0)\n",
"Building wheels for collected packages: kaggle\n",
" Building wheel for kaggle (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for kaggle: filename=kaggle-1.5.12-py3-none-any.whl size=73053 sha256=1e6240d540651324d97a9772ad1ced30da7d7b5dc5956dc974eeeddf7c48844b\n",
" Stored in directory: /home/tomek/.cache/pip/wheels/ac/b2/c3/fa4706d469b5879105991d1c8be9a3c2ef329ba9fe2ce5085e\n",
"Successfully built kaggle\n",
"Installing collected packages: kaggle\n",
"Successfully installed kaggle-1.5.12\n",
"Requirement already satisfied: pandas in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (1.2.4)\n",
"Requirement already satisfied: python-dateutil>=2.7.3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas) (2.8.1)\n",
"Requirement already satisfied: numpy>=1.16.5 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas) (1.20.2)\n",
"Requirement already satisfied: pytz>=2017.3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas) (2021.1)\n",
"Requirement already satisfied: six>=1.5 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)\n"
]
}
],
"source": [
"# Install the required libraries\n",
"!pip install --user kaggle  # Kaggle API, used to download the dataset\n",
"!pip install --user pandas"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" - We will download the Iris dataset from Kaggle: https://www.kaggle.com/uciml/iris\n",
" - Its license is \"Public Domain\", so we can use it without restrictions."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading iris.zip to /home/tomek/AITech/repo/aitech-ium\n",
" 0%| | 0.00/3.60k [00:00<?, ?B/s]\n",
"100%|██████████████████████████████████████| 3.60k/3.60k [00:00<00:00, 1.63MB/s]\n"
]
}
],
"source": [
"# For the command below to work, you need a ~/.kaggle/kaggle.json file containing your Kaggle API token.\n",
"# Instructions: https://www.kaggle.com/docs/api\n",
"!kaggle datasets download -d uciml/iris"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Archive: iris.zip\r\n",
" inflating: Iris.csv \r\n",
" inflating: database.sqlite \r\n"
]
}
],
"source": [
"!unzip -o iris.zip"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Inspection\n",
"- Before we start training a model on the data, we should get to know its characteristics\n",
"- This lets us:\n",
" - remove or fix invalid examples\n",
" - select the features we will use in our model\n",
" - choose a suitable learning algorithm\n",
" - decide how to split the dataset and whether to normalize it\n"
]
},
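{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The first inspection goal above, finding invalid examples, can be sketched with a few Pandas calls. This is a minimal illustration, not part of the original lab: the small `df` frame is made-up data that only borrows two column names from Iris.csv.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy frame standing in for the real dataset\n",
"df = pd.DataFrame({\n",
"    'SepalLengthCm': [5.1, 4.9, None, 5.1],\n",
"    'Species': ['Iris-setosa', 'Iris-setosa', 'Iris-virginica', 'Iris-setosa'],\n",
"})\n",
"\n",
"# Number of missing values in each column\n",
"missing_per_column = df.isna().sum()\n",
"# Number of rows that exactly repeat an earlier row\n",
"duplicate_rows = int(df.duplicated().sum())\n",
"# Drop incomplete rows first, then repeated ones\n",
"clean = df.dropna().drop_duplicates()\n",
"\n",
"print(missing_per_column)\n",
"print(duplicate_rows, len(clean))\n",
"```\n",
"\n",
"Whether dropping such rows is the right fix depends on the dataset; imputing or correcting values may be preferable when data are scarce."
]
},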
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Inspection\n",
"- To inspect the data we will use Pandas, a popular Python library: https://pandas.pydata.org/\n",
" - It is used for analysing and manipulating tabular data as well as time series\n",
"- For visualization we will use the Seaborn library: https://seaborn.pydata.org/index.html"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pandas in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (1.2.4)\n",
"Requirement already satisfied: pytz>=2017.3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas) (2021.1)\n",
"Requirement already satisfied: numpy>=1.16.5 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas) (1.20.2)\n",
"Requirement already satisfied: python-dateutil>=2.7.3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas) (2.8.1)\n",
"Requirement already satisfied: six>=1.5 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)\n",
"Collecting seaborn\n",
" Downloading seaborn-0.11.2-py3-none-any.whl (292 kB)\n",
"\u001b[K |████████████████████████████████| 292 kB 1.1 MB/s eta 0:00:01\n",
"\u001b[?25hCollecting matplotlib>=2.2\n",
" Downloading matplotlib-3.5.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.2 MB)\n",
"\u001b[K |████████████████████████████████| 11.2 MB 10.8 MB/s eta 0:00:01\n",
"\u001b[?25hRequirement already satisfied: pandas>=0.23 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from seaborn) (1.2.4)\n",
"Requirement already satisfied: numpy>=1.15 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from seaborn) (1.20.2)\n",
"Requirement already satisfied: scipy>=1.0 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from seaborn) (1.6.3)\n",
"Requirement already satisfied: packaging>=20.0 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from matplotlib>=2.2->seaborn) (20.9)\n",
"Requirement already satisfied: python-dateutil>=2.7 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from matplotlib>=2.2->seaborn) (2.8.1)\n",
"Collecting cycler>=0.10\n",
" Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)\n",
"Requirement already satisfied: pyparsing>=2.2.1 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from matplotlib>=2.2->seaborn) (2.4.7)\n",
"Collecting fonttools>=4.22.0\n",
" Downloading fonttools-4.30.0-py3-none-any.whl (898 kB)\n",
"\u001b[K |████████████████████████████████| 898 kB 4.9 MB/s eta 0:00:01\n",
"\u001b[?25hRequirement already satisfied: pillow>=6.2.0 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from matplotlib>=2.2->seaborn) (8.2.0)\n",
"Collecting kiwisolver>=1.0.1\n",
" Downloading kiwisolver-1.3.2-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB)\n",
"\u001b[K |████████████████████████████████| 1.6 MB 7.7 MB/s eta 0:00:01\n",
"\u001b[?25hRequirement already satisfied: pytz>=2017.3 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from pandas>=0.23->seaborn) (2021.1)\n",
"Requirement already satisfied: six>=1.5 in /media/tomek/Linux_data/home/tomek/miniconda3/lib/python3.9/site-packages (from python-dateutil>=2.7->matplotlib>=2.2->seaborn) (1.15.0)\n",
"Installing collected packages: kiwisolver, fonttools, cycler, matplotlib, seaborn\n",
"Successfully installed cycler-0.11.0 fonttools-4.30.0 kiwisolver-1.3.2 matplotlib-3.5.1 seaborn-0.11.2\n"
]
}
],
"source": [
"!pip install --user pandas\n",
"!pip install --user seaborn"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species\r\n",
"1,5.1,3.5,1.4,0.2,Iris-setosa\r\n",
"2,4.9,3.0,1.4,0.2,Iris-setosa\r\n",
"3,4.7,3.2,1.3,0.2,Iris-setosa\r\n",
"4,4.6,3.1,1.5,0.2,Iris-setosa\r\n"
]
}
],
"source": [
"!head -n 5 Iris.csv"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Id</th>\n",
" <th>SepalLengthCm</th>\n",
" <th>SepalWidthCm</th>\n",
" <th>PetalLengthCm</th>\n",
" <th>PetalWidthCm</th>\n",
" <th>Species</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>5.1</td>\n",
" <td>3.5</td>\n",
" <td>1.4</td>\n",
" <td>0.2</td>\n",
" <td>Iris-setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>4.9</td>\n",
" <td>3.0</td>\n",
" <td>1.4</td>\n",
" <td>0.2</td>\n",
" <td>Iris-setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>4.7</td>\n",
" <td>3.2</td>\n",
" <td>1.3</td>\n",
" <td>0.2</td>\n",
" <td>Iris-setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>4.6</td>\n",
" <td>3.1</td>\n",
" <td>1.5</td>\n",
" <td>0.2</td>\n",
" <td>Iris-setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>5.0</td>\n",
" <td>3.6</td>\n",
" <td>1.4</td>\n",
" <td>0.2</td>\n",
" <td>Iris-setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>145</th>\n",
" <td>146</td>\n",
" <td>6.7</td>\n",
" <td>3.0</td>\n",
" <td>5.2</td>\n",
" <td>2.3</td>\n",
" <td>Iris-virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>146</th>\n",
" <td>147</td>\n",
" <td>6.3</td>\n",
" <td>2.5</td>\n",
" <td>5.0</td>\n",
" <td>1.9</td>\n",
" <td>Iris-virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>147</th>\n",
" <td>148</td>\n",
" <td>6.5</td>\n",
" <td>3.0</td>\n",
" <td>5.2</td>\n",
" <td>2.0</td>\n",
" <td>Iris-virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>148</th>\n",
" <td>149</td>\n",
" <td>6.2</td>\n",
" <td>3.4</td>\n",
" <td>5.4</td>\n",
" <td>2.3</td>\n",
" <td>Iris-virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>149</th>\n",
" <td>150</td>\n",
" <td>5.9</td>\n",
" <td>3.0</td>\n",
" <td>5.1</td>\n",
" <td>1.8</td>\n",
" <td>Iris-virginica</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>150 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm \\\n",
"0 1 5.1 3.5 1.4 0.2 \n",
"1 2 4.9 3.0 1.4 0.2 \n",
"2 3 4.7 3.2 1.3 0.2 \n",
"3 4 4.6 3.1 1.5 0.2 \n",
"4 5 5.0 3.6 1.4 0.2 \n",
".. ... ... ... ... ... \n",
"145 146 6.7 3.0 5.2 2.3 \n",
"146 147 6.3 2.5 5.0 1.9 \n",
"147 148 6.5 3.0 5.2 2.0 \n",
"148 149 6.2 3.4 5.4 2.3 \n",
"149 150 5.9 3.0 5.1 1.8 \n",
"\n",
" Species \n",
"0 Iris-setosa \n",
"1 Iris-setosa \n",
"2 Iris-setosa \n",
"3 Iris-setosa \n",
"4 Iris-setosa \n",
".. ... \n",
"145 Iris-virginica \n",
"146 Iris-virginica \n",
"147 Iris-virginica \n",
"148 Iris-virginica \n",
"149 Iris-virginica \n",
"\n",
"[150 rows x 6 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"# Load the CSV into a DataFrame; evaluating `iris` on the last line displays it\n",
"iris = pd.read_csv('Iris.csv')\n",
"iris"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Id</th>\n",
" <th>SepalLengthCm</th>\n",
" <th>SepalWidthCm</th>\n",
" <th>PetalLengthCm</th>\n",
" <th>PetalWidthCm</th>\n",
" <th>Species</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>150.000000</td>\n",
" <td>150.000000</td>\n",
" <td>150.000000</td>\n",
" <td>150.000000</td>\n",
" <td>150.000000</td>\n",
" <td>150</td>\n",
" </tr>\n",
" <tr>\n",
" <th>unique</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>top</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Iris-virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>freq</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>75.500000</td>\n",
" <td>5.843333</td>\n",
" <td>3.054000</td>\n",
" <td>3.758667</td>\n",
" <td>1.198667</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>43.445368</td>\n",
" <td>0.828066</td>\n",
" <td>0.433594</td>\n",
" <td>1.764420</td>\n",
" <td>0.763161</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>1.000000</td>\n",
" <td>4.300000</td>\n",
" <td>2.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.100000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>38.250000</td>\n",
" <td>5.100000</td>\n",
" <td>2.800000</td>\n",
" <td>1.600000</td>\n",
" <td>0.300000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>75.500000</td>\n",
" <td>5.800000</td>\n",
" <td>3.000000</td>\n",
" <td>4.350000</td>\n",
" <td>1.300000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>112.750000</td>\n",
" <td>6.400000</td>\n",
" <td>3.300000</td>\n",
" <td>5.100000</td>\n",
" <td>1.800000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>150.000000</td>\n",
" <td>7.900000</td>\n",
" <td>4.400000</td>\n",
" <td>6.900000</td>\n",
" <td>2.500000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm \\\n",
"count 150.000000 150.000000 150.000000 150.000000 150.000000 \n",
"unique NaN NaN NaN NaN NaN \n",
"top NaN NaN NaN NaN NaN \n",
"freq NaN NaN NaN NaN NaN \n",
"mean 75.500000 5.843333 3.054000 3.758667 1.198667 \n",
"std 43.445368 0.828066 0.433594 1.764420 0.763161 \n",
"min 1.000000 4.300000 2.000000 1.000000 0.100000 \n",
"25% 38.250000 5.100000 2.800000 1.600000 0.300000 \n",
"50% 75.500000 5.800000 3.000000 4.350000 1.300000 \n",
"75% 112.750000 6.400000 3.300000 5.100000 1.800000 \n",
"max 150.000000 7.900000 4.400000 6.900000 2.500000 \n",
"\n",
" Species \n",
"count 150 \n",
"unique 3 \n",
"top Iris-virginica \n",
"freq 50 \n",
"mean NaN \n",
"std NaN \n",
"min NaN \n",
"25% NaN \n",
"50% NaN \n",
"75% NaN \n",
"max NaN "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"iris.describe(include='all')"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Iris-virginica 50\n",
"Iris-setosa 50\n",
"Iris-versicolor 50\n",
"Name: Species, dtype: int64"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"iris[\"Species\"].value_counts()"
]
},
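{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The counts above show perfectly balanced classes, so a train/test split should probably preserve that balance. One possible way to do this with Pandas alone is sketched below. It assumes pandas 1.1 or newer (for `groupby(...).sample`), and the toy `df` frame, the 20% test fraction and the fixed seed are illustrative choices, not part of the original lab.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: 10 examples per class, mimicking the balanced Species column\n",
"df = pd.DataFrame({\n",
"    'PetalLengthCm': [float(i) for i in range(30)],\n",
"    'Species': ['Iris-setosa'] * 10 + ['Iris-versicolor'] * 10 + ['Iris-virginica'] * 10,\n",
"})\n",
"\n",
"# Draw 20% of each class for the test set (stratified sampling)\n",
"test = df.groupby('Species').sample(frac=0.2, random_state=0)\n",
"# Every row not drawn for the test set goes to the training set\n",
"train = df.drop(test.index)\n",
"\n",
"print(test['Species'].value_counts())\n",
"print(len(train), len(test))\n",
"```\n",
"\n",
"Sampling within each class keeps the class proportions identical in both subsets, which a plain random split of a small dataset does not guarantee."
]
},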
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:>"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "(base64-encoded PNG omitted: bar chart of the number of examples per Species)",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"iris[\"Species\"].value_counts().plot(kind=\"bar\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>PetalLengthCm</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Species</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Iris-setosa</th>\n",
" <td>1.464</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Iris-versicolor</th>\n",
" <td>4.260</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Iris-virginica</th>\n",
" <td>5.552</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PetalLengthCm\n",
"Species \n",
"Iris-setosa 1.464\n",
"Iris-versicolor 4.260\n",
"Iris-virginica 5.552"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"iris[[\"Species\",\"PetalLengthCm\"]].groupby(\"Species\").mean()"
]
},
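{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The per-species mean above is one summary; the same `groupby` can return several statistics at once through `agg`. A minimal sketch on made-up data (the `df` frame and its values are hypothetical, only the column names follow Iris.csv):\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy frame with two classes and two examples each\n",
"df = pd.DataFrame({\n",
"    'Species': ['a', 'a', 'b', 'b'],\n",
"    'PetalLengthCm': [1.0, 2.0, 4.0, 6.0],\n",
"})\n",
"\n",
"# One row per species, one column per requested statistic\n",
"stats = df.groupby('Species')['PetalLengthCm'].agg(['mean', 'min', 'max'])\n",
"print(stats)\n",
"```\n",
"\n",
"Comparing the minimum and maximum per class gives a quick hint of how well a single feature separates the classes."
]
},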
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Species'>"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Mean petal length per species, drawn as a bar chart\n",
"iris[[\"Species\",\"PetalLengthCm\"]].groupby(\"Species\").mean().plot(kind=\"bar\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x7f97eed545b0>"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 474.35x360 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 474.35x360 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"sns.set_theme()  # apply seaborn's default plot styling\n",
"# Scatter plots of petal and sepal dimensions, colored by species\n",
"sns.relplot(data=iris, x=\"PetalLengthCm\", y=\"PetalWidthCm\", hue=\"Species\")\n",
"sns.relplot(data=iris, x=\"SepalLengthCm\", y=\"SepalWidthCm\", hue=\"Species\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.FacetGrid at 0x7f97ef942eb0>"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 474.35x360 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Exclude Iris-setosa and plot the sepal dimensions of the remaining species\n",
"irisv = iris[iris[\"Species\"] != \"Iris-setosa\"]\n",
"sns.relplot(data=irisv, x=\"SepalLengthCm\", y=\"SepalWidthCm\", hue=\"Species\")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"scrolled": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<seaborn.axisgrid.PairGrid at 0x7f97f2ad3550>"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 834.35x720 with 20 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Pairwise scatter plots of all features (the Id column is dropped), colored by species\n",
"sns.pairplot(data=iris.drop(columns=[\"Id\"]), hue=\"Species\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Splitting the data\n",
" - ### Training set\n",
"    - Used to fit the model's parameters (e.g. the weights of a neural network).\n",
"    - During training, the algorithm minimizes a cost function computed on the training set.\n",
" - ### Validation set (also called the \"dev set\")\n",
"    - Used to compare models trained with different hyperparameters (e.g. network architecture, number of training iterations).\n",
"    - Helps avoid overfitting the model to the training set, for example by applying early stopping.\n",
" - ### Test set\n",
"    - Used to evaluate the final model that was selected and trained using the training and validation sets."
]
},
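{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"A minimal sketch of such a three-way split. It assumes that scikit-learn is installed and that `iris` is the DataFrame used in the cells above; the 60/20/20 proportions, the fixed `random_state`, and the names `iris_rest`, `iris_train`, `iris_val` and `iris_test` are illustrative choices, not a prescribed recipe."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Step 1: hold out 20% of the rows as the test set.\n",
"# stratify keeps the class proportions the same in every subset.\n",
"iris_rest, iris_test = train_test_split(\n",
"    iris, test_size=0.2, random_state=42, stratify=iris[\"Species\"]\n",
")\n",
"# Step 2: split the remaining 80% into training (75%) and validation (25%),\n",
"# which gives roughly 60/20/20 of the original data.\n",
"iris_train, iris_val = train_test_split(\n",
"    iris_rest, test_size=0.25, random_state=42, stratify=iris_rest[\"Species\"]\n",
")\n",
"print(len(iris_train), len(iris_val), len(iris_test))"
]
},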
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},