{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "IUM_1_434788.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "shaFKPEixPn4"
},
"source": [
"# 1. Pobranie zbioru danych z Repozytorium"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-03GDjWtxD7W",
"outputId": "3cefd33d-3ef4-4c16-963e-ffa6e9e781de"
},
"source": [
"!curl -OL https://git.wmi.amu.edu.pl/s434788/ium_434788/raw/branch/master/winequality-red.csv"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"100 98k 0 98k 0 0 74502 0 --:--:-- 0:00:01 --:--:-- 74502\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "sAUNi0ylxWUm",
"outputId": "fe879388-072d-4845-f3b5-f06a4fca5f1e"
},
"source": [
"import pandas as pd\n",
"wine=pd.read_csv('winequality-red.csv')\n",
"wine"
],
"execution_count": 2,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 7.4 | \n",
" 0.700 | \n",
" 0.00 | \n",
" 1.9 | \n",
" 0.076 | \n",
" 11.0 | \n",
" 34.0 | \n",
" 0.99780 | \n",
" 3.51 | \n",
" 0.56 | \n",
" 9.4 | \n",
" 5 | \n",
"
\n",
" \n",
" 1 | \n",
" 7.8 | \n",
" 0.880 | \n",
" 0.00 | \n",
" 2.6 | \n",
" 0.098 | \n",
" 25.0 | \n",
" 67.0 | \n",
" 0.99680 | \n",
" 3.20 | \n",
" 0.68 | \n",
" 9.8 | \n",
" 5 | \n",
"
\n",
" \n",
" 2 | \n",
" 7.8 | \n",
" 0.760 | \n",
" 0.04 | \n",
" 2.3 | \n",
" 0.092 | \n",
" 15.0 | \n",
" 54.0 | \n",
" 0.99700 | \n",
" 3.26 | \n",
" 0.65 | \n",
" 9.8 | \n",
" 5 | \n",
"
\n",
" \n",
" 3 | \n",
" 11.2 | \n",
" 0.280 | \n",
" 0.56 | \n",
" 1.9 | \n",
" 0.075 | \n",
" 17.0 | \n",
" 60.0 | \n",
" 0.99800 | \n",
" 3.16 | \n",
" 0.58 | \n",
" 9.8 | \n",
" 6 | \n",
"
\n",
" \n",
" 4 | \n",
" 7.4 | \n",
" 0.700 | \n",
" 0.00 | \n",
" 1.9 | \n",
" 0.076 | \n",
" 11.0 | \n",
" 34.0 | \n",
" 0.99780 | \n",
" 3.51 | \n",
" 0.56 | \n",
" 9.4 | \n",
" 5 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 1594 | \n",
" 6.2 | \n",
" 0.600 | \n",
" 0.08 | \n",
" 2.0 | \n",
" 0.090 | \n",
" 32.0 | \n",
" 44.0 | \n",
" 0.99490 | \n",
" 3.45 | \n",
" 0.58 | \n",
" 10.5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1595 | \n",
" 5.9 | \n",
" 0.550 | \n",
" 0.10 | \n",
" 2.2 | \n",
" 0.062 | \n",
" 39.0 | \n",
" 51.0 | \n",
" 0.99512 | \n",
" 3.52 | \n",
" 0.76 | \n",
" 11.2 | \n",
" 6 | \n",
"
\n",
" \n",
" 1596 | \n",
" 6.3 | \n",
" 0.510 | \n",
" 0.13 | \n",
" 2.3 | \n",
" 0.076 | \n",
" 29.0 | \n",
" 40.0 | \n",
" 0.99574 | \n",
" 3.42 | \n",
" 0.75 | \n",
" 11.0 | \n",
" 6 | \n",
"
\n",
" \n",
" 1597 | \n",
" 5.9 | \n",
" 0.645 | \n",
" 0.12 | \n",
" 2.0 | \n",
" 0.075 | \n",
" 32.0 | \n",
" 44.0 | \n",
" 0.99547 | \n",
" 3.57 | \n",
" 0.71 | \n",
" 10.2 | \n",
" 5 | \n",
"
\n",
" \n",
" 1598 | \n",
" 6.0 | \n",
" 0.310 | \n",
" 0.47 | \n",
" 3.6 | \n",
" 0.067 | \n",
" 18.0 | \n",
" 42.0 | \n",
" 0.99549 | \n",
" 3.39 | \n",
" 0.66 | \n",
" 11.0 | \n",
" 6 | \n",
"
\n",
" \n",
"
\n",
"
1599 rows × 12 columns
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 7.4 0.700 0.00 ... 0.56 9.4 5\n",
"1 7.8 0.880 0.00 ... 0.68 9.8 5\n",
"2 7.8 0.760 0.04 ... 0.65 9.8 5\n",
"3 11.2 0.280 0.56 ... 0.58 9.8 6\n",
"4 7.4 0.700 0.00 ... 0.56 9.4 5\n",
"... ... ... ... ... ... ... ...\n",
"1594 6.2 0.600 0.08 ... 0.58 10.5 5\n",
"1595 5.9 0.550 0.10 ... 0.76 11.2 6\n",
"1596 6.3 0.510 0.13 ... 0.75 11.0 6\n",
"1597 5.9 0.645 0.12 ... 0.71 10.2 5\n",
"1598 6.0 0.310 0.47 ... 0.66 11.0 6\n",
"\n",
"[1599 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 2
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4H-i6DJlxduP"
},
"source": [
"# 2. Podział na zbiory test/train przy pomocy SciKit + (poprawka z 26.03.2021 przy pomocy basha)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Rf49qKC-eqEU"
},
"source": [
"## 2.1 SciKit"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nZO_naLatT0o"
},
"source": [
"Próbowałem również podzielić na podzbiory Train:Dev:Test 6:2:2 Przy pomocy basha ale uznałem, że wygodniejsze jest korzystanie z \"train_test_split()\". Docelowo podział będzie dokonywany na 4 zmienne ` X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)`, jednak chciałem zachować konwencje z przykładu, z ćwiczeń."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ebHl5Aw1uuK1"
},
"source": [
"https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html"
]
},
{
"cell_type": "code",
"metadata": {
"id": "X88VMhb0x3gJ"
},
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"wine_train, wine_test = train_test_split(wine, test_size=360,train_size=959, random_state=1)"
],
"execution_count": 3,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OzjEfgNOyAWs",
"outputId": "7e7bb70f-2b1e-422c-9500-d411884d8d5a"
},
"source": [
"wine_test[\"quality\"].value_counts()"
],
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"5 155\n",
"6 149\n",
"7 37\n",
"4 16\n",
"8 2\n",
"3 1\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 4
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "SpQZIuSxyAd0",
"outputId": "96505a9a-d2e7-44a1-b2cf-ee40d6d7d3d0"
},
"source": [
"wine_train[\"quality\"].value_counts()"
],
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"5 400\n",
"6 388\n",
"7 125\n",
"4 30\n",
"8 11\n",
"3 5\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 5
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YK0491tAeupD"
},
"source": [
"## 2.2 Bash"
]
},
{
"cell_type": "code",
"metadata": {
"id": "1idNUz-9eyfJ"
},
"source": [
"!head -n 1 winequality-red.csv > header.csv\n",
"!tail -n +2 winequality-red.csv | shuf > data.shuffled\n",
"\n",
"!head -n 266 data.shuffled > wine.data.test\n",
"!head -n 532 data.shuffled | tail -n 266 > wine.data.dev\n",
"!tail -n +333 data.shuffled > wine.data.train\n",
"\n",
"!cat header.csv wine.data.test > test.csv\n",
"!cat header.csv wine.data.dev > dev.csv\n",
"!cat header.csv wine.data.train > train.csv"
],
"execution_count": 6,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-C4RRDH2fFEp",
"outputId": "93944a72-838c-4e2b-a907-de4b0902fcb1"
},
"source": [
"!wc -l test.csv\n",
"!wc -l dev.csv\n",
"!wc -l train.csv"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"267 test.csv\n",
"267 dev.csv\n",
"1268 train.csv\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "wLlI-k_jfb70"
},
"source": [
"wine_test_bash=pd.read_csv('test.csv')\n",
"wine_dev_bash=pd.read_csv('dev.csv')\n",
"wine_train_bash=pd.read_csv('train.csv')"
],
"execution_count": 8,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wAq8KmNdyNOm"
},
"source": [
"# 3. Statystyki dla zbiorów"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Wcq9YSTfXbs1"
},
"source": [
"from matplotlib import pyplot as plt\n",
"import seaborn as sns"
],
"execution_count": 9,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "EjDFpgdPy_of"
},
"source": [
"## 3.1. Zbiór Train (bash)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "SscUak3AydG0",
"outputId": "5f0bd8df-1753-4211-e3a6-8ce2685146f9"
},
"source": [
"wine_train_bash"
],
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 10.0 | \n",
" 0.380 | \n",
" 0.38 | \n",
" 1.6 | \n",
" 0.169 | \n",
" 27.0 | \n",
" 90.0 | \n",
" 0.99914 | \n",
" 3.15 | \n",
" 0.65 | \n",
" 8.5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1 | \n",
" 6.7 | \n",
" 0.460 | \n",
" 0.24 | \n",
" 1.7 | \n",
" 0.077 | \n",
" 18.0 | \n",
" 34.0 | \n",
" 0.99480 | \n",
" 3.39 | \n",
" 0.60 | \n",
" 10.6 | \n",
" 6 | \n",
"
\n",
" \n",
" 2 | \n",
" 7.2 | \n",
" 0.695 | \n",
" 0.13 | \n",
" 2.0 | \n",
" 0.076 | \n",
" 12.0 | \n",
" 20.0 | \n",
" 0.99546 | \n",
" 3.29 | \n",
" 0.54 | \n",
" 10.1 | \n",
" 5 | \n",
"
\n",
" \n",
" 3 | \n",
" 12.5 | \n",
" 0.600 | \n",
" 0.49 | \n",
" 4.3 | \n",
" 0.100 | \n",
" 5.0 | \n",
" 14.0 | \n",
" 1.00100 | \n",
" 3.25 | \n",
" 0.74 | \n",
" 11.9 | \n",
" 6 | \n",
"
\n",
" \n",
" 4 | \n",
" 8.3 | \n",
" 0.560 | \n",
" 0.22 | \n",
" 2.4 | \n",
" 0.082 | \n",
" 10.0 | \n",
" 86.0 | \n",
" 0.99830 | \n",
" 3.37 | \n",
" 0.62 | \n",
" 9.5 | \n",
" 5 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 1262 | \n",
" 7.8 | \n",
" 0.560 | \n",
" 0.12 | \n",
" 2.0 | \n",
" 0.082 | \n",
" 7.0 | \n",
" 28.0 | \n",
" 0.99700 | \n",
" 3.37 | \n",
" 0.50 | \n",
" 9.4 | \n",
" 6 | \n",
"
\n",
" \n",
" 1263 | \n",
" 5.8 | \n",
" 0.680 | \n",
" 0.02 | \n",
" 1.8 | \n",
" 0.087 | \n",
" 21.0 | \n",
" 94.0 | \n",
" 0.99440 | \n",
" 3.54 | \n",
" 0.52 | \n",
" 10.0 | \n",
" 5 | \n",
"
\n",
" \n",
" 1264 | \n",
" 7.7 | \n",
" 0.630 | \n",
" 0.08 | \n",
" 1.9 | \n",
" 0.076 | \n",
" 15.0 | \n",
" 27.0 | \n",
" 0.99670 | \n",
" 3.32 | \n",
" 0.54 | \n",
" 9.5 | \n",
" 6 | \n",
"
\n",
" \n",
" 1265 | \n",
" 7.1 | \n",
" 0.600 | \n",
" 0.00 | \n",
" 1.8 | \n",
" 0.074 | \n",
" 16.0 | \n",
" 34.0 | \n",
" 0.99720 | \n",
" 3.47 | \n",
" 0.70 | \n",
" 9.9 | \n",
" 6 | \n",
"
\n",
" \n",
" 1266 | \n",
" 10.4 | \n",
" 0.610 | \n",
" 0.49 | \n",
" 2.1 | \n",
" 0.200 | \n",
" 5.0 | \n",
" 16.0 | \n",
" 0.99940 | \n",
" 3.16 | \n",
" 0.63 | \n",
" 8.4 | \n",
" 3 | \n",
"
\n",
" \n",
"
\n",
"
1267 rows × 12 columns
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 10.0 0.380 0.38 ... 0.65 8.5 5\n",
"1 6.7 0.460 0.24 ... 0.60 10.6 6\n",
"2 7.2 0.695 0.13 ... 0.54 10.1 5\n",
"3 12.5 0.600 0.49 ... 0.74 11.9 6\n",
"4 8.3 0.560 0.22 ... 0.62 9.5 5\n",
"... ... ... ... ... ... ... ...\n",
"1262 7.8 0.560 0.12 ... 0.50 9.4 6\n",
"1263 5.8 0.680 0.02 ... 0.52 10.0 5\n",
"1264 7.7 0.630 0.08 ... 0.54 9.5 6\n",
"1265 7.1 0.600 0.00 ... 0.70 9.9 6\n",
"1266 10.4 0.610 0.49 ... 0.63 8.4 3\n",
"\n",
"[1267 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "hZAn8j4byMF2",
"outputId": "c47596aa-0d54-490f-c892-6ee5987a372d"
},
"source": [
"wine_train_bash[\"quality\"].value_counts()"
],
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"5 550\n",
"6 498\n",
"7 157\n",
"4 39\n",
"8 15\n",
"3 8\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "EOEuj8sRyL8v",
"outputId": "d2f102f6-d10c-4dc4-ae3f-fd34dc4e5985"
},
"source": [
"wine_train_bash.describe(include='all')"
],
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" count | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
" 1267.000000 | \n",
"
\n",
" \n",
" mean | \n",
" 8.344199 | \n",
" 0.525888 | \n",
" 0.273891 | \n",
" 2.574033 | \n",
" 0.087419 | \n",
" 15.889897 | \n",
" 46.146014 | \n",
" 0.996799 | \n",
" 3.310016 | \n",
" 0.655730 | \n",
" 10.396725 | \n",
" 5.632991 | \n",
"
\n",
" \n",
" std | \n",
" 1.789253 | \n",
" 0.177804 | \n",
" 0.196141 | \n",
" 1.453463 | \n",
" 0.046754 | \n",
" 10.603674 | \n",
" 32.734818 | \n",
" 0.001893 | \n",
" 0.154047 | \n",
" 0.166206 | \n",
" 1.042353 | \n",
" 0.806931 | \n",
"
\n",
" \n",
" min | \n",
" 4.700000 | \n",
" 0.120000 | \n",
" 0.000000 | \n",
" 0.900000 | \n",
" 0.012000 | \n",
" 1.000000 | \n",
" 6.000000 | \n",
" 0.990070 | \n",
" 2.740000 | \n",
" 0.370000 | \n",
" 8.400000 | \n",
" 3.000000 | \n",
"
\n",
" \n",
" 25% | \n",
" 7.100000 | \n",
" 0.390000 | \n",
" 0.090000 | \n",
" 1.900000 | \n",
" 0.071000 | \n",
" 7.000000 | \n",
" 22.000000 | \n",
" 0.995660 | \n",
" 3.210000 | \n",
" 0.550000 | \n",
" 9.500000 | \n",
" 5.000000 | \n",
"
\n",
" \n",
" 50% | \n",
" 7.900000 | \n",
" 0.520000 | \n",
" 0.260000 | \n",
" 2.200000 | \n",
" 0.080000 | \n",
" 13.000000 | \n",
" 37.000000 | \n",
" 0.996800 | \n",
" 3.310000 | \n",
" 0.620000 | \n",
" 10.200000 | \n",
" 6.000000 | \n",
"
\n",
" \n",
" 75% | \n",
" 9.300000 | \n",
" 0.640000 | \n",
" 0.430000 | \n",
" 2.600000 | \n",
" 0.090000 | \n",
" 22.000000 | \n",
" 62.000000 | \n",
" 0.997870 | \n",
" 3.400000 | \n",
" 0.730000 | \n",
" 11.000000 | \n",
" 6.000000 | \n",
"
\n",
" \n",
" max | \n",
" 15.900000 | \n",
" 1.580000 | \n",
" 1.000000 | \n",
" 15.500000 | \n",
" 0.611000 | \n",
" 72.000000 | \n",
" 278.000000 | \n",
" 1.003690 | \n",
" 4.010000 | \n",
" 2.000000 | \n",
" 14.900000 | \n",
" 8.000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity ... alcohol quality\n",
"count 1267.000000 1267.000000 ... 1267.000000 1267.000000\n",
"mean 8.344199 0.525888 ... 10.396725 5.632991\n",
"std 1.789253 0.177804 ... 1.042353 0.806931\n",
"min 4.700000 0.120000 ... 8.400000 3.000000\n",
"25% 7.100000 0.390000 ... 9.500000 5.000000\n",
"50% 7.900000 0.520000 ... 10.200000 6.000000\n",
"75% 9.300000 0.640000 ... 11.000000 6.000000\n",
"max 15.900000 1.580000 ... 14.900000 8.000000\n",
"\n",
"[8 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 12
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JWXJ2CZQuylE"
},
"source": [
"Testowy Wykres (quality, volatile acidity)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 408
},
"id": "HbsfwCL7XpNe",
"outputId": "249d8110-1b17-41ad-e1b1-18b0aa12ff06"
},
"source": [
"fig = plt.figure(figsize = (10,6))\n",
"sns.barplot(x = 'quality', y = 'volatile acidity', data = wine_train_bash)"
],
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {
"tags": []
},
"execution_count": 13
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1W_oRCVczIgJ"
},
"source": [
"## 3.2. Zbiór Test (bash)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "LJzygNqKzOWY",
"outputId": "d4f8dd3b-793c-4e02-a6ea-fbdb8fbf7a19"
},
"source": [
"wine_test_bash"
],
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 7.1 | \n",
" 0.60 | \n",
" 0.01 | \n",
" 2.3 | \n",
" 0.079 | \n",
" 24.0 | \n",
" 37.0 | \n",
" 0.99514 | \n",
" 3.40 | \n",
" 0.61 | \n",
" 10.9 | \n",
" 6 | \n",
"
\n",
" \n",
" 1 | \n",
" 7.8 | \n",
" 0.61 | \n",
" 0.29 | \n",
" 1.6 | \n",
" 0.114 | \n",
" 9.0 | \n",
" 29.0 | \n",
" 0.99740 | \n",
" 3.26 | \n",
" 1.56 | \n",
" 9.1 | \n",
" 5 | \n",
"
\n",
" \n",
" 2 | \n",
" 7.1 | \n",
" 0.63 | \n",
" 0.06 | \n",
" 2.0 | \n",
" 0.083 | \n",
" 8.0 | \n",
" 29.0 | \n",
" 0.99855 | \n",
" 3.67 | \n",
" 0.73 | \n",
" 9.6 | \n",
" 5 | \n",
"
\n",
" \n",
" 3 | \n",
" 9.1 | \n",
" 0.30 | \n",
" 0.41 | \n",
" 2.0 | \n",
" 0.068 | \n",
" 10.0 | \n",
" 24.0 | \n",
" 0.99523 | \n",
" 3.27 | \n",
" 0.85 | \n",
" 11.7 | \n",
" 7 | \n",
"
\n",
" \n",
" 4 | \n",
" 9.0 | \n",
" 0.46 | \n",
" 0.31 | \n",
" 2.8 | \n",
" 0.093 | \n",
" 19.0 | \n",
" 98.0 | \n",
" 0.99815 | \n",
" 3.32 | \n",
" 0.63 | \n",
" 9.5 | \n",
" 6 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 261 | \n",
" 7.2 | \n",
" 0.60 | \n",
" 0.04 | \n",
" 2.5 | \n",
" 0.076 | \n",
" 18.0 | \n",
" 88.0 | \n",
" 0.99745 | \n",
" 3.53 | \n",
" 0.55 | \n",
" 9.5 | \n",
" 5 | \n",
"
\n",
" \n",
" 262 | \n",
" 8.4 | \n",
" 0.67 | \n",
" 0.19 | \n",
" 2.2 | \n",
" 0.093 | \n",
" 11.0 | \n",
" 75.0 | \n",
" 0.99736 | \n",
" 3.20 | \n",
" 0.59 | \n",
" 9.2 | \n",
" 4 | \n",
"
\n",
" \n",
" 263 | \n",
" 8.8 | \n",
" 0.61 | \n",
" 0.19 | \n",
" 4.0 | \n",
" 0.094 | \n",
" 30.0 | \n",
" 69.0 | \n",
" 0.99787 | \n",
" 3.22 | \n",
" 0.50 | \n",
" 10.0 | \n",
" 6 | \n",
"
\n",
" \n",
" 264 | \n",
" 9.6 | \n",
" 0.68 | \n",
" 0.24 | \n",
" 2.2 | \n",
" 0.087 | \n",
" 5.0 | \n",
" 28.0 | \n",
" 0.99880 | \n",
" 3.14 | \n",
" 0.60 | \n",
" 10.2 | \n",
" 5 | \n",
"
\n",
" \n",
" 265 | \n",
" 10.5 | \n",
" 0.43 | \n",
" 0.35 | \n",
" 3.3 | \n",
" 0.092 | \n",
" 24.0 | \n",
" 70.0 | \n",
" 0.99798 | \n",
" 3.21 | \n",
" 0.69 | \n",
" 10.5 | \n",
" 6 | \n",
"
\n",
" \n",
"
\n",
"
266 rows × 12 columns
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 7.1 0.60 0.01 ... 0.61 10.9 6\n",
"1 7.8 0.61 0.29 ... 1.56 9.1 5\n",
"2 7.1 0.63 0.06 ... 0.73 9.6 5\n",
"3 9.1 0.30 0.41 ... 0.85 11.7 7\n",
"4 9.0 0.46 0.31 ... 0.63 9.5 6\n",
".. ... ... ... ... ... ... ...\n",
"261 7.2 0.60 0.04 ... 0.55 9.5 5\n",
"262 8.4 0.67 0.19 ... 0.59 9.2 4\n",
"263 8.8 0.61 0.19 ... 0.50 10.0 6\n",
"264 9.6 0.68 0.24 ... 0.60 10.2 5\n",
"265 10.5 0.43 0.35 ... 0.69 10.5 6\n",
"\n",
"[266 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "1IAtBylEzS8w",
"outputId": "1f047c20-f723-490d-ada3-474f5d14db3a"
},
"source": [
"wine_test_bash[\"quality\"].value_counts()"
],
"execution_count": 15,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"6 109\n",
"5 108\n",
"7 37\n",
"4 8\n",
"8 2\n",
"3 2\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 15
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "V-9cwcrczS-3",
"outputId": "a8a26e7f-a2c4-4a44-c91a-6ce57be85386"
},
"source": [
"wine_test_bash.describe(include='all')"
],
"execution_count": 16,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" count | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
" 266.000000 | \n",
"
\n",
" \n",
" mean | \n",
" 8.245865 | \n",
" 0.529455 | \n",
" 0.266203 | \n",
" 2.373308 | \n",
" 0.086823 | \n",
" 15.840226 | \n",
" 47.447368 | \n",
" 0.996499 | \n",
" 3.313195 | \n",
" 0.676241 | \n",
" 10.569925 | \n",
" 5.665414 | \n",
"
\n",
" \n",
" std | \n",
" 1.526175 | \n",
" 0.181583 | \n",
" 0.191968 | \n",
" 1.005345 | \n",
" 0.046159 | \n",
" 10.163096 | \n",
" 34.610379 | \n",
" 0.001772 | \n",
" 0.158871 | \n",
" 0.187786 | \n",
" 1.149728 | \n",
" 0.808497 | \n",
"
\n",
" \n",
" min | \n",
" 4.600000 | \n",
" 0.180000 | \n",
" 0.000000 | \n",
" 1.200000 | \n",
" 0.039000 | \n",
" 1.000000 | \n",
" 7.000000 | \n",
" 0.990840 | \n",
" 2.880000 | \n",
" 0.390000 | \n",
" 9.000000 | \n",
" 3.000000 | \n",
"
\n",
" \n",
" 25% | \n",
" 7.200000 | \n",
" 0.392500 | \n",
" 0.100000 | \n",
" 1.900000 | \n",
" 0.068000 | \n",
" 7.000000 | \n",
" 22.250000 | \n",
" 0.995318 | \n",
" 3.200000 | \n",
" 0.560000 | \n",
" 9.500000 | \n",
" 5.000000 | \n",
"
\n",
" \n",
" 50% | \n",
" 8.000000 | \n",
" 0.520000 | \n",
" 0.260000 | \n",
" 2.100000 | \n",
" 0.078000 | \n",
" 14.000000 | \n",
" 40.000000 | \n",
" 0.996520 | \n",
" 3.310000 | \n",
" 0.640000 | \n",
" 10.250000 | \n",
" 6.000000 | \n",
"
\n",
" \n",
" 75% | \n",
" 9.100000 | \n",
" 0.630000 | \n",
" 0.400000 | \n",
" 2.500000 | \n",
" 0.092000 | \n",
" 21.000000 | \n",
" 62.750000 | \n",
" 0.997600 | \n",
" 3.400000 | \n",
" 0.750000 | \n",
" 11.400000 | \n",
" 6.000000 | \n",
"
\n",
" \n",
" max | \n",
" 13.300000 | \n",
" 1.330000 | \n",
" 0.740000 | \n",
" 8.800000 | \n",
" 0.467000 | \n",
" 51.000000 | \n",
" 289.000000 | \n",
" 1.002600 | \n",
" 3.900000 | \n",
" 1.980000 | \n",
" 14.000000 | \n",
" 8.000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity ... alcohol quality\n",
"count 266.000000 266.000000 ... 266.000000 266.000000\n",
"mean 8.245865 0.529455 ... 10.569925 5.665414\n",
"std 1.526175 0.181583 ... 1.149728 0.808497\n",
"min 4.600000 0.180000 ... 9.000000 3.000000\n",
"25% 7.200000 0.392500 ... 9.500000 5.000000\n",
"50% 8.000000 0.520000 ... 10.250000 6.000000\n",
"75% 9.100000 0.630000 ... 11.400000 6.000000\n",
"max 13.300000 1.330000 ... 14.000000 8.000000\n",
"\n",
"[8 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 16
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wzaUXARnu824"
},
"source": [
"Testowy Wykres (quality, volatile acidity)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 405
},
"id": "3GksWzExaHV7",
"outputId": "21b77c09-445c-4e06-fcea-6f26d3717870"
},
"source": [
"fig = plt.figure(figsize = (10,6))\n",
"sns.barplot(x = 'quality', y = 'volatile acidity', data = wine_test_bash)"
],
"execution_count": 17,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {
"tags": []
},
"execution_count": 17
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "w5xmkUgGzdxs"
},
"source": [
"## 3.3. Cały zbiór"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "thGHHVJXzeGe",
"outputId": "a1bbe5c6-3aef-4a70-82ec-adc2b9d6daf5"
},
"source": [
"wine"
],
"execution_count": 18,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 7.4 | \n",
" 0.700 | \n",
" 0.00 | \n",
" 1.9 | \n",
" 0.076 | \n",
" 11.0 | \n",
" 34.0 | \n",
" 0.99780 | \n",
" 3.51 | \n",
" 0.56 | \n",
" 9.4 | \n",
" 5 | \n",
"
\n",
" \n",
" 1 | \n",
" 7.8 | \n",
" 0.880 | \n",
" 0.00 | \n",
" 2.6 | \n",
" 0.098 | \n",
" 25.0 | \n",
" 67.0 | \n",
" 0.99680 | \n",
" 3.20 | \n",
" 0.68 | \n",
" 9.8 | \n",
" 5 | \n",
"
\n",
" \n",
" 2 | \n",
" 7.8 | \n",
" 0.760 | \n",
" 0.04 | \n",
" 2.3 | \n",
" 0.092 | \n",
" 15.0 | \n",
" 54.0 | \n",
" 0.99700 | \n",
" 3.26 | \n",
" 0.65 | \n",
" 9.8 | \n",
" 5 | \n",
"
\n",
" \n",
" 3 | \n",
" 11.2 | \n",
" 0.280 | \n",
" 0.56 | \n",
" 1.9 | \n",
" 0.075 | \n",
" 17.0 | \n",
" 60.0 | \n",
" 0.99800 | \n",
" 3.16 | \n",
" 0.58 | \n",
" 9.8 | \n",
" 6 | \n",
"
\n",
" \n",
" 4 | \n",
" 7.4 | \n",
" 0.700 | \n",
" 0.00 | \n",
" 1.9 | \n",
" 0.076 | \n",
" 11.0 | \n",
" 34.0 | \n",
" 0.99780 | \n",
" 3.51 | \n",
" 0.56 | \n",
" 9.4 | \n",
" 5 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 1594 | \n",
" 6.2 | \n",
" 0.600 | \n",
" 0.08 | \n",
" 2.0 | \n",
" 0.090 | \n",
" 32.0 | \n",
" 44.0 | \n",
" 0.99490 | \n",
" 3.45 | \n",
" 0.58 | \n",
" 10.5 | \n",
" 5 | \n",
"
\n",
" \n",
" 1595 | \n",
" 5.9 | \n",
" 0.550 | \n",
" 0.10 | \n",
" 2.2 | \n",
" 0.062 | \n",
" 39.0 | \n",
" 51.0 | \n",
" 0.99512 | \n",
" 3.52 | \n",
" 0.76 | \n",
" 11.2 | \n",
" 6 | \n",
"
\n",
" \n",
" 1596 | \n",
" 6.3 | \n",
" 0.510 | \n",
" 0.13 | \n",
" 2.3 | \n",
" 0.076 | \n",
" 29.0 | \n",
" 40.0 | \n",
" 0.99574 | \n",
" 3.42 | \n",
" 0.75 | \n",
" 11.0 | \n",
" 6 | \n",
"
\n",
" \n",
" 1597 | \n",
" 5.9 | \n",
" 0.645 | \n",
" 0.12 | \n",
" 2.0 | \n",
" 0.075 | \n",
" 32.0 | \n",
" 44.0 | \n",
" 0.99547 | \n",
" 3.57 | \n",
" 0.71 | \n",
" 10.2 | \n",
" 5 | \n",
"
\n",
" \n",
" 1598 | \n",
" 6.0 | \n",
" 0.310 | \n",
" 0.47 | \n",
" 3.6 | \n",
" 0.067 | \n",
" 18.0 | \n",
" 42.0 | \n",
" 0.99549 | \n",
" 3.39 | \n",
" 0.66 | \n",
" 11.0 | \n",
" 6 | \n",
"
\n",
" \n",
"
\n",
"
1599 rows × 12 columns
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 7.4 0.700 0.00 ... 0.56 9.4 5\n",
"1 7.8 0.880 0.00 ... 0.68 9.8 5\n",
"2 7.8 0.760 0.04 ... 0.65 9.8 5\n",
"3 11.2 0.280 0.56 ... 0.58 9.8 6\n",
"4 7.4 0.700 0.00 ... 0.56 9.4 5\n",
"... ... ... ... ... ... ... ...\n",
"1594 6.2 0.600 0.08 ... 0.58 10.5 5\n",
"1595 5.9 0.550 0.10 ... 0.76 11.2 6\n",
"1596 6.3 0.510 0.13 ... 0.75 11.0 6\n",
"1597 5.9 0.645 0.12 ... 0.71 10.2 5\n",
"1598 6.0 0.310 0.47 ... 0.66 11.0 6\n",
"\n",
"[1599 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 18
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Ua_ctPpVzeKJ",
"outputId": "da95e47b-9e44-42e0-efc0-66631dba99f1"
},
"source": [
"wine[\"quality\"].value_counts()"
],
"execution_count": 19,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"5 681\n",
"6 638\n",
"7 199\n",
"4 53\n",
"8 18\n",
"3 10\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 19
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "-06v1i7XzeOz",
"outputId": "b0da7e9b-98aa-4af6-8131-359a54c2ac69"
},
"source": [
"wine.describe(include='all')"
],
"execution_count": 20,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" fixed acidity | \n",
" volatile acidity | \n",
" citric acid | \n",
" residual sugar | \n",
" chlorides | \n",
" free sulfur dioxide | \n",
" total sulfur dioxide | \n",
" density | \n",
" pH | \n",
" sulphates | \n",
" alcohol | \n",
" quality | \n",
"
\n",
" \n",
" \n",
" \n",
" count | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
" 1599.000000 | \n",
"
\n",
" \n",
" mean | \n",
" 8.319637 | \n",
" 0.527821 | \n",
" 0.270976 | \n",
" 2.538806 | \n",
" 0.087467 | \n",
" 15.874922 | \n",
" 46.467792 | \n",
" 0.996747 | \n",
" 3.311113 | \n",
" 0.658149 | \n",
" 10.422983 | \n",
" 5.636023 | \n",
"
\n",
" \n",
" std | \n",
" 1.741096 | \n",
" 0.179060 | \n",
" 0.194801 | \n",
" 1.409928 | \n",
" 0.047065 | \n",
" 10.460157 | \n",
" 32.895324 | \n",
" 0.001887 | \n",
" 0.154386 | \n",
" 0.169507 | \n",
" 1.065668 | \n",
" 0.807569 | \n",
"
\n",
" \n",
" min | \n",
" 4.600000 | \n",
" 0.120000 | \n",
" 0.000000 | \n",
" 0.900000 | \n",
" 0.012000 | \n",
" 1.000000 | \n",
" 6.000000 | \n",
" 0.990070 | \n",
" 2.740000 | \n",
" 0.330000 | \n",
" 8.400000 | \n",
" 3.000000 | \n",
"
\n",
" \n",
" 25% | \n",
" 7.100000 | \n",
" 0.390000 | \n",
" 0.090000 | \n",
" 1.900000 | \n",
" 0.070000 | \n",
" 7.000000 | \n",
" 22.000000 | \n",
" 0.995600 | \n",
" 3.210000 | \n",
" 0.550000 | \n",
" 9.500000 | \n",
" 5.000000 | \n",
"
\n",
" \n",
" 50% | \n",
" 7.900000 | \n",
" 0.520000 | \n",
" 0.260000 | \n",
" 2.200000 | \n",
" 0.079000 | \n",
" 14.000000 | \n",
" 38.000000 | \n",
" 0.996750 | \n",
" 3.310000 | \n",
" 0.620000 | \n",
" 10.200000 | \n",
" 6.000000 | \n",
"
\n",
" \n",
" 75% | \n",
" 9.200000 | \n",
" 0.640000 | \n",
" 0.420000 | \n",
" 2.600000 | \n",
" 0.090000 | \n",
" 21.000000 | \n",
" 62.000000 | \n",
" 0.997835 | \n",
" 3.400000 | \n",
" 0.730000 | \n",
" 11.100000 | \n",
" 6.000000 | \n",
"
\n",
" \n",
" max | \n",
" 15.900000 | \n",
" 1.580000 | \n",
" 1.000000 | \n",
" 15.500000 | \n",
" 0.611000 | \n",
" 72.000000 | \n",
" 289.000000 | \n",
" 1.003690 | \n",
" 4.010000 | \n",
" 2.000000 | \n",
" 14.900000 | \n",
" 8.000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" fixed acidity volatile acidity ... alcohol quality\n",
"count 1599.000000 1599.000000 ... 1599.000000 1599.000000\n",
"mean 8.319637 0.527821 ... 10.422983 5.636023\n",
"std 1.741096 0.179060 ... 1.065668 0.807569\n",
"min 4.600000 0.120000 ... 8.400000 3.000000\n",
"25% 7.100000 0.390000 ... 9.500000 5.000000\n",
"50% 7.900000 0.520000 ... 10.200000 6.000000\n",
"75% 9.200000 0.640000 ... 11.100000 6.000000\n",
"max 15.900000 1.580000 ... 14.900000 8.000000\n",
"\n",
"[8 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 20
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "t8Y53QPyu_fO"
},
"source": [
"Testowy Wykres (quality, volatile acidity)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 405
},
"id": "hEe3BYcJaKnF",
"outputId": "cd03275d-d09e-4517-ef76-22b40d9ffa9e"
},
"source": [
"fig = plt.figure(figsize = (10,6))\n",
"sns.barplot(x = 'quality', y = 'volatile acidity', data = wine)"
],
"execution_count": 21,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {
"tags": []
},
"execution_count": 21
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmEAAAFzCAYAAAB2A95GAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAYzUlEQVR4nO3de/TndV0n8OeLQQJZvNTMNgUUbEu2VKY2sRalnswCM9g1LWm18pjUrpSXcg4e91ja7p7jqFtbYi1rmV2UiNXCdowu3sqSGAQviNaEIjP5i0HzriHw2j9+36kfw1y+g/P5vX/z/T0e5/zO9/u5/L7fJ5/Dgefv/f5cqrsDAMDqOmZ0AACA9UgJAwAYQAkDABhACQMAGEAJAwAYQAkDABjg2NEBDtfGjRv7tNNOGx0DAOCQrr322tu6e9P+th11Jey0007Ljh07RscAADikqrr5QNtMRwIADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAxw1D3A+2i1devWLC0tZfPmzdm2bdvoOADAYErYKllaWsru3btHxwAA1gjTkQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAANMVsKq6ter6taqeu8BtldV/VJV7ayqd1fVw6bKAgCw1kw5EvYbSc45yPZzk5wx+7kwya9MmAUAYE2ZrIR199uSfOwgu5yf5Dd72TuSPKCqvmKqPAAAa8nIc8JOTnLLiuVds3UAAAvvqDgxv6ourKodVbVjz549o+MAAHzRRpaw3UlOXbF8ymzdPXT3pd29pbu3bNq0aVXCAQBMaWQJuzLJD8+uknx4kk9090cG5gEAWDXHTvXBVfXaJI9KsrGqdiX52ST3SZLu/tUk25M8NsnOJJ9N8tSpsgAArDWTlbDuvuAQ2zvJM6b6fgCAteyoODEfAGDRKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADKGEAAAMoYQAAAyhhAAADHDs6wBS++bm/OTrCPZx026eyIcmHb/vUmsp37Ut+eHQEAFiXjIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADKCEAQAMoIQBAAyghAEADDBpCauqc6rqA1W1s6ou3s/2r6qqN1fVdVX17qp67JR5AADWislKWFVtSHJJknOTnJnkgqo6c5/d/muSy7v7oUmelOQVU+UBAFhLphwJOyvJzu6+qbtvT3JZkvP32aeT3G/2/v5J/n7CPAAAa8aUJezkJLesWN41W7fSzyV5clXtSrI9yU/u74Oq6sKq2lFVO/bs2TNFVgCAVTX6xPwLkvxGd5+S5LFJfquq7pGpuy/t7i3dvWXTpk2rHhIA4EibsoTtTnLqiuVTZutWelqSy5Oku/8qyfFJNk6YCQBgTZiyhF2T5IyqOr2qjsvyifdX7rPPh5M8Okmq6t9luYSZbwQAFt5kJay770hyUZKrktyY5asgb6iqF1XVebPdfjrJ06vqXUlem+RHu7unygQAsFYcO+WHd/f2LJ9wv3LdC1a8f1+Ss6fMAACwFo0+MR8AYF1SwgAABlDCAAAGUMIAAAZQwgAABlDCAAAGUMIAAAaY9D5hMNLWrVuztLSUzZs3Z9u2baPjAMDdKGEsrKWlpezeve/jSgFgbTAdCQAwgBIGADCAEgYAMI
ASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwwLGjA6wXdx134t1eF82HX/SNoyPcwx0f+9Ikx+aOj928pvJ91QveMzoCAGuAErZKPnPGd4+OAACsIYecjqyqa6vqGVX1wNUIBACwHsxzTtgPJvnKJNdU1WVV9T1VVRPnAgBYaIcsYd29s7ufn+Rrk7wmya8nubmqXlhVXzp1QACARTTX1ZFV9eAkL0vykiT/N8kTk3wyyZumiwYAsLgOeWJ+VV2b5ONJfi3Jxd39T7NNV1fV2VOGAwBYVPNcHfnE7r5p5YqqOr27P9jdj58oFwDAQptnOvKKOdcBADCnA46EVdXXJfn6JPevqpUjXvdLcvzUwQAAFtnBpiMflORxSR6Q5PtWrP9UkqdPGQoAYNEdsIR19x8k+YOq+tbu/qtVzAQAsPAONh25tbu3Jfmhqrpg3+3d/VOTJgMAWGAHm468cfa6YzWCAACsJwebjnzD7PXVqxcHAGB9ONh05BuS9IG2d/d5kyQCAFgHDjYd+dLZ6+OTbE7y27PlC5L8w5ShAAAW3cGmI9+aJFX1su7esmLTG6rKeWIAAF+Eee6Yf2JV/Zu9C1V1epITp4sER8bG4+/Kl59wRzYef9foKABwD/M8O/LZSd5SVTclqSRfneTHJ00FR8DPPPjjoyMAwAEdsoR19x9V1RlJvm626v3d/U/TxgIAWGwHuzryO7v7Tfs8NzJJvqaq0t2vmzgbAMDCOthI2COTvCl3f27kXp1ECQMAuJcOdnXkz85en7p6cQAA1odDXh1ZVf+jqh6wYvmBVfXf5vnwqjqnqj5QVTur6uID7PMDVfW+qrqhql4zf3QAgKPXPLeoOLe7//kys+7+xySPPdQvVdWGJJckOTfJmUkuqKoz99nnjCTPS3J2d399kmcdRnYAgKPWPCVsQ1V9yd6FqjohyZccZP+9zkqys7tv6u7bk1yW5Px99nl6kktmxS7dfet8sQEAjm7zlLDfSfJnVfW0qnpakj9JMs9DvU9OcsuK5V2zdSt9bZKvraq3V9U7quqc/X1QVV1YVTuqaseePXvm+GoAgLVtnvuEvbiq3p3k0bNVP9/dVx3B7z8jyaOSnJLkbVX1jSunP2cZLk1yaZJs2bLlgA8VBwA4Wsxzx/x09xuTvPEwP3t3klNXLJ8yW7fSriRXd/cXknywqv4my6XsmsP8LmAN2Lp1a5aWlrJ58+Zs27ZtdByANW2eqyMfXlXXVNWnq+r2qrqzqj45x2dfk+SMqjq9qo5L8qQkV+6zz+9neRQsVbUxy9OTNx3WPwGwZiwtLWX37t1ZWloaHQVgzZvnnLCXJ7kgyd8mOSHJj2X5qseD6u47klyU5KokNya5vLtvqKoXVdV5s92uSvLRqnpfkjcneW53f/Tw/zEAAI4u805H7qyqDd19Z5JXVdV1Wb61xKF+b3uS7fuse8GK953kObMfAIB1Y54S9tnZdOL1VbUtyUcy3wgaAAAHME8Je0qWS9dFSZ6d5ZPtv3/KUMChnf3LZ4+OcA/Hffy4HJNjcsvHb1lT+d7+k28fHQHgHua5RcXNs7efT/LCaeMAAKwPphUBAAZQwgAABpi7hFXVfacMAhz9+r6du068K31fD7YAOJR5btb6bbP7eL1/tvxNVfWKyZMBR50vnP2F3P6Y2/OFs78wOgrAmjfPSNgvJPmeJB9Nku5+V5JHTBkKAGDRzTUd2d237LPqzgmyAACsG/PcJ+yWqvq2JF1V90nyzCw/hggAgHtpnpGwn0jyjCQnJ9md5CGzZQAA7qV5btZ6W5L/tApZAADWjQOWsKr65SQHvM68u39qkkQAAOvAwUbCdqxaCgCAdeaAJay7X72aQQAA1pODTUf+Ync/q6rekP1MS3b3eZMmAwBYYAebjvyt2etLVyMIAMB6crDpyGtnbx/S3f9r5baqemaSt04ZDABgkc
1zn7Af2c+6Hz3COQAA1pWDnRN2QZIfSnJ6VV25YtNJST42dTAADm3r1q1ZWlrK5s2bs23bttFxgMNwsHPC/jLJR5JsTPKyFes/leTdU4YCYD5LS0vZvXv36BjAvXCwc8JuTnJzkm9dvTgAAOvDIc8Jq6qHV9U1VfXpqrq9qu6sqk+uRjgAgEU1z4n5L09yQZK/TXJCkh9LcsmUoQAAFt0hH+CdJN29s6o2dPedSV5VVdcled600QDWlrc+4pGjI9zD547dkFTlc7t2ral8j3ybuxhx5CzqBSjzlLDPVtVxSa6vqm1ZPll/nhE0AIAv2qJegDJPmXpKkg1JLkrymSSnJvn+KUMBACy6Q46Eza6STJLPJXnhtHEAANaHg92s9T3Zz4O79+ruB0+SCIC5PaD7bq/A0eNgI2GPW7UUANwrT77zrtERgHvpUDdrTZJU1Zcn+ZbZ4l93961TBwMAWGTz3Kz1B5L8dZInJvmBJFdX1ROmDgYAsMjmuUXF85N8y97Rr6ralORPk1wxZTAAgEU2zy0qjtln+vGjc/4eAAAHMM9I2B9V1VVJXjtb/sEk26eLBACw+Oa5T9hzq+rxSb59turS7n79tLEAABbbIUtYVT0nye929+tWIQ8AwLowz7ldJyX546r686q6aHa7CgAAvgiHLGHd/cLu/vokz0jyFUneWlV/OnkyAIAFdjhXOd6aZCnLV0f+62niAACsD/PcrPW/VNVbkvxZki9L8nTPjQQA+OLMc4uKU5M8q7uvnzoMAMB6Mc8tKp63GkEAANYTd74HABhgnulIAGBm69atWVpayubNm7Nt27bRcTiKKWEAcBiWlpaye/fu0TEm89+f/ITREe7hY7d+Yvl16SNrKt/zf/uKL+r3TUcCAAyghAEADDBpCauqc6rqA1W1s6ouPsh+319VXVVbpswDALBWTFbCqmpDkkuSnJvkzCQXVNWZ+9nvpCTPTHL1VFkAANaaKU/MPyvJzu6+KUmq6rIk5yd53z77/XySFyd57oRZADgKvfyn3zA6wj18/LbP/PPrWsp30cu+b3QEDtOU05EnJ7llxfKu2bp/VlUPS3Jqd/+/CXMAAKw5w07Mr6pjkvzPJD89x74XVtWOqtqxZ8+e6cMBAExsyhK2O8vPndzrlNm6vU5K8g1J3lJVH0ry8CRX7u/k/O6+tLu3dPeWTZs2TRgZAGB1TFnCrklyRlWdXlXHJXlSkiv3buzuT3T3xu4+rbtPS/KOJOd1944JMwEArAmTlbDuviPJRUmuSnJjksu7+4aqelFVnTfV9wIAHA0mfWxRd29Psn2fdS84wL6PmjILAMBa4tmRAHAYTjzufnd7hXtLCQOAw3D21zx+dAQWhGdHAgAMoIQBAAyghAEADKCEAQAM4MR8AGBNO37DMXd7XRRKGACwpj30y04aHWESi1UpAQCOEkoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAShgAwABKGADAAEoYAMAAk5awqjqnqj5QVTur6uL9bH9OVb2vqt5dVX9WVV89ZR4AgLVishJWVRuSXJLk3CRnJrmgqs7cZ7frkmzp7gcnuSLJtqnyAACsJVOOhJ2VZGd339Tdtye5LMn5K3fo7jd392dni+9IcsqEeQAA1owpS9jJSW5Zsbxrtu5AnpbkjfvbUFUXVtWOqtqxZ8+eIxgRAGCMNXFiflU9OcmWJC/Z3/buvrS7t3T3lk2bNq1uOACACRw74WfvTnLqiuVTZuvupqq+K8nzkzyyu/9pwjwAAGvGlCNh1yQ5o6pOr6
rjkjwpyZUrd6iqhyb530nO6+5bJ8wCALCmTFbCuvuOJBcluSrJjUku7+4bqupFVXXebLeXJPlXSX6vqq6vqisP8HEAAAtlyunIdPf2JNv3WfeCFe+/a8rvBwBYq9bEifkAAOuNEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwgBIGADCAEgYAMIASBgAwwKQlrKrOqaoPVNXOqrp4P9u/pKp+d7b96qo6bco8AABrxWQlrKo2JLkkyblJzkxyQVWduc9uT0vyj939b5P8QpIXT5UHAGAtmXIk7KwkO7v7pu6+PcllSc7fZ5/zk7x69v6KJI+uqpowEwDAmjBlCTs5yS0rlnfN1u13n+6+I8knknzZhJkAANaE6u5pPrjqCUnO6e4fmy0/Jcm/7+6LVuzz3tk+u2bLfzfb57Z9PuvCJBfOFh+U5AOThJ7exiS3HXIvjiTHfPU55qvPMV99jvnqO1qP+Vd396b9bTh2wi/dneTUFcunzNbtb59dVXVskvsn+ei+H9Tdlya5dKKcq6aqdnT3ltE51hPHfPU55qvPMV99jvnqW8RjPuV05DVJzqiq06vquCRPSnLlPvtcmeRHZu+fkORNPdXQHADAGjLZSFh331FVFyW5KsmGJL/e3TdU1YuS7OjuK5P8WpLfqqqdST6W5aIGALDwppyOTHdvT7J9n3UvWPH+80meOGWGNeaon1I9Cjnmq88xX32O+epzzFffwh3zyU7MBwDgwDy2CABgACVsYlV1fFX9dVW9q6puqKoXjs60XlTVhqq6rqr+cHSW9aCqPlRV76mq66tqx+g860FVPaCqrqiq91fVjVX1raMzLbKqetDs3++9P5+sqmeNzrXoqurZs/9/vreqXltVx4/OdKSYjpzY7AkAJ3b3p6vqPkn+Iskzu/sdg6MtvKp6TpItSe7X3Y8bnWfRVdWHkmzZ9z5/TKeqXp3kz7v7lbOr0O/b3R8fnWs9mD2ab3eW72158+g8i6qqTs7y/zfP7O7PVdXlSbZ392+MTXZkGAmbWC/79GzxPrMfzXdiVXVKku9N8srRWWAKVXX/JI/I8lXm6e7bFbBV9egkf6eArYpjk5wwu5/ofZP8/eA8R4wStgpm02LXJ7k1yZ9099WjM60Dv5hka5K7RgdZRzrJH1fVtbOnXDCt05PsSfKq2bT7K6vqxNGh1pEnJXnt6BCLrrt3J3lpkg8n+UiST3T3H49NdeQoYaugu+/s7odk+akBZ1XVN4zOtMiq6nFJbu3ua0dnWWe+vbsfluTcJM+oqkeMDrTgjk3ysCS/0t0PTfKZJBePjbQ+zKZ+z0vye6OzLLqqemCS87P8R8dXJjmxqp48NtWRo4StotlUwZuTnDM6y4I7O8l5s3OULkvynVX122MjLb7ZX6zp7luTvD7JWWMTLbxdSXatGFm/IsuljOmdm+Sd3f0Po4OsA9+V5IPdvae7v5DkdUm+bXCmI0YJm1hVbaqqB8zen5DkMUnePzbVYuvu53X3Kd19WpanDN7U3Qvzl9NaVFUnVtVJe98n+e4k7x2barF191KSW6rqQbNVj07yvoGR1pMLYipytXw4ycOr6r6zC90eneTGwZmOmEnvmE+S5CuSvHp2Jc0xSS7vbrdMYNF8eZLXL/83MscmeU13/9HYSOvCTyb5ndn02E1Jnjo4z8Kb/ZHxmCQ/PjrLetDdV1fVFUnemeSOJNdlge6c7xYVAAADmI4EABhACQMAGEAJAwAYQA
kDABhACQMAGEAJA5ipqtOq6r2z91uq6pdm7x9VVQtzg0hgbXCfMID96O4dSXbMFh+V5NNJ/nJYIGDhGAkDFkJVPb+q/qaq/qKqXltVP1NVb6mqLbPtG2ePsto74vXnVfXO2c89Rrlmo19/WFWnJfmJJM+uquur6juq6oNVdZ/ZfvdbuQwwLyNhwFGvqr45y4+oekiW/7v2ziQHe4D7rUke092fr6ozsvwImi3727G7P1RVv5rk09390tn3vSXJ9yb5/dn3vm72XDuAuRkJAxbBdyR5fXd/trs/meTKQ+x/nyT/p6rek+T3kpx5mN/3yvzLI4KemuRVh/n7AEbCgIV2R/7lj83jV6x/dpJ/SPJNs+2fP5wP7e63z6Y0H5VkQ3d7WDlw2IyEAYvgbUn+Q1WdUFUnJfm+2foPJfnm2fsnrNj//kk+0t13JXlKkg2H+PxPJTlpn3W/meQ1MQoG3EtKGHDU6+53JvndJO9K8sYk18w2vTTJf66q65JsXPErr0jyI1X1riRfl+Qzh/iKNyT5j3tPzJ+t+50kD8zy+WQAh626e3QGgCOqqn4uK06kn+g7npDk/O5+ylTfASw254QBHKaq+uUk5yZ57OgswNHLSBgAwADOCQMAGEAJAwAYQAkDABhACQMAGEAJAwAYQAkDABjg/wPirYE4+7Ki1QAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "M4hd_N8EgH57"
},
"source": [
"## 3.4. Dev set (bash)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "XT3hrfW3gOxH",
"outputId": "98ef6303-7f2b-4341-e6ad-c19af8750ccc"
},
"source": [
"wine_dev_bash"
],
"execution_count": 22,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>fixed acidity</th><th>volatile acidity</th><th>citric acid</th><th>residual sugar</th><th>chlorides</th><th>free sulfur dioxide</th><th>total sulfur dioxide</th><th>density</th><th>pH</th><th>sulphates</th><th>alcohol</th><th>quality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>8.0</td><td>0.705</td><td>0.05</td><td>1.9</td><td>0.074</td><td>8.0</td><td>19.0</td><td>0.99620</td><td>3.34</td><td>0.95</td><td>10.5</td><td>6</td></tr>\n",
"    <tr><th>1</th><td>7.6</td><td>0.665</td><td>0.10</td><td>1.5</td><td>0.066</td><td>27.0</td><td>55.0</td><td>0.99655</td><td>3.39</td><td>0.51</td><td>9.3</td><td>5</td></tr>\n",
"    <tr><th>2</th><td>7.8</td><td>0.550</td><td>0.35</td><td>2.2</td><td>0.074</td><td>21.0</td><td>66.0</td><td>0.99740</td><td>3.25</td><td>0.56</td><td>9.2</td><td>5</td></tr>\n",
"    <tr><th>3</th><td>13.0</td><td>0.320</td><td>0.65</td><td>2.6</td><td>0.093</td><td>15.0</td><td>47.0</td><td>0.99960</td><td>3.05</td><td>0.61</td><td>10.6</td><td>5</td></tr>\n",
"    <tr><th>4</th><td>8.8</td><td>0.610</td><td>0.30</td><td>2.8</td><td>0.088</td><td>17.0</td><td>46.0</td><td>0.99760</td><td>3.26</td><td>0.51</td><td>9.3</td><td>4</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>261</th><td>13.8</td><td>0.490</td><td>0.67</td><td>3.0</td><td>0.093</td><td>6.0</td><td>15.0</td><td>0.99860</td><td>3.02</td><td>0.93</td><td>12.0</td><td>6</td></tr>\n",
"    <tr><th>262</th><td>7.1</td><td>0.750</td><td>0.01</td><td>2.2</td><td>0.059</td><td>11.0</td><td>18.0</td><td>0.99242</td><td>3.39</td><td>0.40</td><td>12.8</td><td>6</td></tr>\n",
"    <tr><th>263</th><td>9.9</td><td>0.350</td><td>0.41</td><td>2.3</td><td>0.083</td><td>11.0</td><td>61.0</td><td>0.99820</td><td>3.21</td><td>0.50</td><td>9.5</td><td>5</td></tr>\n",
"    <tr><th>264</th><td>6.5</td><td>0.520</td><td>0.11</td><td>1.8</td><td>0.073</td><td>13.0</td><td>38.0</td><td>0.99550</td><td>3.34</td><td>0.52</td><td>9.3</td><td>5</td></tr>\n",
"    <tr><th>265</th><td>6.8</td><td>0.670</td><td>0.00</td><td>1.9</td><td>0.080</td><td>22.0</td><td>39.0</td><td>0.99701</td><td>3.40</td><td>0.74</td><td>9.7</td><td>5</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>266 rows × 12 columns</p>\n",
"</div>"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 8.0 0.705 0.05 ... 0.95 10.5 6\n",
"1 7.6 0.665 0.10 ... 0.51 9.3 5\n",
"2 7.8 0.550 0.35 ... 0.56 9.2 5\n",
"3 13.0 0.320 0.65 ... 0.61 10.6 5\n",
"4 8.8 0.610 0.30 ... 0.51 9.3 4\n",
".. ... ... ... ... ... ... ...\n",
"261 13.8 0.490 0.67 ... 0.93 12.0 6\n",
"262 7.1 0.750 0.01 ... 0.40 12.8 6\n",
"263 9.9 0.350 0.41 ... 0.50 9.5 5\n",
"264 6.5 0.520 0.11 ... 0.52 9.3 5\n",
"265 6.8 0.670 0.00 ... 0.74 9.7 5\n",
"\n",
"[266 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 22
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "lhRktuxPgOsC",
"outputId": "612e6163-0b66-4495-fdc1-2a0813efe37e"
},
"source": [
"wine_dev_bash[\"quality\"].value_counts()"
],
"execution_count": 23,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"5 115\n",
"6 113\n",
"7 24\n",
"4 9\n",
"8 3\n",
"3 2\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 23
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "FmOQIZMSgOnK",
"outputId": "a7f4b4e8-36a0-4a07-cce4-98caa71ff7d0"
},
"source": [
"wine_dev_bash.describe(include='all')"
],
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>fixed acidity</th><th>volatile acidity</th><th>citric acid</th><th>residual sugar</th><th>chlorides</th><th>free sulfur dioxide</th><th>total sulfur dioxide</th><th>density</th><th>pH</th><th>sulphates</th><th>alcohol</th><th>quality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td><td>266.000000</td></tr>\n",
"    <tr><th>mean</th><td>8.273684</td><td>0.540075</td><td>0.253008</td><td>2.523308</td><td>0.088620</td><td>15.398496</td><td>43.973684</td><td>0.996749</td><td>3.317895</td><td>0.649774</td><td>10.453321</td><td>5.590226</td></tr>\n",
"    <tr><th>std</th><td>1.720592</td><td>0.193856</td><td>0.190330</td><td>1.380498</td><td>0.055825</td><td>10.002219</td><td>30.518712</td><td>0.001930</td><td>0.152003</td><td>0.176930</td><td>1.058010</td><td>0.777841</td></tr>\n",
"    <tr><th>min</th><td>4.900000</td><td>0.120000</td><td>0.000000</td><td>1.300000</td><td>0.012000</td><td>1.000000</td><td>8.000000</td><td>0.990640</td><td>2.870000</td><td>0.330000</td><td>8.500000</td><td>3.000000</td></tr>\n",
"    <tr><th>25%</th><td>7.100000</td><td>0.396250</td><td>0.080000</td><td>1.900000</td><td>0.068250</td><td>8.000000</td><td>20.000000</td><td>0.995525</td><td>3.210000</td><td>0.542500</td><td>9.500000</td><td>5.000000</td></tr>\n",
"    <tr><th>50%</th><td>7.900000</td><td>0.520000</td><td>0.240000</td><td>2.200000</td><td>0.079000</td><td>13.000000</td><td>37.000000</td><td>0.996720</td><td>3.320000</td><td>0.620000</td><td>10.200000</td><td>6.000000</td></tr>\n",
"    <tr><th>75%</th><td>9.200000</td><td>0.648750</td><td>0.390000</td><td>2.600000</td><td>0.090000</td><td>20.000000</td><td>60.000000</td><td>0.997877</td><td>3.430000</td><td>0.720000</td><td>11.200000</td><td>6.000000</td></tr>\n",
"    <tr><th>max</th><td>15.600000</td><td>1.580000</td><td>0.760000</td><td>13.800000</td><td>0.611000</td><td>66.000000</td><td>141.000000</td><td>1.003150</td><td>3.720000</td><td>1.950000</td><td>14.000000</td><td>8.000000</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" fixed acidity volatile acidity ... alcohol quality\n",
"count 266.000000 266.000000 ... 266.000000 266.000000\n",
"mean 8.273684 0.540075 ... 10.453321 5.590226\n",
"std 1.720592 0.193856 ... 1.058010 0.777841\n",
"min 4.900000 0.120000 ... 8.500000 3.000000\n",
"25% 7.100000 0.396250 ... 9.500000 5.000000\n",
"50% 7.900000 0.520000 ... 10.200000 6.000000\n",
"75% 9.200000 0.648750 ... 11.200000 6.000000\n",
"max 15.600000 1.580000 ... 14.000000 8.000000\n",
"\n",
"[8 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 24
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 405
},
"id": "j3Z6noeZgOjC",
"outputId": "de24703b-50d4-4059-d5e6-ddc0c0f3356c"
},
"source": [
"fig = plt.figure(figsize = (10,6))\n",
"sns.barplot(x = 'quality', y = 'volatile acidity', data = wine_dev_bash)"
],
"execution_count": 25,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
""
]
},
"metadata": {
"tags": []
},
"execution_count": 25
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ftWOC-do2Pq-"
},
"source": [
"# 4. Normalization"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Wm0EM2hj4s6V"
},
"source": [
"Min-max normalization of the 'quality' column to the range 0 to 20. It is not strictly necessary; it is included for demonstration purposes."
]
},
{
"cell_type": "code",
"metadata": {
"id": "EkZQ6Hpy2Tj_"
},
"source": [
"# Min-max scale 'quality' to the range [0, 20]\n",
"quality = wine[\"quality\"]\n",
"wine[\"quality\"] = (quality - quality.min()) / (quality.max() - quality.min()) * 20"
],
"execution_count": 26,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "_bQgYfct3Tir",
"outputId": "8b50d411-b47b-4d4d-d3eb-606d7c134de0"
},
"source": [
"wine"
],
"execution_count": 27,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>fixed acidity</th><th>volatile acidity</th><th>citric acid</th><th>residual sugar</th><th>chlorides</th><th>free sulfur dioxide</th><th>total sulfur dioxide</th><th>density</th><th>pH</th><th>sulphates</th><th>alcohol</th><th>quality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>7.4</td><td>0.700</td><td>0.00</td><td>1.9</td><td>0.076</td><td>11.0</td><td>34.0</td><td>0.99780</td><td>3.51</td><td>0.56</td><td>9.4</td><td>8.0</td></tr>\n",
"    <tr><th>1</th><td>7.8</td><td>0.880</td><td>0.00</td><td>2.6</td><td>0.098</td><td>25.0</td><td>67.0</td><td>0.99680</td><td>3.20</td><td>0.68</td><td>9.8</td><td>8.0</td></tr>\n",
"    <tr><th>2</th><td>7.8</td><td>0.760</td><td>0.04</td><td>2.3</td><td>0.092</td><td>15.0</td><td>54.0</td><td>0.99700</td><td>3.26</td><td>0.65</td><td>9.8</td><td>8.0</td></tr>\n",
"    <tr><th>3</th><td>11.2</td><td>0.280</td><td>0.56</td><td>1.9</td><td>0.075</td><td>17.0</td><td>60.0</td><td>0.99800</td><td>3.16</td><td>0.58</td><td>9.8</td><td>12.0</td></tr>\n",
"    <tr><th>4</th><td>7.4</td><td>0.700</td><td>0.00</td><td>1.9</td><td>0.076</td><td>11.0</td><td>34.0</td><td>0.99780</td><td>3.51</td><td>0.56</td><td>9.4</td><td>8.0</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>1594</th><td>6.2</td><td>0.600</td><td>0.08</td><td>2.0</td><td>0.090</td><td>32.0</td><td>44.0</td><td>0.99490</td><td>3.45</td><td>0.58</td><td>10.5</td><td>8.0</td></tr>\n",
"    <tr><th>1595</th><td>5.9</td><td>0.550</td><td>0.10</td><td>2.2</td><td>0.062</td><td>39.0</td><td>51.0</td><td>0.99512</td><td>3.52</td><td>0.76</td><td>11.2</td><td>12.0</td></tr>\n",
"    <tr><th>1596</th><td>6.3</td><td>0.510</td><td>0.13</td><td>2.3</td><td>0.076</td><td>29.0</td><td>40.0</td><td>0.99574</td><td>3.42</td><td>0.75</td><td>11.0</td><td>12.0</td></tr>\n",
"    <tr><th>1597</th><td>5.9</td><td>0.645</td><td>0.12</td><td>2.0</td><td>0.075</td><td>32.0</td><td>44.0</td><td>0.99547</td><td>3.57</td><td>0.71</td><td>10.2</td><td>8.0</td></tr>\n",
"    <tr><th>1598</th><td>6.0</td><td>0.310</td><td>0.47</td><td>3.6</td><td>0.067</td><td>18.0</td><td>42.0</td><td>0.99549</td><td>3.39</td><td>0.66</td><td>11.0</td><td>12.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>1599 rows × 12 columns</p>\n",
"</div>"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 7.4 0.700 0.00 ... 0.56 9.4 8.0\n",
"1 7.8 0.880 0.00 ... 0.68 9.8 8.0\n",
"2 7.8 0.760 0.04 ... 0.65 9.8 8.0\n",
"3 11.2 0.280 0.56 ... 0.58 9.8 12.0\n",
"4 7.4 0.700 0.00 ... 0.56 9.4 8.0\n",
"... ... ... ... ... ... ... ...\n",
"1594 6.2 0.600 0.08 ... 0.58 10.5 8.0\n",
"1595 5.9 0.550 0.10 ... 0.76 11.2 12.0\n",
"1596 6.3 0.510 0.13 ... 0.75 11.0 12.0\n",
"1597 5.9 0.645 0.12 ... 0.71 10.2 8.0\n",
"1598 6.0 0.310 0.47 ... 0.66 11.0 12.0\n",
"\n",
"[1599 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 27
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "I1AwZoyN4RHs",
"outputId": "15a7bca4-8bbe-4749-80b8-5eede667aa07"
},
"source": [
"wine[\"quality\"].value_counts()"
],
"execution_count": 28,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"8.0 681\n",
"12.0 638\n",
"16.0 199\n",
"4.0 53\n",
"20.0 18\n",
"0.0 10\n",
"Name: quality, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 28
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XBU3z_of414w"
},
"source": [
"# 5. Removing artifacts"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KCstRwQp5-X1"
},
"source": [
"### Fortunately, my dataset contains neither empty lines nor examples with invalid values"
]
},
{
"cell_type": "code",
"metadata": {
"id": "EJqksTP545UV"
},
"source": [
"# Look for empty lines:\n",
"! grep -P \"^$\" -n winequality-red.csv"
],
"execution_count": 29,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "8DuoPn3Fa0kP"
},
"source": [
"Checking for \"NA\" values: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "REYF2AWjz_lr",
"outputId": "01c5cd70-a37e-433f-bde3-d0c855c96c2e"
},
"source": [
"wine.isnull().sum()"
],
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"fixed acidity 0\n",
"volatile acidity 0\n",
"citric acid 0\n",
"residual sugar 0\n",
"chlorides 0\n",
"free sulfur dioxide 0\n",
"total sulfur dioxide 0\n",
"density 0\n",
"pH 0\n",
"sulphates 0\n",
"alcohol 0\n",
"quality 0\n",
"dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 30
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "RbkqNj9_akcU"
},
"source": [
"wine.dropna(inplace=True) "
],
"execution_count": 31,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 419
},
"id": "4WylJo9malyG",
"outputId": "95a9b3f4-a7f5-4f61-fdbe-918dbca2d72c"
},
"source": [
"wine"
],
"execution_count": 32,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>fixed acidity</th><th>volatile acidity</th><th>citric acid</th><th>residual sugar</th><th>chlorides</th><th>free sulfur dioxide</th><th>total sulfur dioxide</th><th>density</th><th>pH</th><th>sulphates</th><th>alcohol</th><th>quality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>7.4</td><td>0.700</td><td>0.00</td><td>1.9</td><td>0.076</td><td>11.0</td><td>34.0</td><td>0.99780</td><td>3.51</td><td>0.56</td><td>9.4</td><td>8.0</td></tr>\n",
"    <tr><th>1</th><td>7.8</td><td>0.880</td><td>0.00</td><td>2.6</td><td>0.098</td><td>25.0</td><td>67.0</td><td>0.99680</td><td>3.20</td><td>0.68</td><td>9.8</td><td>8.0</td></tr>\n",
"    <tr><th>2</th><td>7.8</td><td>0.760</td><td>0.04</td><td>2.3</td><td>0.092</td><td>15.0</td><td>54.0</td><td>0.99700</td><td>3.26</td><td>0.65</td><td>9.8</td><td>8.0</td></tr>\n",
"    <tr><th>3</th><td>11.2</td><td>0.280</td><td>0.56</td><td>1.9</td><td>0.075</td><td>17.0</td><td>60.0</td><td>0.99800</td><td>3.16</td><td>0.58</td><td>9.8</td><td>12.0</td></tr>\n",
"    <tr><th>4</th><td>7.4</td><td>0.700</td><td>0.00</td><td>1.9</td><td>0.076</td><td>11.0</td><td>34.0</td><td>0.99780</td><td>3.51</td><td>0.56</td><td>9.4</td><td>8.0</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>1594</th><td>6.2</td><td>0.600</td><td>0.08</td><td>2.0</td><td>0.090</td><td>32.0</td><td>44.0</td><td>0.99490</td><td>3.45</td><td>0.58</td><td>10.5</td><td>8.0</td></tr>\n",
"    <tr><th>1595</th><td>5.9</td><td>0.550</td><td>0.10</td><td>2.2</td><td>0.062</td><td>39.0</td><td>51.0</td><td>0.99512</td><td>3.52</td><td>0.76</td><td>11.2</td><td>12.0</td></tr>\n",
"    <tr><th>1596</th><td>6.3</td><td>0.510</td><td>0.13</td><td>2.3</td><td>0.076</td><td>29.0</td><td>40.0</td><td>0.99574</td><td>3.42</td><td>0.75</td><td>11.0</td><td>12.0</td></tr>\n",
"    <tr><th>1597</th><td>5.9</td><td>0.645</td><td>0.12</td><td>2.0</td><td>0.075</td><td>32.0</td><td>44.0</td><td>0.99547</td><td>3.57</td><td>0.71</td><td>10.2</td><td>8.0</td></tr>\n",
"    <tr><th>1598</th><td>6.0</td><td>0.310</td><td>0.47</td><td>3.6</td><td>0.067</td><td>18.0</td><td>42.0</td><td>0.99549</td><td>3.39</td><td>0.66</td><td>11.0</td><td>12.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>1599 rows × 12 columns</p>\n",
"</div>"
],
"text/plain": [
" fixed acidity volatile acidity citric acid ... sulphates alcohol quality\n",
"0 7.4 0.700 0.00 ... 0.56 9.4 8.0\n",
"1 7.8 0.880 0.00 ... 0.68 9.8 8.0\n",
"2 7.8 0.760 0.04 ... 0.65 9.8 8.0\n",
"3 11.2 0.280 0.56 ... 0.58 9.8 12.0\n",
"4 7.4 0.700 0.00 ... 0.56 9.4 8.0\n",
"... ... ... ... ... ... ... ...\n",
"1594 6.2 0.600 0.08 ... 0.58 10.5 8.0\n",
"1595 5.9 0.550 0.10 ... 0.76 11.2 12.0\n",
"1596 6.3 0.510 0.13 ... 0.75 11.0 12.0\n",
"1597 5.9 0.645 0.12 ... 0.71 10.2 8.0\n",
"1598 6.0 0.310 0.47 ... 0.66 11.0 12.0\n",
"\n",
"[1599 rows x 12 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 32
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "iqsJ9Bfngy-m"
},
"source": [
""
],
"execution_count": null,
"outputs": []
}
]
}