naive-bayes-gaussian/naive_bayes.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "TQqrOdkY6nsy"
},
"source": [
"# **Classification with the Gaussian naive Bayes method**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AlcfRFCPSXIj"
},
"source": [
"# **Bayes' theorem**\n",
"$$P(A \\mid B) = \\frac{P(B \\mid A)\\,P(A)}{P(B)}$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rcpTnWjOh5dq"
},
"source": [
"P(A) is the prior probability of class A, i.e. the probability that an arbitrary example belongs to class A.\n",
"\n",
"P(B|A) is the likelihood: the probability of observing example B given that it belongs to class A.\n",
"\n",
"P(A|B) is the posterior probability that example B belongs to class A.\n",
"\n",
"P(B) is the prior probability of observing example B (the evidence)."
]
},
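To make the roles of the prior, likelihood, and evidence concrete, here is a small worked example of Bayes' theorem. All numbers are hypothetical and chosen only for illustration; they do not come from the notebook's data.

```python
# Hypothetical probabilities, chosen only for illustration.
p_a = 0.3              # P(A): prior probability of class A
p_b_given_a = 0.8      # P(B|A): likelihood of example B under class A
p_b_given_not_a = 0.2  # P(B|not A): likelihood of B under the other class

# The law of total probability gives the evidence P(B).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b, p_a_given_b)
```

Observing B raises the probability of class A from the prior 0.3 to the posterior 0.24 / 0.38, because B is four times more likely under A than under the alternative.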
2022-05-17 23:08:12 +02:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**The naive Bayes classifier is _naive_ because it assumes that the individual features are independent of one another given the class.**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SSaJsYOhz8h8"
},
"source": [
"[Figure: rozklady.jpg (distributions)]"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sepal.length</th>\n",
" <th>sepal.width</th>\n",
" <th>petal.length</th>\n",
" <th>petal.width</th>\n",
" <th>variety</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>5.1</td>\n",
" <td>3.5</td>\n",
" <td>1.4</td>\n",
" <td>0.2</td>\n",
" <td>Setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>4.9</td>\n",
" <td>3.0</td>\n",
" <td>1.4</td>\n",
" <td>0.2</td>\n",
" <td>Setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>4.7</td>\n",
" <td>3.2</td>\n",
" <td>1.3</td>\n",
" <td>0.2</td>\n",
" <td>Setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4.6</td>\n",
" <td>3.1</td>\n",
" <td>1.5</td>\n",
" <td>0.2</td>\n",
" <td>Setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5.0</td>\n",
" <td>3.6</td>\n",
" <td>1.4</td>\n",
" <td>0.2</td>\n",
" <td>Setosa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>145</th>\n",
" <td>6.7</td>\n",
" <td>3.0</td>\n",
" <td>5.2</td>\n",
" <td>2.3</td>\n",
" <td>Virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>146</th>\n",
" <td>6.3</td>\n",
" <td>2.5</td>\n",
" <td>5.0</td>\n",
" <td>1.9</td>\n",
" <td>Virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>147</th>\n",
" <td>6.5</td>\n",
" <td>3.0</td>\n",
" <td>5.2</td>\n",
" <td>2.0</td>\n",
" <td>Virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>148</th>\n",
" <td>6.2</td>\n",
" <td>3.4</td>\n",
" <td>5.4</td>\n",
" <td>2.3</td>\n",
" <td>Virginica</td>\n",
" </tr>\n",
" <tr>\n",
" <th>149</th>\n",
" <td>5.9</td>\n",
" <td>3.0</td>\n",
" <td>5.1</td>\n",
" <td>1.8</td>\n",
" <td>Virginica</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>150 rows × 5 columns</p>\n",
"</div>"
],
"text/plain": [
"     sepal.length  sepal.width  petal.length  petal.width    variety\n",
"0             5.1          3.5           1.4          0.2     Setosa\n",
"1             4.9          3.0           1.4          0.2     Setosa\n",
"2             4.7          3.2           1.3          0.2     Setosa\n",
"3             4.6          3.1           1.5          0.2     Setosa\n",
"4             5.0          3.6           1.4          0.2     Setosa\n",
"..            ...          ...           ...          ...        ...\n",
"145           6.7          3.0           5.2          2.3  Virginica\n",
"146           6.3          2.5           5.0          1.9  Virginica\n",
"147           6.5          3.0           5.2          2.0  Virginica\n",
"148           6.2          3.4           5.4          2.3  Virginica\n",
"149           5.9          3.0           5.1          1.8  Virginica\n",
"\n",
"[150 rows x 5 columns]"
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"iris_data = pd.read_csv(\"iris.csv\")\n",
"iris_data"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x7efc44f734f0>"
]
},
"execution_count": 106,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"fig = plt.figure()\n",
"ax = fig.add_subplot(111)\n",
"setosa = iris_data[:50]\n",
"versicolor = iris_data[50:100]\n",
"virginica = iris_data[100:150]\n",
"# ax.scatter(setosa['sepal.length'],np.arange(50), color='blue', lw=2)\n",
"# ax.scatter(versicolor['sepal.length'],np.arange(50), color='orange', lw=2)\n",
"# ax.scatter(virginica['sepal.length'],np.arange(50), color='green', lw=2) \n",
"\n",
"ax.scatter(setosa['petal.width'],np.arange(50), color='blue', lw=2)\n",
"ax.scatter(versicolor['petal.width'],np.arange(50), color='orange', lw=2)\n",
"ax.scatter(virginica['petal.width'],np.arange(50), color='green', lw=2) "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Yabcm4Rei2ue"
},
"source": [
"[Figure: GaussianNB.png]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dsf6FnlgjiOL"
},
"source": [
"# Probability density function of the normal distribution\n",
"$$f_{\\mu,\\sigma}(x) = \\frac{1}{\\sigma\\sqrt{2\\pi}}\\,\\exp\\left(\\frac{-(x-\\mu)^{2}}{2\\sigma^{2}}\\right)$$"
]
},
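The normal density named in the heading above can be sanity-checked numerically. The sketch below is a standalone, standard-library version (the function name `gaussian_pdf` is an assumption made for this illustration, not part of the notebook). It checks two known properties: the peak of a standard normal equals 1/sqrt(2*pi), and the density integrates to roughly 1.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) at x; `var` is the variance (sigma squared)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Peak of the standard normal (mean 0, variance 1).
peak = gaussian_pdf(0.0, 0.0, 1.0)

# Riemann sum over [-8, 8]; the tails beyond that range are negligible.
step = 0.001
area = sum(gaussian_pdf(-8.0 + i * step, 0.0, 1.0) for i in range(16001)) * step

print(peak, area)
```

Note that the exponent carries a single factor of one half, in the form -(x - mean)^2 / (2 * var). Applying the half twice would silently evaluate a density with doubled variance, and the area check would then fail.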
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "v0oeHebytjNp"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import scipy.stats as stats\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"from sklearn.model_selection import train_test_split\n",
"sns.set(style=\"whitegrid\")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "fOYTA3VVtjNw"
},
"outputs": [],
"source": [
"class NaiveBayesClassifier():\n",
"    def calc_prior(self, features, target):\n",
"        '''\n",
"        Compute the prior probability of each class\n",
"        '''\n",
"        self.prior = (features.groupby(target).apply(lambda x: len(x)) / self.rows).to_numpy()\n",
"\n",
"        return self.prior\n",
"\n",
"    def calc_statistics(self, features, target):\n",
"        '''\n",
"        Compute the per-class mean and variance of each feature\n",
"        '''\n",
"        self.mean = features.groupby(target).mean().to_numpy()\n",
"        # ddof=0 gives the population variance, matching np.var's default\n",
"        self.var = features.groupby(target).var(ddof=0).to_numpy()\n",
"\n",
"        return self.mean, self.var\n",
"\n",
"    def gaussian_density(self, class_idx, x):\n",
"        '''\n",
"        Evaluate the normal density of x for the given class:\n",
"        (1 / sqrt(2*pi*σ²)) * exp(-(x-μ)² / (2*σ²))\n",
"        μ - mean\n",
"        σ² - variance\n",
"        σ - standard deviation\n",
"        '''\n",
"        mean = self.mean[class_idx]\n",
"        var = self.var[class_idx]\n",
"\n",
"        numerator = np.exp(-((x - mean) ** 2) / (2 * var))  # Numerator of the normal density formula\n",
"        denominator = np.sqrt(2 * np.pi * var)  # Denominator of the normal density formula\n",
"        prob = numerator / denominator\n",
"\n",
"        return prob\n",
"\n",
"    def classify(self, x):\n",
"        '''\n",
"        Compute the posterior for each class and return the class with the highest one\n",
"        '''\n",
"        posteriors = []\n",
"        posteriors_no_log = []\n",
"\n",
"        # Calculate the posterior probability for each class\n",
"        for i in range(self.count):\n",
"            prior = np.log(self.prior[i])  # Log-probabilities are used for prediction\n",
"            prior_no_log = self.prior[i]  # The plain probability is computed so it can be returned with the prediction\n",
"\n",
"            conditional = np.sum(np.log(self.gaussian_density(i, x)))\n",
"            conditional_no_log = np.prod(self.gaussian_density(i, x))\n",
"\n",
"            posterior = prior + conditional\n",
"            posterior_no_log = prior_no_log * conditional_no_log\n",
"\n",
"            posteriors.append(posterior)\n",
"            posteriors_no_log.append(posterior_no_log)\n",
"\n",
"        # Return the most probable class together with its unnormalized posterior (prior * likelihood)\n",
"        return self.classes[np.argmax(posteriors)], np.max(posteriors_no_log)\n",
"\n",
"    def fit(self, features, target):\n",
"        '''\n",
"        Main method that trains the model\n",
"        '''\n",
"        self.classes = np.unique(target)\n",
"        self.count = len(self.classes)\n",
"        self.feature_nums = features.shape[1]\n",
"        self.rows = features.shape[0]\n",
"\n",
"        self.calc_statistics(features, target)\n",
"        self.calc_prior(features, target)\n",
"\n",
"    def predict(self, features):\n",
"        '''\n",
"        Predict a value for every row\n",
"        '''\n",
"        preds = [self.classify(f) for f in features.to_numpy()]\n",
"        return preds\n",
"\n",
"    def accuracy(self, y_test, y_pred):\n",
"        '''\n",
"        Compute the accuracy of the model\n",
"        '''\n",
"        accuracy = np.sum(y_test == y_pred) / len(y_test)\n",
"        return accuracy\n",
"\n",
"    def visualize(self, y_true, y_pred, target):\n",
"        '''\n",
"        Plot the distributions of the true and predicted classes side by side\n",
"        '''\n",
"        tr = pd.DataFrame(data=y_true, columns=[target])\n",
"        pr = pd.DataFrame(data=y_pred, columns=[target])\n",
"\n",
"        fig, ax = plt.subplots(1, 2, sharex='col', sharey='row', figsize=(15,6))\n",
"\n",
"        sns.countplot(x=target, data=tr, ax=ax[0], alpha=0.7, hue=target, dodge=False)\n",
"        sns.countplot(x=target, data=pr, ax=ax[1], alpha=0.7, hue=target, dodge=False)\n",
"\n",
"        ax[0].tick_params(labelsize=12)\n",
"        ax[1].tick_params(labelsize=12)\n",
"        ax[0].set_title(\"True values\", fontsize=18)\n",
"        ax[1].set_title(\"Predictions\", fontsize=18)\n",
"        plt.show()\n"
]
},
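The classifier above ranks classes by a sum of log-densities rather than by a product of densities. A minimal sketch of why that matters follows; the density values are hypothetical and exaggerated for illustration. Multiplying many small numbers underflows to zero in floating point, while the equivalent sum of logarithms stays finite and still allows classes to be compared.

```python
import math

# Hypothetical per-feature densities for one class: 100 features, each very small.
densities = [1e-5] * 100

# The direct product is 1e-500, far below the smallest positive double, so it becomes 0.0.
product = math.prod(densities)

# The sum of logs represents the same quantity without underflow.
log_sum = sum(math.log(d) for d in densities)

print(product, log_sum)
```

With only a handful of features, as in the datasets used in this notebook, the plain product is usually still representable, which is presumably why the class can also return a non-log value alongside the prediction.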
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Water potability"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 382
},
"id": "5-riUAGntjN2",
"outputId": "f87f047d-bc71-41ef-a43a-17b6f7cf84c3"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(2948, 9) (2948,)\n",
"(328, 9) (328,)\n"
]
}
],
"source": [
"# Data preprocessing\n",
"\n",
"# Fill missing values in each feature column with the mean of the row's Potability class\n",
"def fill_nan(df):\n",
"    for column in df.columns[:9]:\n",
"        df[column] = df[column].fillna(df.groupby('Potability')[column].transform('mean'))\n",
"    return df\n",
"\n",
"# Load the data\n",
"df = pd.read_csv(\"water_potability.csv\")\n",
"\n",
"df = fill_nan(df)\n",
"\n",
"# Shuffle the rows of the dataset\n",
"df = df.sample(frac=1, random_state=10).reset_index(drop=True)\n",
"\n",
"# Split into features and the target\n",
"X, y = df.iloc[:, :-1], df.iloc[:, -1]\n",
"\n",
"# Standardize the features (zero mean, unit variance)\n",
"from sklearn.preprocessing import StandardScaler\n",
"sc = StandardScaler()\n",
"X = sc.fit_transform(X.to_numpy())\n",
"X = pd.DataFrame(X, columns=df.columns.values.tolist()[:-1])\n",
"\n",
"# Split into training and test sets, stratified so both keep the class proportions\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, stratify=y, random_state=1)\n",
"\n",
"print(X_train.shape, y_train.shape)\n",
"print(X_test.shape, y_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "O82SGzK6tjN5"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>ph</th>\n",
" <th>Hardness</th>\n",
" <th>Solids</th>\n",
" <th>Chloramines</th>\n",
" <th>Sulfate</th>\n",
" <th>Conductivity</th>\n",
" <th>Organic_carbon</th>\n",
" <th>Trihalomethanes</th>\n",
" <th>Turbidity</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1022</th>\n",
" <td>0.003078</td>\n",
" <td>0.688791</td>\n",
" <td>0.846257</td>\n",
" <td>1.428934</td>\n",
" <td>-0.858263</td>\n",
" <td>0.002792</td>\n",
" <td>0.913790</td>\n",
" <td>0.232417</td>\n",
" <td>2.319505</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3191</th>\n",
" <td>-0.587365</td>\n",
" <td>0.223203</td>\n",
" <td>-0.731867</td>\n",
" <td>0.397503</td>\n",
" <td>0.759893</td>\n",
" <td>0.330607</td>\n",
" <td>0.094379</td>\n",
" <td>0.282563</td>\n",
" <td>0.235024</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>0.003078</td>\n",
" <td>-0.241037</td>\n",
" <td>0.773051</td>\n",
" <td>0.580019</td>\n",
" <td>1.334369</td>\n",
" <td>-0.049130</td>\n",
" <td>-1.121422</td>\n",
" <td>-0.200432</td>\n",
" <td>-0.946356</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2068</th>\n",
" <td>-2.176058</td>\n",
" <td>1.443006</td>\n",
" <td>-1.626771</td>\n",
" <td>-4.164610</td>\n",
" <td>-0.033706</td>\n",
" <td>-1.050763</td>\n",
" <td>-0.391328</td>\n",
" <td>-0.398649</td>\n",
" <td>-0.298341</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1484</th>\n",
" <td>0.213047</td>\n",
" <td>0.403036</td>\n",
" <td>-0.464729</td>\n",
" <td>0.070417</td>\n",
" <td>0.021560</td>\n",
" <td>-0.952776</td>\n",
" <td>-0.213330</td>\n",
" <td>0.111419</td>\n",
" <td>-0.235893</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>691</th>\n",
" <td>0.003078</td>\n",
" <td>1.199106</td>\n",
" <td>-0.003483</td>\n",
" <td>-0.670308</td>\n",
" <td>-0.069513</td>\n",
" <td>0.185754</td>\n",
" <td>-0.466010</td>\n",
" <td>0.031975</td>\n",
" <td>0.676276</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1283</th>\n",
" <td>-2.034004</td>\n",
" <td>-1.508135</td>\n",
" <td>0.255310</td>\n",
" <td>0.083839</td>\n",
" <td>-1.413707</td>\n",
" <td>0.694074</td>\n",
" <td>-1.110579</td>\n",
" <td>0.232996</td>\n",
" <td>2.544703</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2818</th>\n",
" <td>-0.702987</td>\n",
" <td>-0.575677</td>\n",
" <td>0.755056</td>\n",
" <td>0.664695</td>\n",
" <td>0.021560</td>\n",
" <td>-0.489334</td>\n",
" <td>0.371852</td>\n",
" <td>-2.272990</td>\n",
" <td>-1.764684</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1330</th>\n",
" <td>1.525943</td>\n",
" <td>0.497074</td>\n",
" <td>-0.714355</td>\n",
" <td>-1.024237</td>\n",
" <td>-1.022037</td>\n",
" <td>-0.327074</td>\n",
" <td>-1.107341</td>\n",
" <td>0.517432</td>\n",
" <td>-1.230528</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1926</th>\n",
" <td>-0.043558</td>\n",
" <td>-0.882359</td>\n",
" <td>-0.456141</td>\n",
" <td>-0.770271</td>\n",
" <td>0.795189</td>\n",
" <td>0.560306</td>\n",
" <td>-1.086081</td>\n",
" <td>-1.356820</td>\n",
" <td>0.172521</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>2948 rows × 9 columns</p>\n",
"</div>"
],
"text/plain": [
" ph Hardness Solids Chloramines Sulfate Conductivity \\\n",
"1022 0.003078 0.688791 0.846257 1.428934 -0.858263 0.002792 \n",
"3191 -0.587365 0.223203 -0.731867 0.397503 0.759893 0.330607 \n",
"13 0.003078 -0.241037 0.773051 0.580019 1.334369 -0.049130 \n",
"2068 -2.176058 1.443006 -1.626771 -4.164610 -0.033706 -1.050763 \n",
"1484 0.213047 0.403036 -0.464729 0.070417 0.021560 -0.952776 \n",
"... ... ... ... ... ... ... \n",
"691 0.003078 1.199106 -0.003483 -0.670308 -0.069513 0.185754 \n",
"1283 -2.034004 -1.508135 0.255310 0.083839 -1.413707 0.694074 \n",
"2818 -0.702987 -0.575677 0.755056 0.664695 0.021560 -0.489334 \n",
"1330 1.525943 0.497074 -0.714355 -1.024237 -1.022037 -0.327074 \n",
"1926 -0.043558 -0.882359 -0.456141 -0.770271 0.795189 0.560306 \n",
"\n",
" Organic_carbon Trihalomethanes Turbidity \n",
"1022 0.913790 0.232417 2.319505 \n",
"3191 0.094379 0.282563 0.235024 \n",
"13 -1.121422 -0.200432 -0.946356 \n",
"2068 -0.391328 -0.398649 -0.298341 \n",
"1484 -0.213330 0.111419 -0.235893 \n",
"... ... ... ... \n",
"691 -0.466010 0.031975 0.676276 \n",
"1283 -1.110579 0.232996 2.544703 \n",
"2818 0.371852 -2.272990 -1.764684 \n",
"1330 -1.107341 0.517432 -1.230528 \n",
"1926 -1.086081 -1.356820 0.172521 \n",
"\n",
"[2948 rows x 9 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_train"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "a3jkTMFLtjN6"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/andrzej/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3438: FutureWarning: In a future version, DataFrame.mean(axis=None) will return a scalar mean over the entire DataFrame. To retain the old behavior, use 'frame.mean(axis=0)' or just 'frame.mean()'\n",
" return mean(axis=axis, dtype=dtype, out=out, **kwargs)\n"
]
}
],
"source": [
"# Train the classifier\n",
"x = NaiveBayesClassifier()\n",
"x.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "CoC22aNgtjN9"
},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Predict labels for the test data\n",
"predictions = x.predict(X_test)\n",
"\n",
"# Probability assigned to each prediction\n",
"probabilities = [p[1] for p in predictions]\n",
"\n",
"# Predicted label\n",
"predictions = [p[0] for p in predictions]\n",
"predictions[0]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "JR06zodmtjN9"
},
"outputs": [
{
"data": {
"text/plain": [
"6.280487804878049e-01"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Compute the model's accuracy\n",
"x.accuracy(y_test, predictions)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"id": "1jW0QPootjN_"
},
"outputs": [
{
"data": {
"text/plain": [
"0.14084507042253522"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.metrics import f1_score\n",
"\n",
"f1_score(y_test, predictions)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"id": "vEVogTmAtjOA"
},
"outputs": [
{
"data": {
"text/plain": [
"0 0.609756\n",
"1 0.390244\n",
"Name: Potability, dtype: float64"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_test.value_counts(normalize=True)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"id": "jCVOdBZytjOB"
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1080x432 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x.visualize(y_test, predictions, 'Potability')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"id": "aw8Tefprhjnn"
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 432x288 with 6 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ph_val = X_test[\"ph\"]\n",
"sulfate_val = X_test[\"Sulfate\"]\n",
"hard_val = X_test[\"Hardness\"]\n",
"carb_val = X_test[\"Organic_carbon\"]\n",
"turb_val = X_test[\"Turbidity\"]\n",
"ch_val = X_test[\"Chloramines\"]\n",
"\n",
"\n",
"figure, axes = plt.subplots(nrows=3, ncols=2)\n",
"\n",
"axes[0, 0].plot(ph_val, predictions, 'bo')\n",
"axes[0, 0].set_title(\"pH\")\n",
"\n",
"axes[0, 1].plot(sulfate_val, predictions, 'bo')\n",
"axes[0, 1].set_title(\"Sulfate\")\n",
"\n",
"axes[1, 0].plot(hard_val, predictions, 'bo')\n",
"axes[1, 0].set_title(\"Hardness\")\n",
"\n",
"axes[1, 1].plot(carb_val, predictions, 'bo')\n",
"axes[1, 1].set_title(\"Organic carbon\")\n",
"\n",
"axes[2, 0].plot(turb_val, predictions, 'bo')\n",
"axes[2, 0].set_title(\"Turbidity\")\n",
"\n",
"axes[2, 1].plot(ch_val, predictions, 'bo')\n",
"axes[2, 1].set_title(\"Chloramines\")\n",
"\n",
"figure.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x7f716bdc1310>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.decomposition import PCA\n",
"\n",
"pca = PCA(n_components=2)\n",
"pca.fit(X_test)\n",
"X_pca = pca.transform(X_test)\n",
"\n",
"plt.scatter(X_pca[:, 0], X_pca[:, 1], c=predictions, s=50, cmap='viridis')"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x7f716bda93d0>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y_test, s=50, cmap='viridis')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Irises"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(105, 4) (105,)\n",
"(45, 4) (45,)\n",
"9.333333333333333e-01\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/andrzej/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3438: FutureWarning: In a future version, DataFrame.mean(axis=None) will return a scalar mean over the entire DataFrame. To retain the old behavior, use 'frame.mean(axis=0)' or just 'frame.mean()'\n",
" return mean(axis=axis, dtype=dtype, out=out, **kwargs)\n"
]
}
],
"source": [
"# Data preprocessing\n",
"\n",
"\n",
"# Load the data\n",
"df = pd.read_csv(\"iris.csv\")\n",
"\n",
"# Shuffle the rows of the dataset\n",
"df = df.sample(frac=1, random_state=10).reset_index(drop=True)\n",
"\n",
"# Split into features and target\n",
"X, y = df.iloc[:, :-1], df.iloc[:, -1]\n",
"\n",
"# Standardize the features (zero mean, unit variance)\n",
"from sklearn.preprocessing import StandardScaler\n",
"sc = StandardScaler()\n",
"X = sc.fit_transform(X.to_numpy())\n",
"X = pd.DataFrame(X, columns=df.columns.values.tolist()[:-1])\n",
"\n",
"# Split into training and test sets, stratified so both keep the same class proportions\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1)\n",
"\n",
"print(X_train.shape, y_train.shape)\n",
"print(X_test.shape, y_test.shape)\n",
"\n",
"\n",
"# Train the classifier\n",
"x = NaiveBayesClassifier()\n",
"x.fit(X_train, y_train)\n",
"\n",
"\n",
"# Predict labels for the test data\n",
"predictions = x.predict(X_test)\n",
"\n",
"# Probability assigned to each prediction\n",
"probabilities = [p[1] for p in predictions]\n",
"\n",
"# Predicted label\n",
"predictions = [p[0] for p in predictions]\n",
"\n",
"\n",
"# Compute the model's accuracy\n",
"print(x.accuracy(y_test, predictions))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9326599326599326"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f1_score(y_test, predictions, average=\"macro\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" sepal.length sepal.width petal.length petal.width\n",
"106 -0.900681 0.558611 -1.169714 -0.920548\n",
"65 -1.143017 1.249201 -1.340227 -1.447076\n",
"49 1.038005 0.558611 1.103783 1.712096\n",
"74 2.128516 -0.131979 1.615320 1.185567\n",
"110 -0.173674 1.709595 -1.169714 -1.183812\n",
".. ... ... ... ...\n",
"123 -1.021849 -0.131979 -1.226552 -1.315444\n",
"121 -0.052506 -1.052767 0.137547 0.000878\n",
"85 -0.537178 1.939791 -1.169714 -1.052180\n",
"13 -1.143017 -0.131979 -1.340227 -1.315444\n",
"125 -1.870024 -0.131979 -1.510739 -1.447076\n",
"\n",
"[105 rows x 4 columns]\n"
]
},
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x7f716bd1e820>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
2022-05-17 20:01:51 +02:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAD7CAYAAAB+B7/XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAs20lEQVR4nO3dd3hUVf7H8feZOzUF0ghE6T0YVgQVFRcFVFwBBV1F+Yltde11V9eKi2LBdXXdFcWKu4KuawMBBVFsFBWjohGkKEUl9EDKTKbd8/sjgJRJCGRmbubm+3oeH2Hu5NzPDck3J+eee47SWmuEEELYhsPqAEIIIeJLCrsQQtiMFHYhhLAZKexCCGEzUtiFEMJmnFYHME2TqqoqXC4XSimr4wghRErQWhMOh0lPT8fh2LOPbnlhr6qqYvny5VbHEEKIlNS1a1cyMzP3eM3ywu5yuYCacG632+I0B6akpISioiKrY8SN3a4H5JpSgd2uB5JzTaFQiOXLl++qobuzvLDvHH5xu914PB6L0xy4VMxcF7tdD8g1pQK7XQ8k75piDWHLzVMhhLAZKexCCGEzlg/FCCHEwShdtYGfl5fSonUu7Q9rY3WcRkUKuxAipZRvqeDecx5hycJluDwuIuEIh3Yu4O7X/8whnVpZHa9RkKEYIUTK0Frzl1PupWTeUkLVYaq2+wn6Q6wqWcsNx99JMBC0OmKjIIVdCJEyvpv/PT8vX0ckHN3jdW1qAlVBPnxlgUXJGhcp7EKIlPH95yuJhCMxj1VXVrP4w++SnKhxksIuhEgZGVnpOF2xbw0aTgdZ+c2SnKhxksIuhEgZ/UYcjWnG3vTNcDk55cIBSU7UOElhF0KkjMzsDK5/8jI8PjcOx69PXHrTPJx14xCZ9riDTHcUQqSUUy44kQ5FbXn179NZ9e0aWrXP58wbhnDEwJ5WR2s0pLALIVJOl94duX3K9VbHaLRkKEYIIWxGCrsQQtiMFHYhhLAZKexCCGEzUtiFEMJmpLALIYTNSGEXQgibkcIuhBA2I4VdCCFsRgq7EELYjBR2IYSwGSnsQghhM3FZBKysrIxbbrmFtWvX4na7adeuHffccw85OTnxaD4mbVZB6BPQfnAdiXK2Tdi5hBAilcSlx66U4tJLL2X27NlMnz6dNm3a8PDDD8ej6ZhM/+vojceit9+GLh+L3jwEs+xatA4l7JxCCJEq4lLYs7Ky6Nu3766/9+rVi3Xr1sWj6X3o0CIoHwtUg64CHQCCEPwIXX5vQs4phBCpJO5j7KZp8vLLLzNw4MB4Nw2ArpwIVMc4Ug2BqWizMiHnFUKIVKG01rE3EDxIY8eOZcOGDTz++OM4HPv/uREMBikpKal3+0WtrsBlbI95LGr6WLF5DIFw+3q3J4QQqayoqAiPx7PHa3HdQWn8+PGsWbOGiRMn1quo7y9cLObmVhCJXdgNR5TCw45HGS0P6NwHq7i4mD59+iTlXMlgt+sBuaZUYLfrgeRcU12d4rgNxTz66KOUlJQwYcIE3G53vJrdh0q7CPDFOOIAV8+kFXUhhGis4tJjX7FiBRMnTqR9+/ace+65ALRu3ZoJEybEo/k9+YZDcC6E5tdMdax5ERzpqKzEzcQRQohUEZfC3qVLF5YtWxaPpvZLKQOyHofQfHTgdTArwXMCyjcc5chISgYhhGjM4jrGnixKKfAcj/Icb3UUIYRodGRJASGEsBkp7EIIYTNS2IUQwmZScoz9YOnwN+jqOaCjKO+AmsXDlLI6lhBCxFWTKOxaR9DbrofgPCAImOjAS+DsCTnPotT+H4wSQohU0SSGYnTVJAh+AgQAc8eLfgh/ja54xMpoQggRd02isON/gdgLhwUh8ApaR5McSAghEqdpFHZzS+3HdLhm+V8hhLCJplHYHS1qP6bcoNKTl0UIIRKsaRT29EuJvXCYF3z/V7NMgRBC2ESTKOwqbTR4TwW8gEHNZfvAcywq8zprwwkhRJw1ie
mOSjlQWePRkcug+n3ABE9/lOswq6MJIUTcNYnCvpNydoaMzlbHEEKIhGoSQzFCCNGUSGEXQgibkcIuhBA2I4VdCCFsRgq7EELYjBR2IYSwGSnsQghhM1LYhRDCZqSwCyGEzUhhF0IIm5HCLoQQNiOFXQghbEYKuxBC2IwUdiGEsBkp7EIIYTNS2IUQwmaksAshhM1IYRdCCJuRwi6EEDYjhV0IIWxGCrsQQtiMFHYhhLAZKexCCGEzcSvs48ePZ+DAgXTr1o3ly5fHq1khhBAHKG6FfdCgQUyZMoVDDz00Xk0KIYQ4CM54NXTkkUfGqykhhAV+XlHKS/e9zqJZX+NyOzlpdH9+/6dhNMvJtDqaOEBxK+xCiNS18qtV3HTCGIKBEGbUBOC1R6bz/pRPeLL4IZrlHlhxX/n1Kl59eDorv15Ffts8zrx+CEcN7pWA5CIWpbXW8Wxw4MCBTJw4ka5du9br/cFgkJKSknhGqAdNmutHPM51hKM5VIYKkfvIoin71wUv8MvS9fu8brgMjjunN0NuGFTvtr6a9R1v3PcOkVAUbdaUF7fPxdEjejH0xvq3I+qnqKgIj8ezx2uNpsceK1wi6Oh6dNmlEPkJlAIUqAxU9tMoV+EBtVVcXEyfPn0SE9QCdrsekGuqj7KN29n44+aYx6LhKN/OWc6YF2+pV1v+igB393+EcHVkj9dDgTCLpi5m1E2/p0vvjnsck3+jg1NXp7hJdVPNyAb0lnMh8gMQAO0HXQXmBvTW0Wiz0uqIQiRdKBDCYdReCkLV4Xq3tfCtL2ptKxyMMPuFDw44nzhwcSvs48aNo3///qxfv56LL76YIUOGxKvpBtPRUswt58PmAWCuA6Ix3hRGB6YlPZsQVmvRJpe0TF/MY0rBb/rX/zfZym1VRCMxvr8AM2qybWP5QWUUByZuhf3OO+/k448/ZsmSJcyfP5+ZM2fGq+kG0aYfveVsCBcDkTreGYDw4mTFEqLRcDgcXHL/KDxp+w6Fun1uLhw7st5tde/bBeWIXVa86R6cboN/XPEUU8a9xsa1mw46s6hboxljTxQdeAvMCmL20vfgBKNVMiIJ0eicevFAzKjJc7dNIVQdQZsmeYfmcOMzV9D5iA71bqfbkZ3o+Jt2rPjyRyKhXztSyqGo9geZ9+bnBKuCuDxOXrr/Da567GJa9c5JxCU1abYv7IQ+BAL1eKOB8v0+wWGEaLxOu/QkBl80gJ9XlOL2umjVPh+lVL0/PhqNEglFuP/t27n/vH+w+KPvcHlcRMI1r0cjUYJVQaBmvB1gwvWTuObfF4K97p1azv6FXaXX401eyPwTytk24XGEaMwMp0G7wtYH9DFb15cx8aZ/88kbn2FGTfLb5nHJfaO44ak/8tOydWz6eQsTrn+eaOW+vzWHq8Ms+F8xp509OF6XIGgCs2KU70wg9o2hGgZkXI4j/cJkRRLCNirKKrnqqFv5+LWFREIRzKjJ+lUb+fulT/LpjGL6nHw4Doejzpk1Kz9flcTETYPtCzvu48DTr443RCHwVtLiCGEn0ye+S8XWSqIRc4/Xg/4gz946hVB1iJbtWmDudXx3FZurEh2zybF9YVdKQead1DnqFF2XtDxC2MmHr8wnFAjFPKYciu8/X0mbwkOhjqH6SHh/ExvEgbL/GDugjFw0BrVOdzTykppHNA0bKiuZ8u3XfLm+lFbpGZz/m170alVgday42t/NVaUUzXMz8aZ7qa6sjvme7FbNExGtSbN9jx1AKQ/4hgHuGEd9kHZJsiMJm1u07mcGvfg8z3z5BQt+WsvUZUsZ9cb/eHjBPKujxdWgUb/F7Yv1fVWj29GdMZwGw6/9HZ60fd/nTfcw4JJjExmxSWoShR1AZd4Bzm6g0na84gDlA8+JqLRRlmYT9hIxTa6Y8Rb+cJhgtGaYwdSa6kiESV8Xs3h9qcUJ42fIH08iq0UzDJexx+ueNDdX/P1C3B4XABeNHckxQ4/E7XPj9rp2/X/o5afQZ2
hPK6LbWpMYigFQjnTIfRVCC9DVH4Byo3y/Q7l+Y3U0YTMLf1pLOBp73DgYjTL5m6853CZDMunN05mw6EGeu20K778
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pca = PCA(n_components=2)\n",
"pca.fit(X_test)\n",
"X_pca = pca.transform(X_test)\n",
"\n",
"df_pred = pd.DataFrame(predictions).replace({'Virginica': 0, 'Versicolor': 1, \"Setosa\": 2}, regex=True)\n",
"df_pred = np.array(df_pred).reshape(1, -1)\n",
"plt.scatter(X_pca[:, 0], X_pca[:, 1], c=df_pred[0], s=50, cmap='viridis')"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAaQAAAEUCAYAAABkhkJAAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAA8i0lEQVR4nO3dfVxUdfr4/9fMcCM3KqBAVmglKZQ3ICBphopaUAIq3nUj7Sezcpe1LFbdT2l5s26Yrr+isvtcMj+WCmkm1lreVYrCgsqKmayhIoKKoIByMzO/P/gycTMDMwPCAa/n4+Hj4Zzzfr/PdQ7va645Zw4HlV6v1yOEEEK0M3V7ByCEEEKAFCQhhBAKIQVJCCGEIkhBEkIIoQhSkIQQQiiCFCQhhBCKIAVJGISGhvLzzz8bXbdgwQJWr159w2Noq+0IcaM1lU/mevrpp0lOTja67uzZs/Tv35/q6mqT/fv3709ubm6LYmhLUpAUKC0tjenTpxMQEMDQoUOZPn06R44cae+wFKGjJZhofx05nz766CMmTpxoVtsZM2awcePGGxzRjWXT3gGI+kpLS3nuued47bXXCA8Pp6qqirS0NOzs7No7NCE6HMmnjkXOkBTm1KlTAIwfPx6NRkOXLl0YMWIEPj4+hjabNm0iPDycoKAgZs6cSV5enmFd//79SUxMZMyYMQQHBxMfH49OpwPg9OnTxMTEEBwcTHBwMC+99BJXrlyxKs5du3YRFRVFYGAg06dP5/jx44Z1oaGhfPzxx0RERBAQEMALL7xARUWFYf2HH37IiBEjGDFiBBs3bmx01nPlyhWeeeYZ/P39mTJlCqdPnwbg8ccfByAqKgp/f3+2b99uVezi5qHEfDpz5gyBgYGGcV5++WWGDRtmWB8XF8fatWuB+mc9Wq2W+Ph4goODGTNmDHv27DH0Wb16NWlpaSxZsgR/f3+WLFliWPfzzz/z4IMPEhQUxOLFi1Hyw3mkICnMnXfeiUajYf78+ezZs4eSkpJ663fu3Mn777/P22+/zf79+wkICOCll16q1+Zf//oXmzdvJjk5mR9++IHNmzcDoNfrefbZZ9m3bx8pKSmcP3+ehIQEi2P8z3/+w//+7/+yZMkSUlNTmTZtGn/84x+prKw0tElJSeGjjz7i+++/55dffiEpKQmAvXv3snbtWj799FP+9a9/cfDgwUbjf/PNN8TGxnLo0CF69+5t+E7p888/B2DLli1kZGTw8MMPWxy7uLkoMZ+8vLxwdnbm2LFjQM0lRUdHR3Jycgyvhw4d2qjfl19+ya5du/jqq6/YvHkzO3bsMKybO3cugYGBLFq0iIyMDBYtWmRYt3v3bjZt2sSWLVtISUlh3759Zh69ticFSWGcnZ1Zv349KpWKhQsXMmzYMJ577jkuXrwIwIYNG3jmmWfo27cvNjY2PPfcc2RnZ9f7VDdr1ixcXFy49dZbiYmJYdu2bQD06dOH+++/Hzs7O9zc3Pif//kfDh06ZHGMX375JdOmTWPw4MFoNBomTpyIra0tmZmZhjYzZszA09MTFxcXRo8eTXZ2NlBTqCZNmsTdd9+Ng4MDsbGxjcYfN24cgwYNwsbGhsjISENfISyl1HwKCgri0KFDXLhwAYCHHnqIgwcPcubMGUpLS+udwdVKSUnhySefpFevXri4uPDss8+ata1Zs2bRrVs3br31VoKDg+tdzVAa+Q5Jgfr27cvrr78OQE5ODn/5y19Yvnw5//jHPzh37hzLly8nPj7e0F6v11NQUMBtt90GQK9evQzrbrvtNgoLCwG4dOkSy5YtIy0tjbKyMvR6Pd26dbM4vnPnzvHVV1+xbt06w7KqqirDdgDc3d0N/3dwcDCsKywsZMCAAYZ1dWOt1bNnT8P/u3TpQnl5ucUxClFLifk0dOhQvv/+ezw9PQkKCiI4OJgtW7Zgb29PYGAganXjc4XCwsJ6sdx6661mbathLpaVlZnVrz1IQVK4vn37Mm
nSJL744gugJjmee+45IiMjTfbJz8/n7rvvBmqKh4eHBwCrVq1CpVKxdetWXF1d2blzZ71rzeaqjWH27NkW9/Xw8KCgoKBerEK0FaXkU1BQECtWrOCWW24hKCiIgIAAXn31Vezt7QkKCjLax93dvV6+dMbckUt2CpOTk8Mnn3zC+fPngZpJt23bNgYPHgzA9OnT+eCDD/j1118BuHr1KikpKfXG+PjjjykpKSE/P5/ExETDdy1lZWU4OjrSrVs3CgoK+Oijj6yKccqUKWzYsIHDhw+j1+spLy9n9+7dlJaWNts3LCyMpKQkcnJyuHbtGu+8845F2+7ZsydnzpyxKm5x81FqPt1xxx3Y29uzdetWgoKCcHZ2pkePHnz77bcmC1J4eDifffYZ58+fp6SkhA8++KDe+s6QG1KQFMbZ2ZnDhw8zZcoU/Pz8mDp1Kv369WPBggVAzfcrTz/9NC+++CJDhgxh/Pjx7N27t94YY8aMYdKkSUyYMIFRo0YxefJkAGJjYzl27BiBgYE888wzPPjgg1bFOHDgQJYuXcqSJUsICgriwQcfNNy00JyRI0cyY8YMYmJiGDduHH5+fgBm34YbGxvLggULCAwMlLvsRLOUnE9Dhw41fDdV+1qv13PPPfcYbT916lRGjBhBVFQUEydObLS9mJgYQ0FbtmyZRbEohUr+QF/n0r9/f7777jv69OnT3qGYJScnh/Hjx3P06FFsbOQKslCWjpZPHZ2cIYk2969//YvKykpKSkp44403GD16tBQjIYQUJNH2NmzYwLBhwxg3bhwajYbXXnutvUMSQiiAXLITQgihCHKGJIQQQhHkwr0JOp2OsrIybG1tUalU7R2OUAC9Xk9VVRVOTk5Gf3FRNCZ5JOpqLoeaLEgzZ85k3LhxTJ8+vd6AY8aMIT4+3uT98uYoKCggLi6Ozz77zKr+Z8+eJTo6mtTUVKtjaEpZWRknTpy4IWOLjq1fv3507drV7PaSR5JHoj5TOdRkQYqOjmbt2rX1Eik1NRUbGxuzkkin06FSqYx+MvL09LQ6iVpCq9Wi0WiabWdrawvUHLi2flT9gax8knafpKjkOk4ONT+ismvVuHXvwqRR3tw3oPHjdhr2c+vehUF9e3Ak55Lhdd2+61KOsSfzHHqdHpVaxQCvLjz/xAirYli1Lo3s3MuG1759XHnpicAW7Xdz+9pSWVlZ9R5hZI7KykpOnDhhmBvmkjyqn0cN595Iv1t5Itz4795Y0x5Mz8mmxjI1/0wtt2YOteUcbwlr9s0czeVQkwVp7NixLF68mJMnT+Lt7Q1AUlISkyZN4sMPP+Tbb79Fq9Xi6enJ0qVLcXd3JyEhgdzcXMrLyzlz5gyJiYm8+eabHDhwADs7OxwdHdmwYUOjT2YZGRmsWLHC8JylefPmMWLECI4cOcLf/vY3ysvLcXR05OWXX2bQoEGNYt27dy//+Mc/0Gq1uLm5sWTJEvr06UNqairLly8nMDCQo0ePMnv2bEaPHt3sgatNfjs7O+zt7Ztt31p2p5/h7Y1ZVFRpASgu0xrWFZeVkbAxCz0aRgV4NdOvjP+eKzPaN/vUJbbvP12v/75jpXT9OpvZk/0siuGVNT9y+OSlemPtP3aRpZ8cYtnsEZjLWPym9rW1WPtztfTSk+TR73m0ZlNmo7n31b7TVFarmD3Zr1F/S9sDJufkM3//gaKrlfWW147le2cPo/Pv2Klivk87a3RedsWyOdQec7wlbuT7nqkcavJCuJ2dHREREYbfwi8tLWXnzp14enpy+vRpvvzyS5KTkwkJCTE8vBBqHp++bNkyvv76a/Lz89m/fz/bt29n69atvP/++422U1xcTGxsLH/5y1/YunUrycnJDBw4kMrKSubMmcPzzz/P119/zQsvvMCcOXPq/ZkDqHnI4bx581i5ciVff/0148ePJy4uzrD+xIkTjB8/ni+//NKsJGpPiSnZhglrTEWVlsSUxk+/bq5f3b47Uk8bXV+73JIYGi
Z+LVPLTTG2TVP72tFIHv2uubnX0uVgeu41LEZ1xzI1/3aknm61edmZ53hrafamhsmTJxserZGSkkJAQAC7d+8mKyv
"text/plain": [
"<Figure size 432x288 with 4 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Pull each feature column from the test set\n",
"sep_len = X_test[\"sepal.length\"]\n",
"sep_width = X_test[\"sepal.width\"]\n",
"pet_len = X_test[\"petal.length\"]\n",
"pet_width = X_test[\"petal.width\"]\n",
"\n",
"# One panel per feature: feature value (x) against predicted class (y)\n",
"figure, axes = plt.subplots(nrows=2, ncols=2)\n",
"\n",
"axes[0, 0].plot(sep_len, predictions, 'bo')\n",
"axes[0, 0].set_title(\"Sepal length\")\n",
"\n",
"axes[0, 1].plot(sep_width, predictions, 'bo')\n",
"axes[0, 1].set_title(\"Sepal width\")\n",
"\n",
"axes[1, 0].plot(pet_len, predictions, 'bo')\n",
"axes[1, 0].set_title(\"Petal length\")\n",
"\n",
"axes[1, 1].plot(pet_width, predictions, 'bo')\n",
"axes[1, 1].set_title(\"Petal width\")\n",
"\n",
"figure.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "naive_bayes.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}