forked from pms/uczenie-maszynowe
Lecture 6. The overfitting problem
parent ff0bda56cf
commit eacef8109a
@@ -199,18 +199,6 @@
 "As seen above, besides numbers there are some special text values here, such as `parter` (ground floor), `poddasze` (attic), or `niski parter` (lower ground floor)."
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Such values need to be converted to numbers. How?\n",
-"* It seems that `parter` and `niski parter` can safely be treated as floor “zero” and replaced with `0`.\n",
-"* With `poddasze` the situation is less obvious. Do you have any suggestions?\n",
-" * Maybe replace `poddasze` with NaN (see below)?\n",
-" * Maybe use the value from the neighboring column *Liczba pięter w budynku* (number of floors in the building) for this purpose?\n",
-" * Maybe discard the examples containing this value altogether? (if there are very few such examples)"
-]
-},
 {
 "cell_type": "code",
 "execution_count": 8,
@@ -251,6 +239,18 @@
 "alldata[\"Piętro\"].value_counts()\n"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Such values need to be converted to numbers. How?\n",
+"* It seems that `parter` and `niski parter` can safely be treated as floor “zero” and replaced with `0`.\n",
+"* With `poddasze` the situation is less obvious. Do you have any suggestions?\n",
+" * Maybe replace `poddasze` with NaN (see below)?\n",
+" * Maybe use the value from the neighboring column *Liczba pięter w budynku* (number of floors in the building) for this purpose?\n",
+" * Since `poddasze` appears in only a few examples, maybe discard those examples altogether?"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
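The markdown cell moved by this hunk lists several ways to turn the text values in the `Piętro` column into numbers. The following is only a minimal sketch of two of them (mapping both ground-floor labels to `0`, marking `poddasze` as NaN, then optionally dropping those rows). The sample frame, the `special_values` table, and the `floor_to_number` helper are hypothetical names introduced for illustration; the notebook's actual preprocessing may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the "Piętro" column of the flats data
alldata = pd.DataFrame({
    "Piętro": ["1", "parter", "3", "niski parter", "poddasze", "2"],
})

# Assumption: both ground-floor labels become floor 0, the attic becomes missing
special_values = {"parter": 0, "niski parter": 0, "poddasze": np.nan}


def floor_to_number(value):
    """Map special labels via the table above; parse everything else as a number."""
    if value in special_values:
        return special_values[value]
    return float(value)


alldata["Piętro"] = alldata["Piętro"].apply(floor_to_number)

# Optional last step from the list: drop the few rows where the floor is unknown
cleaned = alldata.dropna(subset=["Piętro"])
```

Keeping NaN as an intermediate value leaves the choice open: the missing floors can later be dropped, as here, or filled from another column such as *Liczba pięter w budynku*.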
@@ -98,7 +98,6 @@
 "data = preprocess(data) # preprocess the data\n",
 "\n",
 "# Split the data into training and test sets\n",
-"split_point = int(0.8 * len(data))\n",
 "data_train, data_test = train_test_split(data, test_size=0.2)\n",
 "\n",
 "# Train the model\n",
@@ -252,7 +251,6 @@
 ")\n",
 "\n",
 "# Split the data into a training set and a test set\n",
-"split_point = int(0.8 * len(data_iris))\n",
 "data_train, data_test = train_test_split(data_iris, test_size=0.2)\n",
 "\n",
 "# Train the model\n",
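The two hunks above remove a `split_point` variable that nothing used, since `train_test_split` already partitions the data on its own. A small sketch of that behavior follows, assuming `train_test_split` is the scikit-learn function from `sklearn.model_selection` (the import is not visible in this diff). The toy frame and `random_state=0` are additions for illustration only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the notebook's `data` frame
data = pd.DataFrame({"x": range(10), "y": range(10, 20)})

# test_size=0.2 asks for 20% of the rows in the test set; no manual
# split index is needed. random_state is an added assumption that makes
# the shuffle reproducible.
data_train, data_test = train_test_split(data, test_size=0.2, random_state=0)

print(len(data_train), len(data_test))
```

Unlike slicing at a fixed index, this call shuffles the rows by default, so the two sets are random samples rather than the first 80% and last 20% of the file.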
@@ -283,7 +281,7 @@
 "metadata": {
 "celltoolbar": "Slideshow",
 "kernelspec": {
-"display_name": "Python 3.10.6 64-bit",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
File diff suppressed because one or more lines are too long
1832
wyk/06_Problem_nadmiernego_dopasowania.ipynb
Normal file
File diff suppressed because one or more lines are too long
BIN
wyk/bias2.png
Normal file
Binary file not shown.
Size: 122 KiB
BIN
wyk/curves.jpg
Normal file
Binary file not shown.
Size: 94 KiB
50
wyk/data-metrics.tsv
Normal file
@@ -0,0 +1,50 @@
+0 -0.9410633308036449 0.46518252113944425
+1 0.4700636553691919 -0.3970321538875541
+1 -0.01609299859794966 0.23161453968628254
+0 -0.9966154155058933 0.06419313152355421
+0 0.8000009607150127 0.44133107977776875
+0 0.389227379480078 -0.8415416694237676
+0 -0.7786281038890375 0.2833716839963434
+1 -0.10150562150521569 -0.02968754639839366
+1 -0.14995353486391494 0.30921523116923866
+0 0.3150219624148183 0.4186143523577863
+0 -0.5542734031872467 0.9291684810885719
+0 -0.44750469543445215 -0.8240387195698262
+0 -0.7875312310670415 0.27475695030524894
+0 0.20470154428730747 -0.8122722630746713
+0 0.07472783793361693 0.8936381678688297
+0 -0.6016285994197443 -0.9783927694535444
+0 0.4235345463350013 -0.23977977886239832
+0 0.256790496684171 -0.5587059709121811
+0 -0.2172656054288027 0.8015306542483966
+0 0.2009238354275602 0.9376873763906164
+0 -0.8760038215191506 0.015194717659306356
+0 -0.1512141038160364 -0.9575528046526418
+0 -0.6378974241766098 0.35900665963616696
+0 -0.6219617077011876 0.04019896541474166
+0 -0.2533778634666939 -0.8576798720089458
+0 -0.9398823073223508 0.806594859009744
+0 -0.24161324930138606 -0.6982896600554984
+0 -0.967724402993285 0.15651783268628372
+0 0.9587968810951801 -0.3382309645563397
+1 0.18040441263417084 -0.026706542719935777
+0 -0.2403226372749332 -0.2694487472698215
+0 -0.49494412803453747 -0.6833825934742561
+0 -0.32266963833818574 0.6299706350061482
+0 -0.716450532167108 0.7792499086149187
+1 -0.5661825812948427 -0.3045016769669948
+0 -0.9014952263862088 0.19697267011506714
+1 0.3192734822128551 -0.3145295901019187
+1 -0.4386590899062277 0.6119229005694005
+0 -0.6306933372350818 0.4721301354446683
+0 0.3302936606411402 -0.3047093070118343
+1 -0.38049655790356285 -0.609474130471132
+1 0.32069301644263426 0.17266197471996692
+1 0.8349752241994568 0.4408717276862013
+0 -0.26741723386938343 -0.4919294757003996
+0 -0.7786699335922747 -0.47305795528791905
+0 0.723410510517891 -0.010095862311693793
+0 0.0902826080483603 -0.6805262097228113
+0 -0.9286972617786873 0.7200430642275493
+0 -0.0623197964184079 0.8187639325432745
+0 -0.20572090815735944 -0.6655000969777327
4186
wyk/data_flats_with_outliers.tsv
Normal file
File diff suppressed because it is too large
118
wyk/ex2data2.txt
Normal file
@@ -0,0 +1,118 @@
+0.051267,0.69956,1
+-0.092742,0.68494,1
+-0.21371,0.69225,1
+-0.375,0.50219,1
+-0.51325,0.46564,1
+-0.52477,0.2098,1
+-0.39804,0.034357,1
+-0.30588,-0.19225,1
+0.016705,-0.40424,1
+0.13191,-0.51389,1
+0.38537,-0.56506,1
+0.52938,-0.5212,1
+0.63882,-0.24342,1
+0.73675,-0.18494,1
+0.54666,0.48757,1
+0.322,0.5826,1
+0.16647,0.53874,1
+-0.046659,0.81652,1
+-0.17339,0.69956,1
+-0.47869,0.63377,1
+-0.60541,0.59722,1
+-0.62846,0.33406,1
+-0.59389,0.005117,1
+-0.42108,-0.27266,1
+-0.11578,-0.39693,1
+0.20104,-0.60161,1
+0.46601,-0.53582,1
+0.67339,-0.53582,1
+-0.13882,0.54605,1
+-0.29435,0.77997,1
+-0.26555,0.96272,1
+-0.16187,0.8019,1
+-0.17339,0.64839,1
+-0.28283,0.47295,1
+-0.36348,0.31213,1
+-0.30012,0.027047,1
+-0.23675,-0.21418,1
+-0.06394,-0.18494,1
+0.062788,-0.16301,1
+0.22984,-0.41155,1
+0.2932,-0.2288,1
+0.48329,-0.18494,1
+0.64459,-0.14108,1
+0.46025,0.012427,1
+0.6273,0.15863,1
+0.57546,0.26827,1
+0.72523,0.44371,1
+0.22408,0.52412,1
+0.44297,0.67032,1
+0.322,0.69225,1
+0.13767,0.57529,1
+-0.0063364,0.39985,1
+-0.092742,0.55336,1
+-0.20795,0.35599,1
+-0.20795,0.17325,1
+-0.43836,0.21711,1
+-0.21947,-0.016813,1
+-0.13882,-0.27266,1
+0.18376,0.93348,0
+0.22408,0.77997,0
+0.29896,0.61915,0
+0.50634,0.75804,0
+0.61578,0.7288,0
+0.60426,0.59722,0
+0.76555,0.50219,0
+0.92684,0.3633,0
+0.82316,0.27558,0
+0.96141,0.085526,0
+0.93836,0.012427,0
+0.86348,-0.082602,0
+0.89804,-0.20687,0
+0.85196,-0.36769,0
+0.82892,-0.5212,0
+0.79435,-0.55775,0
+0.59274,-0.7405,0
+0.51786,-0.5943,0
+0.46601,-0.41886,0
+0.35081,-0.57968,0
+0.28744,-0.76974,0
+0.085829,-0.75512,0
+0.14919,-0.57968,0
+-0.13306,-0.4481,0
+-0.40956,-0.41155,0
+-0.39228,-0.25804,0
+-0.74366,-0.25804,0
+-0.69758,0.041667,0
+-0.75518,0.2902,0
+-0.69758,0.68494,0
+-0.4038,0.70687,0
+-0.38076,0.91886,0
+-0.50749,0.90424,0
+-0.54781,0.70687,0
+0.10311,0.77997,0
+0.057028,0.91886,0
+-0.10426,0.99196,0
+-0.081221,1.1089,0
+0.28744,1.087,0
+0.39689,0.82383,0
+0.63882,0.88962,0
+0.82316,0.66301,0
+0.67339,0.64108,0
+1.0709,0.10015,0
+-0.046659,-0.57968,0
+-0.23675,-0.63816,0
+-0.15035,-0.36769,0
+-0.49021,-0.3019,0
+-0.46717,-0.13377,0
+-0.28859,-0.060673,0
+-0.61118,-0.067982,0
+-0.66302,-0.21418,0
+-0.59965,-0.41886,0
+-0.72638,-0.082602,0
+-0.83007,0.31213,0
+-0.72062,0.53874,0
+-0.59389,0.49488,0
+-0.48445,0.99927,0
+-0.0063364,0.99927,0
+0.63265,-0.030612,0
BIN
wyk/fit.png
Normal file
Binary file not shown.
Size: 138 KiB
BIN
wyk/learning-curves.png
Normal file
Binary file not shown.
Size: 5.9 KiB
100
wyk/polynomial_logistic.tsv
Normal file
@@ -0,0 +1,100 @@
+1 0.25777005758108174 0.601012316037165
+1 0.3659669567447452 -0.11214686303429633
+0 0.49453050141627375 0.47110655546911206
+0 0.7029060372914113 -0.9225798301680093
+0 0.46658862037642423 -0.6226973935055724
+0 0.8793946243263941 -0.11408014657778076
+0 -0.3311850002119068 0.8444766749977881
+0 -0.5435170087333634 0.8851383010436487
+0 0.9197924083397226 0.41607011737177735
+0 0.28011742147804797 0.6143115673056148
+0 0.9475436344725683 -0.7830731144606005
+0 0.4904989452188586 0.649356142549592
+0 -0.865983500565505 0.9896361556274065
+0 -0.8579184997717257 0.3062253122060574
+0 0.08082005095746103 -0.7736760810964189
+0 -0.3363842450225085 -0.8802992880290186
+0 0.4748472924067402 0.9756949850919965
+0 -0.7956979203895616 0.8751067723304518
+1 0.06752895667287606 -0.7683056187589332
+0 -0.5825898275446799 0.8068359661366173
+1 0.1109238791315652 -0.2034825016864903
+0 0.5011542085506828 0.9366868642789181
+0 0.2011359606302785 0.4800561245801245
+1 -0.38620580274071115 0.4003933803256208
+1 -0.1722113915778094 0.3926707935387965
+0 0.6575404624823169 -0.7070032890943085
+1 -0.2832309098070882 0.034184675674787446
+1 -0.16828017341376333 -0.1628482245819587
+0 -0.6552618226108893 -0.3159705063754401
+0 -0.6466772083696701 -0.07116372625398881
+0 0.848711325640519 0.2132898335742659
+1 -0.35490315474701606 -0.0025105634256454845
+0 -0.36568446532837817 0.5637325774329354
+0 -0.5089179414092766 0.8086671779253405
+0 0.9609295951994559 -0.12114542304082354
+0 0.055563338045806265 0.8532855304613407
+0 -0.8937129542754998 -0.02555660184206876
+0 0.40678784672410284 -0.5480665560665205
+0 -0.7683896050204841 0.9475293644451854
+0 -0.515467982993429 -0.5389177617277066
+0 0.9693903475176826 -0.9765032993967369
+0 -0.5476549714934908 -0.018838768427513974
+0 0.5262277827151787 -0.9936327305281174
+1 0.9394838829593151 0.9962891110157359
+0 -0.935709119652979 -0.6940925482964921
+0 0.6161569745665239 -0.044448545050667976
+0 -0.08521587367561922 0.9636255303204684
+0 0.9073344675416231 -0.08813265618067079
+1 -0.1563237189794715 0.05022859605451302
+0 -0.9785642881644829 -0.5076719844587916
+0 -0.5494648865481802 -0.6044852696776528
+0 -0.7170122682018529 -0.6250685449461151
+1 0.5333872877810009 0.1395189003073396
+0 -0.49270328980187905 0.9081426529064955
+1 0.07777642690144848 -0.44188199856981347
+0 0.8328452661100116 0.5508441451500428
+1 -0.33275827507477573 -0.15434344174028314
+0 -0.9057550401714867 0.6324599729071743
+0 -0.8476574433184823 0.5739140088331203
+0 -0.37393930555231103 0.7361874446899226
+1 0.6610910543790163 0.0036185958785315275
+0 0.49147748571126004 -0.6155167984371757
+0 0.31992462553488177 -0.38253832622755657
+0 0.7398386519468336 -0.915886088774648
+0 0.5915392280694003 0.011422405850611383
+0 -0.5818860867200502 -0.44086037005029377
+0 -0.9066322824076023 0.21754010215910524
+1 0.12243932470792318 -0.3830697406526009
+0 0.40607941790742297 0.5626829623336307
+1 -0.1210920179663808 -0.20552144405177608
+0 0.48099006522554233 0.9583656149315158
+0 -0.059491720260914205 0.6161097510891897
+1 -0.053220979060695006 0.07562497263502688
+0 -0.8742304482942296 -0.13488952315510616
+0 0.7362712712103594 0.6087347685508093
+0 0.025549937023763736 -0.6202087182389777
+0 0.6755333538371804 0.7047713746899604
+0 -0.3954771867034055 0.3567082570178153
+1 0.24896928809009156 -0.17106278785061302
+0 0.6133735778535989 -0.6297261231852487
+1 -0.35955189872833593 -0.2086164112593747
+0 0.646544898896497 0.8858921579510579
+0 0.6459228334265068 -0.9141274779126995
+0 -0.5279127041052518 -0.11119649758918437
+0 -0.47141090620857784 -0.29849889702571786
+1 0.1901970467567704 -0.5049996808415897
+0 -0.5497623652380574 -0.49032403671408553
+0 -0.5759454285366339 0.445122514716527
+0 -0.7800687910859982 -0.4823078816937112
+0 0.39722150362989095 0.5827352140491311
+1 0.018540458464545218 -0.20805328372207677
+0 -0.14419638252986933 -0.8679481460173017
+1 -0.15012196110925857 0.5474017473230433
+1 -0.11028545705088533 0.5371497474265077
+0 -0.46577855502057375 -0.9226883886539352
+0 0.4843595022265692 0.47692504895620713
+0 0.4330264545403766 -0.40096944878062857
+0 -0.7401024435876022 0.758623363044544
+0 0.20470935356917574 -0.7551473328272353
+0 0.1877078820888327 -0.3377139504156679