IUM 2
Installation of packages
Requirement already satisfied: kaggle in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (1.6.6)
Requirement already satisfied: six>=1.10 in c:\users\skype\appdata\roaming\python\python312\site-packages (from kaggle) (1.16.0)
Requirement already satisfied: certifi in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from kaggle) (2024.2.2)
Requirement already satisfied: python-dateutil in c:\users\skype\appdata\roaming\python\python312\site-packages (from kaggle) (2.9.0.post0)
Requirement already satisfied: requests in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from kaggle) (2.31.0)
Requirement already satisfied: tqdm in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from kaggle) (4.66.2)
Requirement already satisfied: python-slugify in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from kaggle) (8.0.4)
Requirement already satisfied: urllib3 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from kaggle) (2.2.1)
Requirement already satisfied: bleach in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from kaggle) (6.1.0)
Requirement already satisfied: webencodings in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from bleach->kaggle) (0.5.1)
Requirement already satisfied: text-unidecode>=1.3 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from python-slugify->kaggle) (1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from requests->kaggle) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from requests->kaggle) (3.6)
Requirement already satisfied: colorama in c:\users\skype\appdata\roaming\python\python312\site-packages (from tqdm->kaggle) (0.4.6)
Note: you may need to restart the kernel to use updated packages.
Requirement already satisfied: pandas in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (2.2.1)
Requirement already satisfied: numpy<2,>=1.26.0 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from pandas) (1.26.3)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\skype\appdata\roaming\python\python312\site-packages (from pandas) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from pandas) (2024.1)
Requirement already satisfied: tzdata>=2022.7 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from pandas) (2024.1)
Requirement already satisfied: six>=1.5 in c:\users\skype\appdata\roaming\python\python312\site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
Note: you may need to restart the kernel to use updated packages.
Requirement already satisfied: numpy in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (1.26.3)
Note: you may need to restart the kernel to use updated packages.
Requirement already satisfied: scikit-learn in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (1.4.1.post1)
Requirement already satisfied: numpy<2.0,>=1.19.5 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from scikit-learn) (1.26.3)
Requirement already satisfied: scipy>=1.6.0 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from scikit-learn) (1.12.0)
Requirement already satisfied: joblib>=1.2.0 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from scikit-learn) (1.3.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\skype\appdata\local\programs\python\python312\lib\site-packages (from scikit-learn) (3.3.0)
Note: you may need to restart the kernel to use updated packages.
Importing libraries
Downloading a dataset
creditcardfraud.zip: Skipping, found more recently modified local copy (use --force to force download)
Uncompress a file
Archive: creditcardfraud.zip
inflating: creditcard.csv
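The output above looks like that of a command-line `unzip` call; the notebook's actual cell is not shown. A minimal Python sketch of the same step, using only the standard library, is given here. The helper name `extract_archive` is an illustrative assumption.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of a zip archive into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Only runs when the archive named in the output above is present locally.
if Path("creditcardfraud.zip").exists():
    print(extract_archive("creditcardfraud.zip"))
```

Returning the member names makes it easy to confirm that `creditcard.csv` was written before loading it.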
Load the data
Check missing values
Time 0
V1 0
V2 0
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
Amount 0
Class 0
dtype: int64
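The cell that produced this per-column count is not shown. A minimal sketch of how such a count could be obtained with pandas follows; the helper name `count_missing` and the tiny demo frame (made-up values, not the credit card data) are illustrative assumptions.

```python
import pandas as pd


def count_missing(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing (NaN/None) values in each column."""
    return df.isnull().sum()


# Tiny made-up frame with one deliberately missing Amount value.
demo = pd.DataFrame(
    {"Time": [0.0, 1.0, 2.0], "Amount": [9.99, None, 3.50], "Class": [0, 0, 1]}
)
missing = count_missing(demo)
print(missing)
```

On the real dataset every column reports 0, so no imputation or row dropping is needed.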
Size of the dataset
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 284807 entries, 0 to 284806
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 284807 non-null float64
1 V1 284807 non-null float64
2 V2 284807 non-null float64
3 V3 284807 non-null float64
4 V4 284807 non-null float64
5 V5 284807 non-null float64
6 V6 284807 non-null float64
7 V7 284807 non-null float64
8 V8 284807 non-null float64
9 V9 284807 non-null float64
10 V10 284807 non-null float64
11 V11 284807 non-null float64
12 V12 284807 non-null float64
13 V13 284807 non-null float64
14 V14 284807 non-null float64
15 V15 284807 non-null float64
16 V16 284807 non-null float64
17 V17 284807 non-null float64
18 V18 284807 non-null float64
19 V19 284807 non-null float64
20 V20 284807 non-null float64
21 V21 284807 non-null float64
22 V22 284807 non-null float64
23 V23 284807 non-null float64
24 V24 284807 non-null float64
25 V25 284807 non-null float64
26 V26 284807 non-null float64
27 V27 284807 non-null float64
28 V28 284807 non-null float64
29 Amount 284807 non-null float64
30 Class 284807 non-null int64
dtypes: float64(30), int64(1)
memory usage: 67.4 MB
Normalising the data
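The normalisation cell itself is not shown. In the summary statistics, `Amount` has a mean close to 0 and a standard deviation of 1.000002, while `Time` keeps its raw scale, which is consistent with standardising `Amount` only. The sketch below assumes scikit-learn's `StandardScaler` was used; that choice, the helper name `standardise_columns`, and the demo values are assumptions.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def standardise_columns(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Return a copy of df with the given columns rescaled to zero mean and unit variance."""
    out = df.copy()
    out[columns] = StandardScaler().fit_transform(out[columns])
    return out


# Made-up values for illustration only.
demo = pd.DataFrame({"Amount": [1.0, 2.0, 3.0, 4.0, 5.0], "Class": [0, 0, 0, 0, 1]})
scaled = standardise_columns(demo, ["Amount"])
print(scaled["Amount"].mean(), scaled["Amount"].std(ddof=0))
```

`StandardScaler` divides by the population standard deviation, whereas `describe()` reports the sample standard deviation. The two differ by a factor of sqrt(n / (n - 1)), which for n = 284807 is about 1.000002.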
Summary statistics
|
Time |
V1 |
V2 |
V3 |
V4 |
V5 |
V6 |
V7 |
V8 |
V9 |
V10 |
V11 |
V12 |
V13 |
V14 |
V15 |
V16 |
V17 |
V18 |
V19 |
V20 |
V21 |
V22 |
V23 |
V24 |
V25 |
V26 |
V27 |
V28 |
Amount |
Class |
count |
284807.000000 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
2.848070e+05 |
284807.000000 |
mean |
94813.859575 |
1.168375e-15 |
3.416908e-16 |
-1.379537e-15 |
2.074095e-15 |
9.604066e-16 |
1.487313e-15 |
-5.556467e-16 |
1.213481e-16 |
-2.406331e-15 |
2.239053e-15 |
1.673327e-15 |
-1.247012e-15 |
8.190001e-16 |
1.207294e-15 |
4.887456e-15 |
1.437716e-15 |
-3.772171e-16 |
9.564149e-16 |
1.039917e-15 |
6.406204e-16 |
1.654067e-16 |
-3.568593e-16 |
2.578648e-16 |
4.473266e-15 |
5.340915e-16 |
1.683437e-15 |
-3.660091e-16 |
-1.227390e-16 |
2.913952e-17 |
0.001727 |
std |
47488.145955 |
1.958696e+00 |
1.651309e+00 |
1.516255e+00 |
1.415869e+00 |
1.380247e+00 |
1.332271e+00 |
1.237094e+00 |
1.194353e+00 |
1.098632e+00 |
1.088850e+00 |
1.020713e+00 |
9.992014e-01 |
9.952742e-01 |
9.585956e-01 |
9.153160e-01 |
8.762529e-01 |
8.493371e-01 |
8.381762e-01 |
8.140405e-01 |
7.709250e-01 |
7.345240e-01 |
7.257016e-01 |
6.244603e-01 |
6.056471e-01 |
5.212781e-01 |
4.822270e-01 |
4.036325e-01 |
3.300833e-01 |
1.000002e+00 |
0.041527 |
min |
0.000000 |
-5.640751e+01 |
-7.271573e+01 |
-4.832559e+01 |
-5.683171e+00 |
-1.137433e+02 |
-2.616051e+01 |
-4.355724e+01 |
-7.321672e+01 |
-1.343407e+01 |
-2.458826e+01 |
-4.797473e+00 |
-1.868371e+01 |
-5.791881e+00 |
-1.921433e+01 |
-4.498945e+00 |
-1.412985e+01 |
-2.516280e+01 |
-9.498746e+00 |
-7.213527e+00 |
-5.449772e+01 |
-3.483038e+01 |
-1.093314e+01 |
-4.480774e+01 |
-2.836627e+00 |
-1.029540e+01 |
-2.604551e+00 |
-2.256568e+01 |
-1.543008e+01 |
-3.532294e-01 |
0.000000 |
25% |
54201.500000 |
-9.203734e-01 |
-5.985499e-01 |
-8.903648e-01 |
-8.486401e-01 |
-6.915971e-01 |
-7.682956e-01 |
-5.540759e-01 |
-2.086297e-01 |
-6.430976e-01 |
-5.354257e-01 |
-7.624942e-01 |
-4.055715e-01 |
-6.485393e-01 |
-4.255740e-01 |
-5.828843e-01 |
-4.680368e-01 |
-4.837483e-01 |
-4.988498e-01 |
-4.562989e-01 |
-2.117214e-01 |
-2.283949e-01 |
-5.423504e-01 |
-1.618463e-01 |
-3.545861e-01 |
-3.171451e-01 |
-3.269839e-01 |
-7.083953e-02 |
-5.295979e-02 |
-3.308401e-01 |
0.000000 |
50% |
84692.000000 |
1.810880e-02 |
6.548556e-02 |
1.798463e-01 |
-1.984653e-02 |
-5.433583e-02 |
-2.741871e-01 |
4.010308e-02 |
2.235804e-02 |
-5.142873e-02 |
-9.291738e-02 |
-3.275735e-02 |
1.400326e-01 |
-1.356806e-02 |
5.060132e-02 |
4.807155e-02 |
6.641332e-02 |
-6.567575e-02 |
-3.636312e-03 |
3.734823e-03 |
-6.248109e-02 |
-2.945017e-02 |
6.781943e-03 |
-1.119293e-02 |
4.097606e-02 |
1.659350e-02 |
-5.213911e-02 |
1.342146e-03 |
1.124383e-02 |
-2.652715e-01 |
0.000000 |
75% |
139320.500000 |
1.315642e+00 |
8.037239e-01 |
1.027196e+00 |
7.433413e-01 |
6.119264e-01 |
3.985649e-01 |
5.704361e-01 |
3.273459e-01 |
5.971390e-01 |
4.539234e-01 |
7.395934e-01 |
6.182380e-01 |
6.625050e-01 |
4.931498e-01 |
6.488208e-01 |
5.232963e-01 |
3.996750e-01 |
5.008067e-01 |
4.589494e-01 |
1.330408e-01 |
1.863772e-01 |
5.285536e-01 |
1.476421e-01 |
4.395266e-01 |
3.507156e-01 |
2.409522e-01 |
9.104512e-02 |
7.827995e-02 |
-4.471707e-02 |
0.000000 |
max |
172792.000000 |
2.454930e+00 |
2.205773e+01 |
9.382558e+00 |
1.687534e+01 |
3.480167e+01 |
7.330163e+01 |
1.205895e+02 |
2.000721e+01 |
1.559499e+01 |
2.374514e+01 |
1.201891e+01 |
7.848392e+00 |
7.126883e+00 |
1.052677e+01 |
8.877742e+00 |
1.731511e+01 |
9.253526e+00 |
5.041069e+00 |
5.591971e+00 |
3.942090e+01 |
2.720284e+01 |
1.050309e+01 |
2.252841e+01 |
4.584549e+00 |
7.519589e+00 |
3.517346e+00 |
3.161220e+01 |
3.384781e+01 |
1.023622e+02 |
1.000000 |
Distribution of legitimate and fraudulent transactions
Class
0 284315
1 492
Name: count, dtype: int64
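To put these two counts in proportion, the short calculation below derives the fraud share and the imbalance ratio directly from the numbers printed above. It is a worked illustration, not a cell from the original notebook.

```python
# Class counts copied from the value counts printed above.
legit, fraud = 284315, 492

total = legit + fraud
fraud_share = fraud / total      # fraction of all transactions that are fraudulent
imbalance_ratio = legit / fraud  # legitimate transactions per fraudulent one

print(f"total={total}, fraud share={fraud_share:.6f}, ratio={imbalance_ratio:.1f}")
```

The fraud share agrees with the mean of the `Class` column in the summary statistics (0.001727), as expected for a 0/1 label.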
Undersampling the data
Legitimate transactions (284,315) vastly outnumber fraudulent ones (492), so we undersample the majority class until both classes are the same size.
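The undersampling cell is not shown. One common way to do it, sketched here under that caveat, is to draw a random sample from each class equal in size to the smallest class. The helper name `undersample`, the `random_state` value, and the demo frame are illustrative assumptions.

```python
import pandas as pd


def undersample(df: pd.DataFrame, label: str = "Class", random_state: int = 0) -> pd.DataFrame:
    """Randomly downsample every class to the size of the smallest class."""
    n_min = df[label].value_counts().min()
    parts = [
        group.sample(n=n_min, random_state=random_state)
        for _, group in df.groupby(label)
    ]
    # Original row labels are kept, so rows can be traced back to the full data.
    return pd.concat(parts)


# Made-up frame: 20 rows of class 0 and 3 rows of class 1.
demo = pd.DataFrame({"Amount": range(23), "Class": [0] * 20 + [1] * 3})
balanced = undersample(demo)
print(balanced["Class"].value_counts())
```

Every minority-class row is retained; only the majority class loses rows, which is why the undersampled credit card data ends up with 492 rows per class.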
Size of the undersampled dataset
<class 'pandas.core.frame.DataFrame'>
Index: 984 entries, 541 to 141412
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 984 non-null float64
1 V1 984 non-null float64
2 V2 984 non-null float64
3 V3 984 non-null float64
4 V4 984 non-null float64
5 V5 984 non-null float64
6 V6 984 non-null float64
7 V7 984 non-null float64
8 V8 984 non-null float64
9 V9 984 non-null float64
10 V10 984 non-null float64
11 V11 984 non-null float64
12 V12 984 non-null float64
13 V13 984 non-null float64
14 V14 984 non-null float64
15 V15 984 non-null float64
16 V16 984 non-null float64
17 V17 984 non-null float64
18 V18 984 non-null float64
19 V19 984 non-null float64
20 V20 984 non-null float64
21 V21 984 non-null float64
22 V22 984 non-null float64
23 V23 984 non-null float64
24 V24 984 non-null float64
25 V25 984 non-null float64
26 V26 984 non-null float64
27 V27 984 non-null float64
28 V28 984 non-null float64
29 Amount 984 non-null float64
30 Class 984 non-null int64
dtypes: float64(30), int64(1)
memory usage: 246.0 KB
Summary statistics of the undersampled dataset
|
Time |
V1 |
V2 |
V3 |
V4 |
V5 |
V6 |
V7 |
V8 |
V9 |
V10 |
V11 |
V12 |
V13 |
V14 |
V15 |
V16 |
V17 |
V18 |
V19 |
V20 |
V21 |
V22 |
V23 |
V24 |
V25 |
V26 |
V27 |
V28 |
Amount |
Class |
count |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
984.000000 |
mean |
88501.498984 |
-2.445079 |
1.781022 |
-3.509406 |
2.214004 |
-1.477993 |
-0.713150 |
-2.787427 |
0.279073 |
-1.253108 |
-2.841500 |
1.930697 |
-3.124120 |
-0.026229 |
-3.502384 |
-0.039494 |
-2.097294 |
-3.304208 |
-1.128950 |
0.343668 |
0.175905 |
0.331911 |
0.049631 |
-0.031264 |
-0.037389 |
0.022812 |
0.027632 |
0.086286 |
0.046738 |
0.039676 |
0.500000 |
std |
48996.269445 |
5.512352 |
3.713232 |
6.223001 |
3.231076 |
4.274632 |
1.789350 |
5.856197 |
4.857643 |
2.371055 |
4.563067 |
2.764745 |
4.595103 |
1.054377 |
4.653202 |
1.002911 |
3.465619 |
5.990033 |
2.412032 |
1.290973 |
1.126258 |
2.787884 |
1.167097 |
1.177562 |
0.551518 |
0.677541 |
0.476480 |
1.023332 |
0.479168 |
0.851800 |
0.500254 |
min |
60.000000 |
-30.552380 |
-15.799625 |
-31.103685 |
-3.863126 |
-22.105532 |
-10.261990 |
-43.557242 |
-41.044261 |
-13.434066 |
-24.588262 |
-2.613374 |
-18.683715 |
-3.223045 |
-19.214325 |
-4.498945 |
-14.129855 |
-25.162799 |
-9.498746 |
-3.681904 |
-7.242879 |
-22.797604 |
-8.887017 |
-19.254328 |
-2.028024 |
-4.781606 |
-1.214960 |
-7.263482 |
-2.735623 |
-0.353229 |
0.000000 |
25% |
45531.000000 |
-2.867222 |
-0.155438 |
-5.084967 |
-0.172018 |
-1.700260 |
-1.619179 |
-3.066415 |
-0.204192 |
-2.279453 |
-4.572043 |
-0.187147 |
-5.495221 |
-0.784589 |
-6.721799 |
-0.627097 |
-3.543426 |
-5.302111 |
-1.809496 |
-0.412430 |
-0.187708 |
-0.157259 |
-0.509376 |
-0.240064 |
-0.379825 |
-0.321251 |
-0.281187 |
-0.061809 |
-0.050194 |
-0.347302 |
0.000000 |
50% |
83076.500000 |
-0.823244 |
0.957399 |
-1.381998 |
1.287041 |
-0.394605 |
-0.689473 |
-0.668321 |
0.147397 |
-0.694910 |
-0.948441 |
1.170286 |
-0.858094 |
-0.000686 |
-1.110717 |
-0.006070 |
-0.677801 |
-0.513640 |
-0.383038 |
0.221049 |
0.040630 |
0.155404 |
0.080270 |
-0.030318 |
0.009379 |
0.049923 |
-0.007475 |
0.063100 |
0.039464 |
-0.280984 |
0.500000 |
75% |
135051.500000 |
0.919444 |
2.791569 |
0.356911 |
4.175332 |
0.616305 |
0.069620 |
0.265089 |
0.877002 |
0.134399 |
-0.016047 |
3.586502 |
0.190356 |
0.683977 |
0.110541 |
0.672903 |
0.250353 |
0.313841 |
0.334927 |
0.978754 |
0.445616 |
0.642724 |
0.624948 |
0.180735 |
0.365624 |
0.395001 |
0.324059 |
0.457194 |
0.226492 |
0.046539 |
1.000000 |
max |
172733.000000 |
2.335833 |
22.057729 |
3.476268 |
12.114672 |
14.103918 |
6.474115 |
5.802537 |
20.007208 |
6.816732 |
11.732926 |
12.018913 |
2.534876 |
3.091328 |
3.442422 |
2.471358 |
3.139656 |
6.739384 |
3.790316 |
5.228342 |
11.059004 |
27.202839 |
8.361985 |
5.466230 |
1.208141 |
2.208209 |
2.745261 |
3.052358 |
4.975792 |
8.146182 |
1.000000 |
Distribution of legitimate and fraudulent transactions in the undersampled dataset
Class
1 492
0 492
Name: count, dtype: int64
Splitting the whole dataset into training and test datasets
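The splitting cell is not shown. The reported sizes (199364 training rows and 85443 test rows out of 284807, and likewise 688 and 296 out of 984) are consistent with a 70/30 split of the whole frame, label column included. The sketch below assumes scikit-learn's `train_test_split` with `test_size=0.3`; the `random_state` value and the helper name `split_frame` are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_frame(df: pd.DataFrame, test_size: float = 0.3, random_state: int = 42):
    """Split a frame row-wise into (train, test), keeping all columns together."""
    return train_test_split(df, test_size=test_size, random_state=random_state)


# Made-up frame with the same number of rows as the undersampled dataset.
demo = pd.DataFrame({"Amount": range(984), "Class": [0] * 492 + [1] * 492})
train_df, test_df = split_frame(demo)
print(len(train_df), len(test_df))
```

`train_test_split` rounds the test share up, so 984 rows give 296 test rows and 688 training rows, matching the sizes reported for the undersampled split.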
Statistical measures of the training dataset (whole data)
<class 'pandas.core.frame.DataFrame'>
Index: 199364 entries, 161145 to 117952
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 199364 non-null float64
1 V1 199364 non-null float64
2 V2 199364 non-null float64
3 V3 199364 non-null float64
4 V4 199364 non-null float64
5 V5 199364 non-null float64
6 V6 199364 non-null float64
7 V7 199364 non-null float64
8 V8 199364 non-null float64
9 V9 199364 non-null float64
10 V10 199364 non-null float64
11 V11 199364 non-null float64
12 V12 199364 non-null float64
13 V13 199364 non-null float64
14 V14 199364 non-null float64
15 V15 199364 non-null float64
16 V16 199364 non-null float64
17 V17 199364 non-null float64
18 V18 199364 non-null float64
19 V19 199364 non-null float64
20 V20 199364 non-null float64
21 V21 199364 non-null float64
22 V22 199364 non-null float64
23 V23 199364 non-null float64
24 V24 199364 non-null float64
25 V25 199364 non-null float64
26 V26 199364 non-null float64
27 V27 199364 non-null float64
28 V28 199364 non-null float64
29 Amount 199364 non-null float64
30 Class 199364 non-null int64
dtypes: float64(30), int64(1)
memory usage: 48.7 MB
|
Time |
V1 |
V2 |
V3 |
V4 |
V5 |
V6 |
V7 |
V8 |
V9 |
V10 |
V11 |
V12 |
V13 |
V14 |
V15 |
V16 |
V17 |
V18 |
V19 |
V20 |
V21 |
V22 |
V23 |
V24 |
V25 |
V26 |
V27 |
V28 |
Amount |
Class |
count |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
199364.000000 |
mean |
94799.493936 |
0.000315 |
-0.002690 |
-0.001532 |
0.000721 |
-0.001494 |
-0.000210 |
-0.000870 |
-0.001980 |
0.000212 |
0.001357 |
-0.001039 |
-0.001565 |
0.000693 |
0.000137 |
0.000322 |
0.000084 |
0.000292 |
-0.000134 |
0.000490 |
0.000430 |
-0.000014 |
-0.000022 |
-0.000258 |
0.000362 |
0.000395 |
-0.000094 |
-0.000027 |
0.000015 |
0.001271 |
0.001731 |
std |
47499.835491 |
1.963554 |
1.657379 |
1.516716 |
1.417138 |
1.368744 |
1.328673 |
1.226018 |
1.212338 |
1.102021 |
1.092801 |
1.020027 |
0.996526 |
0.997718 |
0.956938 |
0.916143 |
0.876131 |
0.852181 |
0.837556 |
0.814506 |
0.770257 |
0.743450 |
0.727625 |
0.629145 |
0.605298 |
0.521175 |
0.481842 |
0.401042 |
0.324849 |
0.983948 |
0.041563 |
min |
0.000000 |
-46.855047 |
-63.344698 |
-33.680984 |
-5.560118 |
-42.147898 |
-23.496714 |
-43.557242 |
-73.216718 |
-13.434066 |
-24.588262 |
-4.797473 |
-17.769143 |
-5.791881 |
-19.214325 |
-4.498945 |
-14.129855 |
-25.162799 |
-9.498746 |
-7.213527 |
-23.646890 |
-34.830382 |
-10.933144 |
-44.807735 |
-2.822684 |
-10.295397 |
-2.534330 |
-22.565679 |
-11.710896 |
-0.353229 |
0.000000 |
25% |
54126.000000 |
-0.921539 |
-0.601213 |
-0.892838 |
-0.848835 |
-0.692874 |
-0.769177 |
-0.554220 |
-0.209086 |
-0.644753 |
-0.535493 |
-0.762852 |
-0.407660 |
-0.648456 |
-0.425122 |
-0.583616 |
-0.467945 |
-0.484055 |
-0.498850 |
-0.456800 |
-0.211662 |
-0.229272 |
-0.544345 |
-0.162021 |
-0.354179 |
-0.316088 |
-0.327327 |
-0.070864 |
-0.052907 |
-0.330640 |
0.000000 |
50% |
84633.500000 |
0.019705 |
0.063784 |
0.177888 |
-0.017852 |
-0.055832 |
-0.274397 |
0.039228 |
0.021803 |
-0.049633 |
-0.092069 |
-0.034135 |
0.137912 |
-0.013416 |
0.051179 |
0.049289 |
0.067772 |
-0.065113 |
-0.003217 |
0.004422 |
-0.062889 |
-0.029045 |
0.006744 |
-0.010915 |
0.040974 |
0.018014 |
-0.052287 |
0.001064 |
0.011119 |
-0.265271 |
0.000000 |
75% |
139334.250000 |
1.316707 |
0.802437 |
1.025529 |
0.745566 |
0.609349 |
0.397928 |
0.569638 |
0.327023 |
0.597096 |
0.458129 |
0.738143 |
0.617393 |
0.664148 |
0.493925 |
0.649589 |
0.523095 |
0.401034 |
0.500436 |
0.460367 |
0.132834 |
0.187095 |
0.531017 |
0.147503 |
0.438953 |
0.350802 |
0.241082 |
0.090491 |
0.077989 |
-0.043058 |
0.000000 |
max |
172792.000000 |
2.451888 |
22.057729 |
9.382558 |
16.715537 |
34.099309 |
23.917837 |
44.054461 |
20.007208 |
15.594995 |
23.745136 |
12.018913 |
7.848392 |
4.569009 |
10.526766 |
5.825654 |
7.059132 |
9.207059 |
5.041069 |
5.572113 |
39.420904 |
27.202839 |
10.503090 |
22.528412 |
4.022866 |
7.519589 |
3.463246 |
12.152401 |
22.620072 |
78.235272 |
1.000000 |
Class
0 199019
1 345
Name: count, dtype: int64
Statistical measures of the test dataset (whole data)
<class 'pandas.core.frame.DataFrame'>
Index: 85443 entries, 183484 to 240913
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 85443 non-null float64
1 V1 85443 non-null float64
2 V2 85443 non-null float64
3 V3 85443 non-null float64
4 V4 85443 non-null float64
5 V5 85443 non-null float64
6 V6 85443 non-null float64
7 V7 85443 non-null float64
8 V8 85443 non-null float64
9 V9 85443 non-null float64
10 V10 85443 non-null float64
11 V11 85443 non-null float64
12 V12 85443 non-null float64
13 V13 85443 non-null float64
14 V14 85443 non-null float64
15 V15 85443 non-null float64
16 V16 85443 non-null float64
17 V17 85443 non-null float64
18 V18 85443 non-null float64
19 V19 85443 non-null float64
20 V20 85443 non-null float64
21 V21 85443 non-null float64
22 V22 85443 non-null float64
23 V23 85443 non-null float64
24 V24 85443 non-null float64
25 V25 85443 non-null float64
26 V26 85443 non-null float64
27 V27 85443 non-null float64
28 V28 85443 non-null float64
29 Amount 85443 non-null float64
30 Class 85443 non-null int64
dtypes: float64(30), int64(1)
memory usage: 20.9 MB
|
Time |
V1 |
V2 |
V3 |
V4 |
V5 |
V6 |
V7 |
V8 |
V9 |
V10 |
V11 |
V12 |
V13 |
V14 |
V15 |
V16 |
V17 |
V18 |
V19 |
V20 |
V21 |
V22 |
V23 |
V24 |
V25 |
V26 |
V27 |
V28 |
Amount |
Class |
count |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
85443.000000 |
mean |
94847.378896 |
-0.000734 |
0.006277 |
0.003574 |
-0.001682 |
0.003486 |
0.000489 |
0.002030 |
0.004620 |
-0.000495 |
-0.003167 |
0.002424 |
0.003652 |
-0.001616 |
-0.000319 |
-0.000751 |
-0.000195 |
-0.000682 |
0.000312 |
-0.001144 |
-0.001004 |
0.000033 |
0.000052 |
0.000602 |
-0.000845 |
-0.000922 |
0.000220 |
0.000062 |
-0.000036 |
-0.002966 |
0.001720 |
std |
47461.120548 |
1.947325 |
1.637050 |
1.515182 |
1.412908 |
1.406722 |
1.340636 |
1.262562 |
1.151291 |
1.090691 |
1.079574 |
1.022315 |
1.005413 |
0.989553 |
0.962457 |
0.913388 |
0.876542 |
0.842669 |
0.839626 |
0.812957 |
0.772484 |
0.713266 |
0.721198 |
0.613394 |
0.606464 |
0.521520 |
0.483126 |
0.409616 |
0.341987 |
1.036492 |
0.041443 |
min |
0.000000 |
-56.407510 |
-72.715728 |
-48.325589 |
-5.683171 |
-113.743307 |
-26.160506 |
-28.215112 |
-50.943369 |
-9.481456 |
-20.949192 |
-4.568390 |
-18.683715 |
-3.888606 |
-18.493773 |
-4.391307 |
-13.303888 |
-22.883999 |
-9.287832 |
-6.938297 |
-54.497720 |
-22.665685 |
-9.499423 |
-32.828995 |
-2.836627 |
-8.696627 |
-2.604551 |
-9.793568 |
-15.430084 |
-0.353229 |
0.000000 |
25% |
54354.000000 |
-0.916858 |
-0.591858 |
-0.883828 |
-0.848202 |
-0.688280 |
-0.766664 |
-0.553479 |
-0.207216 |
-0.638926 |
-0.535400 |
-0.761716 |
-0.400087 |
-0.648761 |
-0.426516 |
-0.581015 |
-0.468312 |
-0.483139 |
-0.498660 |
-0.455027 |
-0.211881 |
-0.226184 |
-0.537704 |
-0.161490 |
-0.355671 |
-0.319736 |
-0.326068 |
-0.070797 |
-0.053129 |
-0.331280 |
0.000000 |
50% |
84850.000000 |
0.013238 |
0.070185 |
0.185047 |
-0.024109 |
-0.051627 |
-0.273686 |
0.042343 |
0.023782 |
-0.053821 |
-0.094949 |
-0.029129 |
0.144948 |
-0.013803 |
0.049248 |
0.045291 |
0.062957 |
-0.066955 |
-0.004245 |
0.002229 |
-0.061529 |
-0.030687 |
0.006971 |
-0.011789 |
0.040976 |
0.013508 |
-0.051695 |
0.001984 |
0.011561 |
-0.265271 |
0.000000 |
75% |
139277.500000 |
1.313257 |
0.806615 |
1.031155 |
0.737784 |
0.618067 |
0.399864 |
0.572423 |
0.328337 |
0.597388 |
0.443126 |
0.743511 |
0.620694 |
0.657826 |
0.491916 |
0.647117 |
0.523608 |
0.396799 |
0.501455 |
0.455249 |
0.133608 |
0.184846 |
0.523689 |
0.147923 |
0.441093 |
0.350617 |
0.240657 |
0.092224 |
0.078900 |
-0.047356 |
0.000000 |
max |
172788.000000 |
2.454930 |
15.876923 |
4.079168 |
16.875344 |
34.801666 |
73.301626 |
120.589494 |
18.748872 |
9.272376 |
15.331742 |
11.669205 |
4.406338 |
7.126883 |
7.439566 |
8.877742 |
17.315112 |
9.253526 |
4.712398 |
5.591971 |
38.117209 |
22.579714 |
7.220158 |
20.803344 |
4.584549 |
5.826159 |
3.517346 |
31.612198 |
33.847808 |
102.362243 |
1.000000 |
Class
0 85296
1 147
Name: count, dtype: int64
Splitting the undersampled data into training and test datasets
Statistical measures of the training dataset (undersampled data)
<class 'pandas.core.frame.DataFrame'>
Index: 688 entries, 6870 to 208266
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 688 non-null float64
1 V1 688 non-null float64
2 V2 688 non-null float64
3 V3 688 non-null float64
4 V4 688 non-null float64
5 V5 688 non-null float64
6 V6 688 non-null float64
7 V7 688 non-null float64
8 V8 688 non-null float64
9 V9 688 non-null float64
10 V10 688 non-null float64
11 V11 688 non-null float64
12 V12 688 non-null float64
13 V13 688 non-null float64
14 V14 688 non-null float64
15 V15 688 non-null float64
16 V16 688 non-null float64
17 V17 688 non-null float64
18 V18 688 non-null float64
19 V19 688 non-null float64
20 V20 688 non-null float64
21 V21 688 non-null float64
22 V22 688 non-null float64
23 V23 688 non-null float64
24 V24 688 non-null float64
25 V25 688 non-null float64
26 V26 688 non-null float64
27 V27 688 non-null float64
28 V28 688 non-null float64
29 Amount 688 non-null float64
30 Class 688 non-null int64
dtypes: float64(30), int64(1)
memory usage: 172.0 KB
|
Time |
V1 |
V2 |
V3 |
V4 |
V5 |
V6 |
V7 |
V8 |
V9 |
V10 |
V11 |
V12 |
V13 |
V14 |
V15 |
V16 |
V17 |
V18 |
V19 |
V20 |
V21 |
V22 |
V23 |
V24 |
V25 |
V26 |
V27 |
V28 |
Amount |
Class |
count |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
688.000000 |
mean |
88546.635174 |
-2.443642 |
1.748210 |
-3.490693 |
2.161294 |
-1.466909 |
-0.737723 |
-2.759190 |
0.361773 |
-1.222417 |
-2.808144 |
1.937783 |
-3.131850 |
-0.001132 |
-3.568854 |
-0.022936 |
-2.145811 |
-3.365430 |
-1.137238 |
0.377690 |
0.127157 |
0.446495 |
0.012945 |
-0.069031 |
-0.020203 |
0.031782 |
0.022154 |
0.114684 |
0.041557 |
0.036592 |
0.501453 |
std |
48529.661753 |
5.382638 |
3.616426 |
6.020391 |
3.198221 |
4.227553 |
1.829535 |
5.498995 |
4.741154 |
2.336555 |
4.417548 |
2.771137 |
4.560753 |
1.081826 |
4.641960 |
0.981683 |
3.458663 |
6.062216 |
2.462689 |
1.287256 |
1.072960 |
2.749354 |
1.143940 |
1.283882 |
0.549485 |
0.689015 |
0.474411 |
0.923161 |
0.487077 |
0.834360 |
0.500362 |
min |
117.000000 |
-30.552380 |
-15.799625 |
-31.103685 |
-3.863126 |
-22.105532 |
-10.261990 |
-37.060311 |
-37.353443 |
-11.126624 |
-23.228255 |
-2.613374 |
-18.431131 |
-3.223045 |
-19.214325 |
-4.498945 |
-13.563273 |
-25.162799 |
-9.498746 |
-3.602657 |
-7.242879 |
-16.922016 |
-8.887017 |
-19.254328 |
-2.028024 |
-4.781606 |
-1.214960 |
-7.263482 |
-2.735623 |
-0.353229 |
0.000000 |
25% |
45531.000000 |
-2.867222 |
-0.164478 |
-5.049001 |
-0.212543 |
-1.703845 |
-1.691031 |
-3.105154 |
-0.220868 |
-2.205996 |
-4.731895 |
-0.194163 |
-5.643631 |
-0.767631 |
-6.767749 |
-0.562582 |
-3.612856 |
-5.277726 |
-1.816368 |
-0.373523 |
-0.197730 |
-0.142520 |
-0.510247 |
-0.246005 |
-0.373302 |
-0.320463 |
-0.281449 |
-0.061809 |
-0.050983 |
-0.346113 |
0.000000 |
50% |
82526.500000 |
-0.874057 |
0.984845 |
-1.482880 |
1.285768 |
-0.400360 |
-0.741307 |
-0.740952 |
0.141389 |
-0.694910 |
-0.981569 |
1.154879 |
-0.845463 |
0.008049 |
-1.132761 |
0.001558 |
-0.750918 |
-0.495063 |
-0.392743 |
0.246478 |
0.030556 |
0.163323 |
0.076684 |
-0.027143 |
0.014360 |
0.046511 |
-0.026232 |
0.059798 |
0.036635 |
-0.273188 |
1.000000 |
75% |
135096.750000 |
0.945582 |
2.850947 |
0.348579 |
4.166857 |
0.599892 |
0.033569 |
0.240843 |
0.919999 |
0.196633 |
-0.001047 |
3.625262 |
0.163104 |
0.744021 |
0.086669 |
0.665736 |
0.219809 |
0.314206 |
0.371481 |
0.978754 |
0.443495 |
0.680597 |
0.629109 |
0.174862 |
0.382076 |
0.406056 |
0.306403 |
0.482488 |
0.235549 |
0.046539 |
1.000000 |
max |
172573.000000 |
2.335833 |
19.167239 |
3.228978 |
11.927512 |
14.103918 |
6.355986 |
5.802537 |
20.007208 |
6.816732 |
11.732926 |
12.018913 |
2.534876 |
3.091328 |
3.442422 |
2.364199 |
3.139656 |
6.739384 |
3.790316 |
5.228342 |
7.907378 |
27.202839 |
5.774087 |
5.303607 |
1.208141 |
2.208209 |
2.745261 |
3.052358 |
4.975792 |
8.146182 |
1.000000 |
Class
1 345
0 343
Name: count, dtype: int64
Statistical measures of the test dataset (undersampled data)
<class 'pandas.core.frame.DataFrame'>
Index: 296 entries, 102782 to 57921
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 296 non-null float64
1 V1 296 non-null float64
2 V2 296 non-null float64
3 V3 296 non-null float64
4 V4 296 non-null float64
5 V5 296 non-null float64
6 V6 296 non-null float64
7 V7 296 non-null float64
8 V8 296 non-null float64
9 V9 296 non-null float64
10 V10 296 non-null float64
11 V11 296 non-null float64
12 V12 296 non-null float64
13 V13 296 non-null float64
14 V14 296 non-null float64
15 V15 296 non-null float64
16 V16 296 non-null float64
17 V17 296 non-null float64
18 V18 296 non-null float64
19 V19 296 non-null float64
20 V20 296 non-null float64
21 V21 296 non-null float64
22 V22 296 non-null float64
23 V23 296 non-null float64
24 V24 296 non-null float64
25 V25 296 non-null float64
26 V26 296 non-null float64
27 V27 296 non-null float64
28 V28 296 non-null float64
29 Amount 296 non-null float64
30 Class 296 non-null int64
dtypes: float64(30), int64(1)
memory usage: 74.0 KB
|
Time |
V1 |
V2 |
V3 |
V4 |
V5 |
V6 |
V7 |
V8 |
V9 |
V10 |
V11 |
V12 |
V13 |
V14 |
V15 |
V16 |
V17 |
V18 |
V19 |
V20 |
V21 |
V22 |
V23 |
V24 |
V25 |
V26 |
V27 |
V28 |
Amount |
Class |
count |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
296.000000 |
mean |
88396.587838 |
-2.448419 |
1.857288 |
-3.552900 |
2.336519 |
-1.503755 |
-0.656035 |
-2.853058 |
0.086851 |
-1.324446 |
-2.919028 |
1.914227 |
-3.106154 |
-0.084562 |
-3.347887 |
-0.077981 |
-1.984526 |
-3.161909 |
-1.109686 |
0.264590 |
0.289212 |
0.065582 |
0.134902 |
0.056521 |
-0.077336 |
0.001963 |
0.040364 |
0.020281 |
0.058781 |
0.046845 |
0.496622 |
std |
50147.105326 |
5.812072 |
3.934323 |
6.680660 |
3.308417 |
4.389263 |
1.693893 |
6.622008 |
5.121293 |
2.451914 |
4.891517 |
2.754439 |
4.681722 |
0.986937 |
4.683458 |
1.051296 |
3.484989 |
5.826410 |
2.293910 |
1.298310 |
1.235841 |
2.862463 |
1.216935 |
0.877975 |
0.555090 |
0.650752 |
0.481822 |
1.224166 |
0.460841 |
0.892432 |
0.500835 |
min |
60.000000 |
-29.876366 |
-8.402154 |
-30.558697 |
-2.956827 |
-21.665654 |
-5.773192 |
-43.557242 |
-41.044261 |
-13.434066 |
-24.588262 |
-2.383066 |
-18.683715 |
-3.076318 |
-17.620634 |
-3.092108 |
-14.129855 |
-22.541652 |
-9.090892 |
-3.681904 |
-5.225849 |
-22.797604 |
-8.887017 |
-5.988806 |
-1.742803 |
-2.079928 |
-1.170476 |
-7.263482 |
-1.931920 |
-0.353229 |
0.000000 |
25% |
45977.500000 |
-2.867766 |
-0.130600 |
-5.417818 |
-0.118496 |
-1.667035 |
-1.477544 |
-2.835885 |
-0.168935 |
-2.345829 |
-4.445615 |
-0.144802 |
-5.340188 |
-0.815218 |
-6.363108 |
-0.729637 |
-3.303237 |
-5.358990 |
-1.747789 |
-0.563676 |
-0.165023 |
-0.178103 |
-0.483530 |
-0.212828 |
-0.405811 |
-0.324214 |
-0.270853 |
-0.056831 |
-0.042639 |
-0.349231 |
0.000000 |
50% |
84069.000000 |
-0.740915 |
0.941852 |
-1.139964 |
1.340723 |
-0.369227 |
-0.596589 |
-0.501864 |
0.169642 |
-0.696902 |
-0.875521 |
1.267304 |
-0.938658 |
-0.060414 |
-1.059352 |
-0.012904 |
-0.547678 |
-0.527389 |
-0.318904 |
0.169827 |
0.056998 |
0.130060 |
0.081904 |
-0.035614 |
-0.010232 |
0.068890 |
0.031911 |
0.073702 |
0.046030 |
-0.300834 |
0.000000 |
75% |
135023.500000 |
0.879511 |
2.700371 |
0.394765 |
4.305361 |
0.624459 |
0.139244 |
0.306788 |
0.833392 |
0.011527 |
-0.051012 |
3.542336 |
0.234752 |
0.609629 |
0.173916 |
0.685300 |
0.351119 |
0.309636 |
0.237358 |
0.948371 |
0.461180 |
0.568611 |
0.617588 |
0.200328 |
0.317653 |
0.386804 |
0.355382 |
0.395412 |
0.192766 |
0.028048 |
1.000000 |
max |
172733.000000 |
2.306769 |
22.057729 |
3.476268 |
12.114672 |
9.880564 |
6.474115 |
3.791907 |
19.587773 |
4.866316 |
6.367661 |
11.152491 |
1.725185 |
2.897044 |
2.654275 |
2.471358 |
2.696475 |
6.443649 |
2.591846 |
4.851255 |
11.059004 |
27.202839 |
8.361985 |
5.466230 |
1.077407 |
2.156042 |
1.458828 |
2.706566 |
3.042406 |
5.663610 |
1.000000 |
Class
0 149
1 147
Name: count, dtype: int64