753 lines
82 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"# Dataset analysis, cleaning, and train/valid/test split"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 123,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": " app_id app_name \\\n6417086 99910 Puzzle Pirates \n6417087 99910 Puzzle Pirates \n6417088 99910 Puzzle Pirates \n6417089 99910 Puzzle Pirates \n6417090 99910 Puzzle Pirates \n6417091 99910 Puzzle Pirates \n6417092 99910 Puzzle Pirates \n6417093 99910 Puzzle Pirates \n6417094 99910 Puzzle Pirates \n6417095 99910 Puzzle Pirates \n6417096 99910 Puzzle Pirates \n6417097 99910 Puzzle Pirates \n6417098 99910 Puzzle Pirates \n6417099 99910 Puzzle Pirates \n6417100 99910 Puzzle Pirates \n6417101 99910 Puzzle Pirates \n6417102 99910 Puzzle Pirates \n6417103 99910 Puzzle Pirates \n6417104 99910 Puzzle Pirates \n6417105 99910 Puzzle Pirates \n\n review_text review_score \\\n6417086 Reminds me of the games I played in elementary... -1 \n6417087 I dont like this game -1 \n6417088 The actual game play of Puzzle Pirates is grea... -1 \n6417089 Rating based on current state of play, as per ... -1 \n6417090 This is just appalling. -1 \n6417091 Set my age as less than 5 by mistake. Apparent... -1 \n6417092 It is terrible because I cant ge on because of... -1 \n6417093 Was fun for the first 30 minutes or so, got bo... -1 \n6417094 The game is very awefull and strange. I think ... -1 \n6417095 A very good game, got sick of it after a while... -1 \n6417096 Imagine Bejeweled with a heavy grind based eco... -1 \n6417097 This game has some serious problems. First of ... -1 \n6417098 This game is good but also horrible. Its fun t... -1 \n6417099 A very good game, got sick of it after a while... -1 \n6417100 This game is good but also horrible. Its fun t... -1 \n6417101 I really ove this game but it needs somethings... -1 \n6417102 Used to play Puzzel Pirates 'way back when', b... -1 \n6417103 This game was aright, though a bit annoying. W... -1 \n6417104 I had a nice review to recommend this game, bu... -1 \n6417105 The puzzles in this game are fun, but you have... 
-1 \n\n review_votes \n6417086 0 \n6417087 0 \n6417088 0 \n6417089 0 \n6417090 0 \n6417091 0 \n6417092 0 \n6417093 0 \n6417094 0 \n6417095 1 \n6417096 0 \n6417097 0 \n6417098 0 \n6417099 1 \n6417100 0 \n6417101 0 \n6417102 0 \n6417103 0 \n6417104 0 \n6417105 0 ",
|
||
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>app_id</th>\n <th>app_name</th>\n <th>review_text</th>\n <th>review_score</th>\n <th>review_votes</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>6417086</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Reminds me of the games I played in elementary...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417087</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>I dont like this game</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417088</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>The actual game play of Puzzle Pirates is grea...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417089</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Rating based on current state of play, as per ...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417090</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>This is just appalling.</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417091</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Set my age as less than 5 by mistake. Apparent...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417092</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>It is terrible because I cant ge on because of...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417093</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Was fun for the first 30 minutes or so, got bo...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417094</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>The game is very awefull and strange. 
I think ...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417095</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>A very good game, got sick of it after a while...</td>\n <td>-1</td>\n <td>1</td>\n </tr>\n <tr>\n <th>6417096</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Imagine Bejeweled with a heavy grind based eco...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417097</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>This game has some serious problems. First of ...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417098</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>This game is good but also horrible. Its fun t...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417099</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>A very good game, got sick of it after a while...</td>\n <td>-1</td>\n <td>1</td>\n </tr>\n <tr>\n <th>6417100</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>This game is good but also horrible. Its fun t...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417101</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>I really ove this game but it needs somethings...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417102</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Used to play Puzzel Pirates 'way back when', b...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417103</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>This game was aright, though a bit annoying. W...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417104</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>I had a nice review to recommend this game, bu...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417105</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>The puzzles in this game are fun, but you have...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n</div>"
|
||
},
|
||
"execution_count": 123,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"import pandas as pd\n",
|
||
"import numpy as np\n",
|
||
"\n",
|
||
"dataset = pd.read_csv(\"dataset.csv\")\n",
|
||
"dataset.tail(20)"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 124,
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 6417106 entries, 0 to 6417105\n",
|
||
"Data columns (total 5 columns):\n",
|
||
" # Column Dtype \n",
|
||
"--- ------ ----- \n",
|
||
" 0 app_id int64 \n",
|
||
" 1 app_name object\n",
|
||
" 2 review_text object\n",
|
||
" 3 review_score int64 \n",
|
||
" 4 review_votes int64 \n",
|
||
"dtypes: int64(3), object(2)\n",
|
||
"memory usage: 244.8+ MB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset.info()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 125,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": " app_id review_score review_votes\ncount 6.417106e+06 6.417106e+06 6.417106e+06\nmean 2.274695e+05 6.394992e-01 1.472446e-01\nstd 1.260451e+05 7.687918e-01 3.543496e-01\nmin 1.000000e+01 -1.000000e+00 0.000000e+00\n25% 2.018100e+05 1.000000e+00 0.000000e+00\n50% 2.391600e+05 1.000000e+00 0.000000e+00\n75% 3.056200e+05 1.000000e+00 0.000000e+00\nmax 5.653400e+05 1.000000e+00 1.000000e+00",
|
||
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>app_id</th>\n <th>review_score</th>\n <th>review_votes</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>count</th>\n <td>6.417106e+06</td>\n <td>6.417106e+06</td>\n <td>6.417106e+06</td>\n </tr>\n <tr>\n <th>mean</th>\n <td>2.274695e+05</td>\n <td>6.394992e-01</td>\n <td>1.472446e-01</td>\n </tr>\n <tr>\n <th>std</th>\n <td>1.260451e+05</td>\n <td>7.687918e-01</td>\n <td>3.543496e-01</td>\n </tr>\n <tr>\n <th>min</th>\n <td>1.000000e+01</td>\n <td>-1.000000e+00</td>\n <td>0.000000e+00</td>\n </tr>\n <tr>\n <th>25%</th>\n <td>2.018100e+05</td>\n <td>1.000000e+00</td>\n <td>0.000000e+00</td>\n </tr>\n <tr>\n <th>50%</th>\n <td>2.391600e+05</td>\n <td>1.000000e+00</td>\n <td>0.000000e+00</td>\n </tr>\n <tr>\n <th>75%</th>\n <td>3.056200e+05</td>\n <td>1.000000e+00</td>\n <td>0.000000e+00</td>\n </tr>\n <tr>\n <th>max</th>\n <td>5.653400e+05</td>\n <td>1.000000e+00</td>\n <td>1.000000e+00</td>\n </tr>\n </tbody>\n</table>\n</div>"
|
||
},
|
||
"execution_count": 125,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset.describe()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### About 4.5x more positive reviews than negative ones"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### Removing missing values"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 126,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_score\n 1 5260420\n-1 1156686\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 126,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset[\"review_score\"].value_counts()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
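{
"cell_type": "markdown",
"source": [
"A minimal sketch of how the class imbalance could be quantified directly from the label counts. The names `counts` and `ratio` are introduced here for illustration only and are not part of the original analysis."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# Illustrative only: `counts` and `ratio` are hypothetical names\n",
"counts = dataset['review_score'].value_counts()\n",
"# Positive reviews are labelled 1, negative reviews -1\n",
"ratio = counts.loc[1] / counts.loc[-1]\n",
"print(f'positive/negative ratio: {ratio:.2f}')\n",
"# Share of each class as a fraction of all reviews\n",
"dataset['review_score'].value_counts(normalize=True)"
],
"metadata": {
"collapsed": false
}
},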
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 127,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "<AxesSubplot:xlabel='review_score'>"
|
||
},
|
||
"execution_count": 127,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": "<Figure size 640x480 with 1 Axes>",
|
||
"image/png": "<base64-encoded PNG omitted: bar chart of review_score value counts>"
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset[\"review_score\"].value_counts().plot(kind=\"bar\")"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 128,
|
||
"outputs": [],
|
||
"source": [
|
||
"dataset_without_na = dataset.dropna()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
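{
"cell_type": "markdown",
"source": [
"Before relying on `dropna()`, it can help to check which columns actually contain missing values. A small sketch, assuming `dataset` still holds the unfiltered frame loaded at the top of the notebook:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# Number of missing values in each column\n",
"print(dataset.isna().sum())\n",
"# Number of rows with at least one missing value, i.e. the rows dropna() removes by default\n",
"print(dataset.isna().any(axis=1).sum())"
],
"metadata": {
"collapsed": false
}
},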
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 129,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": " app_id app_name \\\n0 10 Counter-Strike \n1 10 Counter-Strike \n2 10 Counter-Strike \n3 10 Counter-Strike \n4 10 Counter-Strike \n... ... ... \n6417101 99910 Puzzle Pirates \n6417102 99910 Puzzle Pirates \n6417103 99910 Puzzle Pirates \n6417104 99910 Puzzle Pirates \n6417105 99910 Puzzle Pirates \n\n review_text review_score \\\n0 Ruined my life. 1 \n1 This will be more of a ''my experience with th... 1 \n2 This game saved my virginity. 1 \n3 • Do you like original games? • Do you like ga... 1 \n4 Easy to learn, hard to master. 1 \n... ... ... \n6417101 I really ove this game but it needs somethings... -1 \n6417102 Used to play Puzzel Pirates 'way back when', b... -1 \n6417103 This game was aright, though a bit annoying. W... -1 \n6417104 I had a nice review to recommend this game, bu... -1 \n6417105 The puzzles in this game are fun, but you have... -1 \n\n review_votes \n0 0 \n1 1 \n2 0 \n3 0 \n4 1 \n... ... \n6417101 0 \n6417102 0 \n6417103 0 \n6417104 0 \n6417105 0 \n\n[6226728 rows x 5 columns]",
|
||
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>app_id</th>\n <th>app_name</th>\n <th>review_text</th>\n <th>review_score</th>\n <th>review_votes</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>10</td>\n <td>Counter-Strike</td>\n <td>Ruined my life.</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>10</td>\n <td>Counter-Strike</td>\n <td>This will be more of a ''my experience with th...</td>\n <td>1</td>\n <td>1</td>\n </tr>\n <tr>\n <th>2</th>\n <td>10</td>\n <td>Counter-Strike</td>\n <td>This game saved my virginity.</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>10</td>\n <td>Counter-Strike</td>\n <td>• Do you like original games? • Do you like ga...</td>\n <td>1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>10</td>\n <td>Counter-Strike</td>\n <td>Easy to learn, hard to master.</td>\n <td>1</td>\n <td>1</td>\n </tr>\n <tr>\n <th>...</th>\n <td>...</td>\n <td>...</td>\n <td>...</td>\n <td>...</td>\n <td>...</td>\n </tr>\n <tr>\n <th>6417101</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>I really ove this game but it needs somethings...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417102</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>Used to play Puzzel Pirates 'way back when', b...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417103</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>This game was aright, though a bit annoying. 
W...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417104</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>I had a nice review to recommend this game, bu...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n <tr>\n <th>6417105</th>\n <td>99910</td>\n <td>Puzzle Pirates</td>\n <td>The puzzles in this game are fun, but you have...</td>\n <td>-1</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n<p>6226728 rows × 5 columns</p>\n</div>"
|
||
},
|
||
"execution_count": 129,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset_without_na"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 130,
|
||
"outputs": [],
|
||
"source": [
|
||
"dataset = dataset_without_na"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 131,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_score\n 1 5126132\n-1 1100596\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 131,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset[\"review_score\"].value_counts()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### Games with the most reviews in the dataset"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 132,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "app_name\nDayZ 88850\nPAYDAY 2 88783\nTerraria 84702\nRust 77037\nDota 2 73433\nRocket League 54188\nUndertale 51878\nLeft 4 Dead 2 50863\nWarframe 48164\nGrand Theft Auto V 42323\nRobocraft 41596\nStarbound 41141\nPortal 2 38796\nSpace Engineers 37453\nFallout: New Vegas 32918\nArma 3 32262\nThe Witcher 3: Wild Hunt 31830\nHeroes & Generals 31303\nBioShock Infinite 31076\nThe Forest 29998\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 132,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset['app_name'].value_counts().nlargest(20)"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### Reviews of games in early access show up as \"Early Access Review\", with no text"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 133,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_text\n Early Access Review 977399\n Early Access Review 10571\n10/10 6050\n. 4769\nGreat game 3662\ngreat game 3554\nGreat game! 2440\n:) 2093\nNice game 1793\nGreat Game 1659\n♥♥♥♥ 1645\nGreat game. 1633\ncool 1502\n... 1247\nits good 974\nGreat Game! 924\n9/10 889\n8/10 747\nGreat 746\ni love this game 720\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 133,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset['review_text'].value_counts().nlargest(20)"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 134,
|
||
"outputs": [],
|
||
"source": [
|
||
"dataset = dataset[~dataset['review_text'].str.contains(\"Early Access Review\", regex=False)]"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
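{
"cell_type": "markdown",
"source": [
"A substring match would also drop genuine reviews that merely mention the phrase. One possible alternative, sketched here on a made-up toy frame (`toy`, its rows, and `is_placeholder` are hypothetical), is to compare the trimmed text for equality so that only the bare placeholder is removed:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: a bare placeholder, a real review mentioning the phrase, a normal review\n",
"toy = pd.DataFrame({'review_text': [\n",
"    ' Early Access Review',\n",
"    'Much better than its Early Access Review scores suggest',\n",
"    'Great game',\n",
"]})\n",
"# Exact comparison after trimming whitespace flags only the bare placeholder\n",
"is_placeholder = toy['review_text'].str.strip() == 'Early Access Review'\n",
"toy[~is_placeholder]"
],
"metadata": {
"collapsed": false
}
},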
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 135,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_text\n10/10 6050\n. 4769\nGreat game 3662\ngreat game 3554\nGreat game! 2440\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 135,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset['review_text'].value_counts().nlargest(5)"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 136,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_score\n 1 4341259\n-1 897431\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 136,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset[\"review_score\"].value_counts()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### The dataset is still fairly large, so I keep only a 3% random sample to speed up training"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 137,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_score\n 1 130102\n-1 27059\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 137,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset = dataset.sample(frac=0.03)\n",
|
||
"dataset[\"review_score\"].value_counts()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
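{
"cell_type": "markdown",
"source": [
"Without a fixed seed, `sample(frac=0.03)` draws a different subset on every run, so the counts above will not reproduce exactly. A sketch of a repeatable, class-preserving draw that could be used in place of the plain `sample` call; `SEED` and `subset` are hypothetical names and the seed value is arbitrary:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"SEED = 42  # arbitrary, hypothetical seed\n",
"# Sample 3% within each label so the class proportions are preserved\n",
"subset = dataset.groupby('review_score').sample(frac=0.03, random_state=SEED)\n",
"subset['review_score'].value_counts()"
],
"metadata": {
"collapsed": false
}
},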
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### Dropping unneeded columns"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 138,
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"Index: 157161 entries, 1260671 to 6268511\n",
|
||
"Data columns (total 2 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 review_text 157161 non-null object\n",
|
||
" 1 review_score 157161 non-null int64 \n",
|
||
"dtypes: int64(1), object(1)\n",
|
||
"memory usage: 3.6+ MB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"dataset = dataset.drop(columns=[\"app_id\", \"review_votes\", \"app_name\"])\n",
|
||
"dataset.info()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### Splitting into train/test/validation sets"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 139,
|
||
"outputs": [],
|
||
"source": [
|
||
"from sklearn.model_selection import train_test_split\n",
|
||
"\n",
|
||
"train, test_and_valid = train_test_split(dataset, test_size=0.2)\n",
|
||
"test, valid = train_test_split(test_and_valid, test_size=0.5)"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
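{
"cell_type": "markdown",
"source": [
"With imbalanced labels, a purely random split can leave the three parts with slightly different class ratios. A hedged sketch of a stratified, seeded 80/10/10 split; the names `train_s`, `holdout_s`, `test_s`, `valid_s`, and `SEED` are hypothetical and chosen so that the variables used elsewhere in the notebook are not overwritten:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"SEED = 42  # arbitrary, hypothetical seed\n",
"# 80% train, 20% held out, keeping the label ratio the same in both parts\n",
"train_s, holdout_s = train_test_split(\n",
"    dataset, test_size=0.2, stratify=dataset['review_score'], random_state=SEED\n",
")\n",
"# Split the held-out part evenly into test and validation, again stratified\n",
"test_s, valid_s = train_test_split(\n",
"    holdout_s, test_size=0.5, stratify=holdout_s['review_score'], random_state=SEED\n",
")\n",
"print(len(train_s), len(test_s), len(valid_s))"
],
"metadata": {
"collapsed": false
}
},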
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
|
||
"### Downsampling the positive class in the training set"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 140,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "review_score\n 1 104113\n-1 21615\nName: count, dtype: int64"
|
||
},
|
||
"execution_count": 140,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"train[\"review_score\"].value_counts()"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 141,
|
||
"outputs": [],
|
||
"source": [
|
||
"dataset_positive_reviews = train[train[\"review_score\"]==1]\n",
|
||
"dataset_negative_reviews = train[train[\"review_score\"]==-1]\n",
|
||
"\n",
|
||
"dataset_positive_reviews = dataset_positive_reviews.sample(len(dataset_negative_reviews))\n",
|
||
"train = pd.concat([dataset_positive_reviews,dataset_negative_reviews])\n",
|
||
"train = train.sample(frac=1.0) # Shuffle the order of the examples\n",
|
||
"train = train.reset_index(drop=True)"
|
||
],
|
||
"metadata": {
|
||
"collapsed": false
|
||
}
|
||
},
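{
"cell_type": "markdown",
"source": [
"The same balancing step can be written without referring to a specific class size. A minimal sketch, assuming pandas 1.1 or newer for `groupby(...).sample`; the helper `downsample_to_minority` and its parameters are hypothetical, not part of the original notebook:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def downsample_to_minority(frame, label_col='review_score', seed=42):\n",
"    # Size of the smallest class in the frame\n",
"    n_min = frame[label_col].value_counts().min()\n",
"    # Draw the same number of rows from every class\n",
"    balanced = frame.groupby(label_col).sample(n=n_min, random_state=seed)\n",
"    # Shuffle the rows and rebuild a clean 0..n-1 index\n",
"    return balanced.sample(frac=1.0, random_state=seed).reset_index(drop=True)\n",
"\n",
"# Every class ends up with the same number of rows\n",
"downsample_to_minority(train)['review_score'].value_counts()"
],
"metadata": {
"collapsed": false
}
},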
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 142,
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": "<AxesSubplot:ylabel='count'>"
|
||
},
|
||
"execution_count": 142,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": "<Figure size 640x480 with 1 Axes>",
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZkAAAGFCAYAAAAvsY4uAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAgYklEQVR4nO3deXTV9Z3/8dfNSmICGEIWVllkEcOiWKW1iKgV3DpdsKBFOljsaPuzrR2lnV/dZtpxbGvrjLXHVutYFbWWutbWKghKUUAQBAQCyBK2JISQkH27d/64guwkN/d739/l+TiHE3LxeF5y4/f1/Xw+9/v5hCKRSEQAADggyToAAMC/KBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkAgGMoGQCAYygZAIBjKBkggV544QV94QtfUI8ePRQKhbRq1SrrSICjKBkggerq6nThhRfq/vvvt44CJESKdQAgSKZPny5J2rZtm20QIEEYyQAAHEPJAAAcQ8kADpkzZ46ysrIO/Vq0aJF1JCDhWJMBHHLNNdfo/PPPP/R97969DdMANigZwCHZ2dnKzs62jgGYomSABKqsrFRJSYl2794tSSouLpYkFRQUqKCgwDIa4AjWZIAEeuWVVzRmzBhdeeWVkqSpU6dqzJgxeuSRR4yTAc4IRSKRiHUIAIA/MZIBADiGkgEAOIaSAQA4hpIBADiGkgEAOIaSAQA4hpIBADiGkgEAOIaSAQA4hr3LgHaoqm9W2YEmldc0Hvpa09iq1rawWsMRtYUjag1HDn0fiUhJoZBSk0NKTgopJSmk5KQkpSaHlJqcpO6Zqcrv2kV52enRr13TlZnG/47wH36qEWjVDS3aU90QLY4DjSqviX49WCTlNU0qr2lSc2vY8SzZ6Snq2TX90+L55GvPw77v1T1DXVKTHc8CxAt7lyEw9tc1a82u6uivndGvu6oarGN1SEpSSIPzslTUu5uK+nRTUe9uGl7YleKBa1Ey8CU/FEp7HV48I/t009kUD1yEkoHnNba0afm2/fpwZ5XW7qrW6p3+LZT2Olg8Iz8Z7Yzpd7pG9OqqUChkHQ0BQ8nAkypqm/TW+nK9ub5M/9hUoYaWNutIrpffNV0Th+XrsrPy9NlBuYx0kBCUDDxjY1mN3lxXpvnry7RqR5XC/OTGLCM1WReemavLhudr4vA85WalW0eCT1EycK3WtrCWbavUvHXlmr+hTNv31VtH8qWkkDS6b3ddMjxfl52VryH52daR4COUDFzlQGOLFhbv1bx1ZVpYXK4Dja3WkQKnf49MXTIsX5cOz9NnBuQoJZlnthE7SgausLJkv55eUqK/rN6tpgQ8k4L2yctO19Tz+uq68/uroFsX6zjwIEoGZhpb2vTyql16ekmJ1uyqto6Dk0hJCumS4XmafsEZ+tzgHnxKDe1GySDhtlbU6an3tuvPH+xUdUOLdRx00MCep+n68/vrq+f2UbeMVOs4cDlKBgnRFo7ozXVlenrJdi3+uEL81HlfRmqyrhnVS9PH9dfZvbtZx4FLUTJwVHlNo55btkPPLivRnupG6zhwyOi+3TX9gv66alSh0lN4/gafomTgiOLSGj301ib9/aNStbTxIxYUOaelaep5fXXT+IHqnplmHQcuQMkgrnbur9cv39yol1bu4mHJAMvukqJ/uWiQZn5ugDLSGNkEGSWDuNhX26RfL9isOUtK1NzGR5ARlZedrlsvOVNTz+vL8zYBRcmgU+qaWvXooi16bNFW1Tbx4CSOb0DuabrtsiG6amQhH38OGEoGMWluDWvO0u16eMFmVdQ2W8eBRxT17qbbLx+q8UN6WkdBglAy6JBwOKKXP9ylX765UTsqg72dPmL32UE9NHvSMI3q2906Ch
xGyaDd3tpQpp+9XqwNpTXWUeATVxQV6F+/MFQDe2ZZR4FDKBmc0raKOv3ohTV6b8s+6yjwoZSkkL5+QX/dMWmoMtNSrOMgzigZnFAkEtH/Lt6mn/+9mEPB4Lh+OZn62VdH6oKBPayjII4oGRzXtoo63TF3tZZtq7SOggAJhaQbLuiv2ZOHMarxCUoGR2D0AjdgVOMflAwOYfQCN2FU4w+UDBi9wNUY1XgbJRNwjF7gBYxqvIuSCShGL/AiRjXeQ8kEUEVtk7495wMt3croBd4TCkmzPj9QsycNU3IS+6C5HSUTMGt3VeumJ5drNweIwePGD+mph6aN4Qhol6NkAuQvq3fr9j+tZnoMvjEw9zQ9OmOsBrEtjWtRMgEQiUT0wBsb9esFm62jAHGX3SVFD00bowlD86yj4DgoGZ+ra2rV9/+4Sm+sK7OOAjgmKST9cPIw3TR+kHUUHIWS8bEdlfX65h+Wq7iMXZMRDF8e01v3faVI6Skc+ewWlIxPvftxhb495wPtr2+xjgIk1Oi+3fW76ecqr2sX6ygQJeNLT763Tf/+6jq1hnlrEUz5XdP1u+ljORTNBSgZH2lpC+uulz/Ss8tKrKMA5tJTkvRfXynSl8b0sY4SaJSMT1Q3tGjWk8u1jAcsgSPcPGGQZk8aZh0jsCgZH6isa9b03y/VR7sPWEcBXOnrF/TTf3zxbIVC7BCQaJSMx+2tadL1jy3RxrJa6yiAq005t4/u/8pIJbEVTUJRMh5WWt2o6x5boi1766yjAJ7wxdG99MtrR7PnWQJRMh61c3+9rnt0qUoq662jAJ4y+ewC/c+0MUpNTrKOEgiUjAeV7KvXtEeXaFdVg3UUwJMuHZ6n31x/rtJSKBqn8TfsMbuqGigYoJPmrS/Xd575QK1tYesovkfJeEjZgUZdR8EAcfHGujJ994+r1MZDy46iZDxib02Tpj26RNv3sQYDxMtrq/fo9j99qDBF4xhKxgP21zXr648t5VNkgANeWLlL//+lNWJ52hmUjMtVN7Ro+uNL2UkZcNCzy3bo3lfXWcfwJUrGxVrbwrplzgqt3cWT/IDTnnh3m3779sfWMXyHknGxn7y2Xos377OOAQTG/a9v0ILicusYvkLJuNRzy0r0xLvbrGMAgRKOSLc+u1Kby9mmKV4oGRdavq1Sd738kXUMIJBqGls168nlqm7gwL94oGRcZndVg/7l6RVq5iExwMzWijp955kPeIYmDigZF2lobtOsJ5erorbZOgoQeIs2Veg//7reOobnUTIu8q9zP+RMGMBFfv+PrZq7Yqd1DE+jZFziofmb9NrqPdYxABzl315cow9K9lvH8CxKxgX+/lGpfjlvo3UMAMfR3BrWt55aodLqRusonkTJGCsurdFtf1wldrQA3GtvTZNuemq5GlvarKN4DiVjaH9ds2Y9uVx1zfzgAm63eme1Zv95tXUMz6FkDH3/+VWcbAl4yMurduvJ97ZZx/AUSsbI8+/v0MLivdYxAHTQf/1tg0o4cqPdKBkDe6ob9B+vseMr4EX1zW26fe6HHA3QTpSMgR/+eY1qGlutYwCI0dKtlXryve3WMTyBkkmw59/fobc3Mk0GeN39rzNt1h6UTAIxTQb4B9Nm7UPJJBDTZIC/MG12apRMgjBNBvgT02YnR8kkANNkgH8xbXZylEwCME0G+BvTZidGyTiMaTIgGJg2Oz5KxkFMkwHBwbTZ8VEyDrrnlY+YJgMCZOnWSj2/fId1DFehZBzyQcl+/f2jMusYABLsV29u4kiAw1AyDrn/bxusIwAwUHqgUX94d5t1DNegZBywoLhcS7dWWscAYOQ3Cz9WdUOLdQxXoGTiLBKJ6GevF1vHAGCouqFFj7z9sXUMV6Bk4uyVD3dr/Z4D1jEAGHti8TaVH2i0jmGOkomjlrawHnhjo3UMAC7Q0NKmB+dvso5hjpKJo2eWlnCcMoBDnn9/h7ZW1FnHMEXJxEl9c6seemuzdQwALtIajugXbw
R7jZaSiZPHFm1VRW2TdQwALvPXNXu0Zme1dQwzlEwcVNY169F3tljHAOBCkUh0X7OgomTi4OEFm1XTxPYxAI7vH5srtHhzhXUME5RMJ+2qatBTS9jiG8DJBXU0Q8l00mOLtqi5NWwdA4DLrd5ZrXcCeOwHJdMJDc1tmrtip3UMAB4RxFkPSqYTXlq1i638AbTbWxvKtbuqwTpGQlEynfB0AO9KAMSuLRzRM0tLrGMkFCUToxXb9+uj3exRBqBjnnt/h1ragrOOS8nEiFEMgFhU1Dbpb2tLrWMkDCUTg8q6Zr22Zo91DAAe9fR7wblJpWRi8Mf3d/CxZQAxW7atUsWlNdYxEoKS6aBwOKJnlgXnLgSAM55ass06QkJQMh20cGO5dlQG6yOIAOLvpZW7VRuA7agomQ56KkBzqQCcU9vUqhc/8P/D3JRMB+yorNfbAdwWAoAznl7i/2dmKJkOeHrpdoUj1ikA+EVxWY2WbtlnHcNRlEw7hcMR/Zl9ygDE2R+X77CO4ChKpp1WlOxXRW2zdQwAPvPWhnK1+XiKhJJpp3nryqwjAPChqvoWvb+t0jqGYyiZdnpzPSUDwBl+vomlZNphy95abdlbZx0DgE/N8/FNLCXTDn7+AQBgb9u+em0u9+c2M5RMO8xbX24dAYDP+fU6Q8mcQlV9s1Zs328dA4DP+XVdhpI5Bb9/vBCAO3xQsl+Vdf57TIKSOQXWYwAkQjgizffh9YaSOYnm1rDe2VhhHQNAQPjxppaSOYn3tuwLxFbcANxh0aYKNbW2WceIK0rmJPy6EAfAneqb2/TuZn9tmEnJnIQf50cBuJvfdhehZE5gy95a7a5utI4BIGAWb/bXOjAlcwJrdlVbRwAQQNv31au6ocU6RtxQMiewZiclA8DGRz66yaVkToCRDAArfrr+UDLHEYlE9NHuA9YxAATUakrG37ZU1PF8DAAzaykZf/PTGwzAe/y0+E/JHAeL/gCs+WXxn5I5Dj8tugHwJr9chyiZo0QiEa1j0R+AMUrGp7ZU1KmGRX8AxigZn2LRH4Ab+GXxn5I5Cov+ANzCD4v/lMxR/DJEBeB9frgeUTJH2VBaYx0BACT543pEyRymsaXNF3OgAPyh1AfHjVAyhyk74P03FIB/lNV4/5pEyRymvKbJOgIAHLL3gPevSZTMYRjJAHCTmqZW1Td7+7m9mEpm4sSJqqqqOub1AwcOaOLEiZ3NZKbcB3cNAPzF69elmEpm4cKFam5uPub1xsZGLVq0qNOhrDBdBsBtvH5dSunIP7x69epDv1+3bp1KS0sPfd/W1qbXX39dvXv3jl+6BCtnugyAy3h9Gr9DJTN69GiFQiGFQqHjTotlZGTooYceilu4RPP6HQMA//H6dalDJbN161ZFIhENHDhQy5YtU8+ePQ/9WVpamvLy8pScnBz3kIni9TsGAP7j9RmWDpVM//79JUnhcNiRMNa8fscAwH+8fl3qUMkcbtOmTVqwYIHKy8uPKZ277rqr08ESjaf9AbiR12dYYiqZRx99VDfffLNyc3NVUFCgUCh06M9CoZAnS2avx+8WAPhTIEcyP/nJT/TTn/5Us2fPjnceM16/WwDgT16/NsX0nMz+/fs1ZcqUeGcx5fW7BQD+VNPYqsaWNusYMYupZKZMmaI33ngj3llM1XHkMgCX8vL1KabpssGDB+vOO+/UkiVLVFRUpNTU1CP+/NZbb41LuERqC0esIwDAcXn5+hSKRCIdTj9gwIAT/wtDIW3ZsqVToSw8tWS77nxprXUMADjG4h9OVO/uGdYxYhLTSGbr1q3xzmGurc2fz/4A8L62Nu+OZNjq/xOtHh6OAvC3Vg8/AB/TSGbmzJkn/fPHH388pjCWvDznCcDfvHx9iqlk9u/ff8T3LS0tWrt2raqqqjx7ngwjGQBu5eXrU0wl8+KLLx7zWjgc1s0336xBgwZ1OpSF6envaGbv31vHAIBjJIUel9TVOkZMYvp02YkUFxdrwoQJ2r
NnT7z+lYnzj19J8+6xTgEAx7r5PSn/LOsUMYnrwv/HH3+s1laPPjSUFPNeoQDgLA9fn2JKfttttx3xfSQS0Z49e/Taa69pxowZcQmWcEmpp/5nAMBCcsBKZuXKlUd8n5SUpJ49e+qBBx445SfPXCvJu4etAfC5oI1kFixYEO8c9jz8JgLwOQ9fnzqVfO/evSouLpYkDR069IjjmD3Hw28iAJ/z8PUppoX/uro6zZw5U4WFhRo/frzGjx+vXr166cYbb1R9fX28MyZGSrp1AgA4vuQ06wQxi6lkbrvtNr399tt69dVXVVVVpaqqKr388st6++239YMf/CDeGRPjtFzrBABwrOQ0KaO7dYqYxfScTG5urubOnasJEyYc8fqCBQt07bXXau/evfHKlzjl66XfXGCdAgCO1K2f9P011iliFtNIpr6+Xvn5+ce8npeX593psqxj/3sAwFy2t69NMZXMuHHjdPfdd6ux8dOzpxsaGnTvvfdq3LhxcQuXUJk5UkoX6xQAcKTsAusEnRLTRxYefPBBTZo0SX369NGoUaMkSR9++KHS09O9fSxzVp5UVWKdAgA+lRXAkikqKtKmTZs0Z84cbdiwQZI0bdo0XX/99crI8ObpbZKk7EJKBoC7BHEkc9999yk/P1+zZs064vXHH39ce/fu1ezZs+MSLuFYlwHgNh4vmZjWZH77299q2LBhx7w+YsQIPfLII50OZcbjbyYAH/L4dFlMJVNaWqrCwsJjXu/Zs6c3t/k/iJIB4DYevy7FVDJ9+/bV4sWLj3l98eLF6tWrV6dDmfH4HQMAH/J4ycS0JjNr1ix973vfU0tLy6HjlufPn6877rjDu0/8S55/MwH4TFKqlNnDOkWnxFQyt99+u/bt26dbbrlFzc3NkqQuXbpo9uzZ+tGPfhTXgAlFyQBwk6x8KRSyTtEpnTp+uba2VuvXr1dGRobOPPNMpad7fJPJ+krpZwOsUwBAVO+x0qz51ik6pVP7R2dlZem8886LVxZ7mTnRzejamq2TAIAvZldiWvj3tZxB1gkAIKqH969HlMzReo22TgAAUYWjrRN0GiVzNB+8qQB8wgc3vZTM0XzwpgLwgS7dpJyB1ik6jZI5WkGRFOKvBYCxwlHWCeKCq+nR0k6TcodYpwAQdD6ZuqdkjqfXGOsEAILOJ9chSuZ4fHIHAcDDfLI+TMkcj0/eXAAe5ZNFf4mSOT4W/wFY8smiv0TJHB+L/wAs+WjKnpI5ER+9yQA8xkdT9pTMifjkkx0APMhH1x9K5kR8dCcBwEN8tOgvUTInVjhKSulinQJA0PT5jHWCuKJkTiQ1QxpwkXUKAEEzdJJ1griiZE5m6GTrBACCZugV1gniipI5maGTJXn7fG0AHlI4SurayzpFXFEyJ5Nd4KtPeQBwOZ+NYiRK5tR8+KYDcCkfTtFTMqfis0U4AC7VtbevtpM5iJI5lYIiqVs/6xQA/G6IP29oKZn2YDQDwGk+nZqnZNrDh/OkAFwkLVsaMN46hSMomfY44/NSelfrFAD8atDFUkqadQpHUDLtkZwqDb7EOgUAv/LpVJlEybSfj38IABgKJUtDLrdO4RhKpr3OvExKSrFOAcBv+n5GysyxTuEYSqa9Mk6XBl9qnQKA3xRNsU7gKEqmI8beaJ0AgJ+kZUsjv2adwlGUTEcMvlQ6/QzrFAD8YtRUKT3LOoWjKJmOSEqSxs60TgHAL877pnUCx1EyHTVmOidmAui8/hdKecOsUziOkumozBxpxJesUwDwuvOCscZLycQiAENcAA7KKpCGX22dIiEomVj0GSsVjrZOAcCrzrkhupNIAFAysQrIUBdAnCWlSGP/2TpFwlAysSqaInXpZp0CgNcMmSR17WWdImEomVilZkijr7dOAcBrAramS8l0xtgbJYWsUwDwih5nSgMnWKdIKEqmM3IHSwMvsk4BwCvGzpRCwboxpWQ664JbrBMA8IL0btLo66xTJBwl01lDLpf6XmCdAo
Dbfe5WKaO7dYqEo2Ti4dJ7rBMAcLOsgsDOelAy8dB/nHSmf0+2A9BJF90upWVapzBBycTLpXdLIf46ARwlZ6B0zjesU5jhqhgv+SOkomutUwBwm4k/lpKDe3Q7JRNPF/+blJxmnQKAWxSOkkZ82TqFKUomnk7vz6FmAD51yd2Bey7maJRMvI2/PXpuN4BgGzBeGnyJdQpzlEy8nZYrjfu2dQoA1ni0QRIl44zPfkfKzLVOAcDK8Guk3udap3AFSsYJ6dnRaTMAwRNKli65yzqFa1AyThk7U+rezzoFgEQbc72Ue6Z1CtegZJySkiZdfp91CgCJlHG6dPGPrVO4CiXjpOFXSWd/1ToFgESZ/DMpO986hatQMk674ufSaXnWKQA4beiV0kh2/TgaJeO0zBzpql9ZpwDgpIzT+f/8BCiZRGDaDPA3pslOiJJJFKbNAH9imuykKJlEYdoM8B+myU6Jkkkkps0Af2Ga7JQomURj2gzwB6bJ2oWSSTSmzQDvY5qs3SgZC0ybAd7GNFm7UTJWmDYDvIlpsg6hZKxk5khfeiS6YysAb+jaR7r6v61TeAolY2nwJdJl91qnANAeqZnStGekrJ7WSTyFkrH22f8njZpmnQLAqXzxYalwlHUKz6Fk3ODq/5Z6j7VOAeBEPv8D6ewvW6fwJErGDVLSpalzpOxe1kkAHG3oFdLEO61TeBYl4xbZBdGiSelinQTAQXlnSV/+nRQKWSfxLErGTXqfI13za+sUACQpI0ea+oyUnm2dxNMoGbcZOUX63HetUwDBlpQiTXlCyhlgncTzKBk3uuQe6czLrVMAwXX5fdLAi6xT+AIl40ZJSdJXHpNyh1onAYLnnBnS+TdZp/ANSsatunSVpj0rdelunQQIjn6fla58wDqFr1AybtZjkHTtH6TkdOskgP/lDJS+9pSUnGqdxFcoGbcbOCG6AJnEDz7gmO79pBmvSqflWifxHUrGC4ZdEV2jYTNNIP6ye0k3vCJ162OdxJcoGa8Y8U+f7NrMWwbETVZ+dATDR5UdwxXLS0Ze+8k24zx9DHRaZg/phpel3MHWSXyNkvGac26IHngGIHZdukvTX5Lyhlsn8b1QJBKJWIdADJY/Lv3lNkm8fUCHHBzBFBRZJwkESsbLVs6RXvmOFAlbJwG8ISs/usifN8w6SWBQMl63Zq704rekcKt1EsDduvaOLvL3GGSdJFAoGT9Y/6o0d6bU1mydBHCng8/BnH6GdZLAoWT8YuPfpedvkFobrZMA7pIzSJrBczBWKBk/2bVCeu56qWaPdRLAHQaMl6b8QcrMsU4SWJSM39SUSs9dFy0cIMg+c1N0y/7kFOskgUbJ+FFLo/Tqd6XVz1knARIvKVW68hfSud+wTgJRMv62+H+keXfzEWcER2au9LWnpf7jrJPgE5SM3216U5p7o9RUbZ0EcFZBkTT1Wal7X+skOAwlEwQVm6Rnp0r7NlsnAZxx1helf3pESsu0ToKjUDJB0VAVfZbm4/nWSYA4CkkTfiRddIcUYuNYN6JkgiTcJr15l/Ter62TAJ2XlhU9/mL41dZJcBKUTBCtekZ69XtSW5N1EiA23ftJ056T8kdYJ8EpUDJBtWe19NItUtka6yRAxxRdK02+nwcsPYKSCbK2FumdX0iLHpDCLdZpgJPLypeuejB6HDk8g5IBoxq438ivRUcvGadbJ0EHUTKIYlQDN2L04nmUDI7EqAZuwejFFygZHItRDSxlFUhX/YrRi09QMjgxRjVINEYvvkPJ4OTaWqIjmnd+wagGzskqkK5+UBo62ToJ4oySQfuUb5Dm/7tU/Jp1EvhJSoZ0/rekC78vZXS3TgMHUDLomB3LpHn3SNsXWyeBlyWlSGO+Ll30Q6lroXUaOIiSQWw2vhEd2bBegw4JRXdMnninlDvYOgwSgJJB7CIRac2fpLd+IlVtt04DtxtwkXTpPVLvc6yTIIEoGXRea7O04n+ld34u1e21TgO3KRwtXXq3NG
iidRIYoGQQP0210nsPS+8+JDXXWKeBtZxB0sQfSyO+xFkvAUbJIP7q9kmLfiG9/3uOEwii7MLoIWJjbpCSU6zTwBglA+fUVUgfPBmdSqsqsU4Dp/UbJ533TWn4NVJKmnUauAQlA+eFw9KmN6T3H4se/xwJWydCvKRlSSOvjZYLB4jhOCgZJFblVmn549LKp6WGSus0iFXP4dJ5N0qjpkrp2dZp4GKUDGy0NklrX4iObnYtt06D9khKlYZfFR21nHGhdRp4BCUDe7tXRctm7Z+llnrrNDha197Sud+QzpkhZedbp4HHUDJwj4YqafXz0oZXpe3vSuFW60TB1aWbNPhSacSXo5tWJiVbJ4JHUTJwp4YqafM8qfiv0a+N1daJ/K97f2noFdFS6f85Pn6MuKBk4H5tLdGRTfHfoqXDFjZxEpL6jJWGTIqWS/5Z1oHgQ5QMvKdsXbRsiv8m7VohiR/hdkvNlAZOiI5WhkySsvKsE8HnKBl4W225tPH16JTa7pU89Hm0pJTox437jJWGXB4tmNQM61QIEEoG/lJfGS2bPauin1rbsyo4xXOwUHqNim5K2WuMlH+2lNrFOhkCjJKB/9VXflI6K/1TPMcUyjnRJ+4pFLgMJYNgOlg8pWuk6l1SzR6ptiz6tabMHRt7pneVsgukrPzoppPZ+dLpZ0iFYygUeAYlAxxPfeUnpVMa/VVb+unvD37feCD6LM/hv463L1soOTrySEqJfiw4KVXK7BEtkIO/sg7//SelkpaZ+P9uIM4oGSCewuFPyybpk3LhLBUEGCUDAHBMknUAAIB/UTIAAMdQMgAAx1AyAADHUDIAAMdQMgAAx1AyAADHUDJAjN555x1dffXV6tWrl0KhkF566SXrSIDrUDJAjOrq6jRq1Cg9/PDD1lEA1+J8VSBGkydP1uTJk61jAK7GSAYA4BhKBgDgGEoGAOAYSgYA4BhKBgDgGD5dBsSotrZWmzdvPvT91q1btWrVKuXk5Khfv36GyQD34NAyIEYLFy7UxRdffMzrM2bM0BNPPJH4QIALUTIAAMewJgMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHAMJQMAcAwlAwBwDCUDAHDM/wEFAHkyHiEV8AAAAABJRU5ErkJggg==\n"
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"train[\"review_score\"].value_counts().plot(kind=\"pie\")"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 143,
"outputs": [
{
"data": {
"text/plain": "<AxesSubplot:ylabel='count'>"
},
"execution_count": 143,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": "<Figure size 640x480 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZkAAAGFCAYAAAAvsY4uAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAoaklEQVR4nO3deXxU1cH/8e9kD0lIQjaSIEsI+w5uqEWwVbGuVXFHqv3Zah93H6H2qVqX2hetbW2tSmsf6lr3Bfu4L4CAImvYQZYkJEASspM9mZnfHyiVPUzmzrn3zuf9evGSJLxmvpIw3zn3nHuOx+/3+wUAgAUiTAcAALgXJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwDCUDALAMJQMAsAwlAwCwTJTpAICdtXX4VNvcprqmdtU1t6v2m//Wt7Srqc2rlnavmtq8am73qqXNq3afX9ERHsVERez9FRmh6G/+GxMVodioCEVH/udrCbFRyuoeq+zkeGUkxSoywmP6fxkIKkoGYa+qoVXF1U3aXtWk4qomFVc1qrh67+8rG1pDliMywqOMxFj1TI5Tz+5x6pkcp+zkuH0fZyfHKzslTtGRXICAc3j8fr/fdAjAal6fXxt21WvtjjoVVjXuK5Tt1U1qaO0wHa/TYiIjlJ+ZqGE53ff+yk3WkOzuSozl/SLsiZKBK1U2tGpFcY1WltRqRXGN1uyoU1Ob13QsS3g8Ut+0BA3N6a6h2d+UT06yMpJiTUcDKBk4X4fXp/W76rVye61WbK/Riu01KqluNh3LuMykWI3rk6pT8tN1Wn66+qUnmI6EMETJwJG2VzXpo/Vl+nRDhVaW1Kil3Wc6ku3lJMfplPx0nZqfpgkDMpSWyEgH1qNk4Ah+v1+rSuv08foyfby+XF+XN5iO5GgRHmlErxRNGpShMwZnakRusjweVrYh+CgZ2FZrh1dfbK3Sx+vL9emGcpXXh26lV7jJSIrVpEEZunB0rsbnpSmCpdQIEkoGttLQ2rFvtDJ/0241unSy3s5ykuN04ZhcXTI2V/mZSabjwOEoGdjCksJqvbK0RO+v3eXaVWBONLJXsi4ek6sLRueqR0KM6ThwIEoGxlTUt+i15aV6fXmpCisbTcfBEURHenT6wAxdPLaXvj8kU7FRkaYjwSEoGYTcF1sq9dyXxfpkQ7k6fPz4OU1yfLTOG5mtaaf01cAsLqfhyCgZhER9S7veWF6qFxYXa+tuRi1u4PFIEwdm6MbT++ukvDTTcWBTlAwsVVHfolnzt+nlpduZa3GxMb1T9LMJeTpraE9WpmE/lAwsUV7foqfmbdVLS7artYMbJcNFXnqCbpiQp4vH5jJvA0mUDIKMcoG0976b607tq2tO7qPucdGm48AgSgZBUVbXoifnbdHLS0vURrngG4mxUbrqpN666fT+SmUJdFiiZNAlu+qa9eTcrXplGeWCw+seF6Vbvz9A007py3k4YYaSQUCqGlr12Ceb9crSErV5KRd0Tt+0bvrFOUM0eXhP01EQIpQMjonP59eLXxXr0Y++Vl1zu+k4cKiT83roV+cO1fDcZNNRYDFKBp22cnuN7p2zVmt31JuOAheI8EgXj+2l6WcPUmb3ONNxYBFKBkdV09immR9s1CvLSsRPC4KtW0ykfjahv346IU/xMSx7dhtKBofl8/n18tIS/f7Djapp4tIYrJWdHKdfnTtU547MNh0FQUTJ4JDWlNbpV3PWalVJrekoCDOTh/XUgxcNU2
YSl9DcgJLBfupb2jXz/Y16acl2sXclTEnpFq17zx2qS8b1Mh0FXUTJYJ9lRdW67eUC7ahtNh0FkCRNGpSh3148Uj2TGdU4FSUDeX1+/eXTzfrr3C3yMnyBzSTHR+uhi4brglE5pqMgAJRMmCutadLtLxdoWXGN6SjAEV0wKkcPXTRcyfHsheYklEwY+7/VO/XLN9eovqXDdBSgU3KS4/TolFE6JT/ddBR0EiUThhpbO3T/O+v0+vJS01GAY+bxSD+f2F93nTmIs2scgJIJM6tLa3XbywUqrOR0SjjbxEEZ+vMVY7h8ZnOUTJjw+/36++fb9OhHm9Tu5VsOd+iXnqC/Tx2nAVlJpqPgMCiZMNDc5tV/v7ZK767ZZToKEHSJsVF6dMoodna2KUrG5XbVNeuG55axqSVczeORbpmUrzvOHCiPh3kaO6FkXGzl9hr99Pnl2r2n1XQUICR+MCRTf7p8tJI48tk2KBmXmlOwQ9NfX61WTqtEmMnLSNDT1x6v/hmJpqNAlIwr/fWzzXr0o69NxwCMSYqN0mNXjNb3h2SZjhL2KBkX6fD6dO+ctXppSYnpKIBxkREe/f7Skbp4LJtsmkTJuERja4d+/uIKzf96t+kogG14PNKDFw7X1JP7mI4StigZF6hsaNW02Uu0bicryIBD+cU5g3Xj6f1NxwhLlIzDVTa06qqnF+vr8gbTUQBb+69J/XX32YNNxwg7lIyDUTDAsfnxKX11//lDuZcmhCgZh6pqaNWVFAxwzKaM66WZl4xkc80QoWQcqKqhVVc9/ZU2le8xHQVwpHNHZuuxy0crOjLCdBTXo2QcpqqhVVf/4yttLKNggK6YNChDT10zTnHRkaajuBo17iDVjW0UDBAkczft1k0vLFeHl10xrETJOER1Y5uuenoxBQME0dxNu3XPm2tMx3A1SsYBKBjAOq8tL9XvP9xoOoZrUTI219zm1Y//uYSCASz0xNyteu7LItMxXImSsTG/36/bX1mp1aV1pqMArvfrd9bpfQ72CzpKxsZmfrBJH64rNx0DCAs+v3TbKwX6aluV6SiuQsnY1KtLSzRr/lbTMYCw0tbh0w3PLdMmLk8HDSVjQ19urdL/vM2KF8CE+pYOTZu9RDtrm01HcQVKxma27W7QTS8uV7uXe2QBU8rqW3Tt7CWqbWozHcXxKBkbqW1q00+eXabapnbTUYCwt6WiQTf/a6V8Pt7wdQUlYxPtXp9ufGG5CisbTUcB8I2FWyr16EebTMdwNErGJn755hot3lZtOgaAAzw1f6s+WldmOoZjUTI28MyiQr22vNR0DACH4PdLd722iqsMAaJkDNuwq16PvM+WFoCd7Wnp0I3PL1dzm9d0FMehZA7w+eef6/zzz1dOTo48Ho/efvtty56rpd2r215eqbYOdoEF7G5T+R7d/85a0zEch5I5QGNjo0aNGqUnnnjC8uf6zbsbONkScJBXl5VqTsEO0zEcJcp0ALs555xzdM4551j+PJ+sL9fzi4stfx4AwfU/b63VyF4p6peeYDqKIzCSMaBiT4tmvLHadAwAAWho7dDN/1qh1g7mZzqDkgkxv9+vu15dpapG7iQGnGrdznrNfJ/7ZzqDkgmx/11YqAWbK03HANBFz3xRqOXFNaZj2B4lE0Lrdtbpdx/w7gdwA59f+sUbq1kdehSUTIg0t3l128sFavPyAwm4xeaKBj01jyM5joSSOUBDQ4MKCgpUUFAgSSosLFRBQYG2b9/epcf948ebtKWC5cqA2zwxd4u2VHD+zOF4/H4/W4x+x7x58zRp0qSDPj9t2jQ988wzAT3m+p31uuCvC9XBbq6AKx3fJ1Wv3TheHo/HdBTboWQs5vf7dfFTX2jl9lrTUQBY6KELh2nq+L6mY9gOl8ss9q8l2ykYIAz87oNNKqtrMR3DdigZC1U2tGomm18CYWFPa4d+9TZ7mx2IkrHQb9/bqPqWDtMxAITIJxvK9e7qXaZj2AolY5GV22v05krOiA
HCzf3vrFN9C0eof4uSsYDf79ev/71eLKkAwk9lQ6v+Np97Z75FyVjgjRU7tKqk1nQMAIb8c1GRdu9pNR3DFiiZIGto7dDMD5jsB8JZU5tXj3+22XQMW6BkgmzWvK28gwGgl5ZsV0l1k+kYxlEyQVTb1KZnvigyHQOADbR7/frDR2yIS8kE0f8uLFRDK0uWAez1zqqd2rCr3nQMoyiZIKlrbtczi4pMxwBgIz6/9PsPw3s0Q8kEyeyFhdrDKAbAAT7bWKGlRdWmYxhDyQRBfUu7/rmo0HQMADYVzttLUTJB8MyiIraPAXBYy4pr9NnGctMxjKBkuqihtUP/u5BRDIAj+9PH4XnfDCXTRc9+UaS6ZvYpAnBka3bUaVkYzs1QMl3Q2NqhfyzYZjoGAIf4ZxjeR0fJdMFzXxarpolRDIDO+XBtmXbVNZuOEVKUTIA6vD7NZkUZgGPQ4fPr+S+LTccIKUomQJ9sKGePMgDH7OWlJWpp95qOETKUTIBeWlJiOgIAB6pubNM7BTtNxwgZSiYApTVNWrB5t+kYABwqnBYAUDIBeGVpiXycegkgQBt21WvxtirTMUKCkjlGXp9fry0rNR0DgMOFy4a6lMwx+mxjhcrqW0zHAOBwH28o145a9y9npmSO0ctLtpuOAMAFvD6/XvrK/a8nlMwx2FXXrHlfM+EPIDjmrNphOoLlKJlj8OrSUnmZ8QcQJCXVzVpeXGM6hqUomU7y+fx6dRn3xgAIrjkF7h7NUDKd9OW2qrCYpAMQWu+t2aUOr890DMtQMp300boy0xEAuFBlQ5sWbqk0HcMylEwnfbKhwnQEAC717updpiNYhpLphPU767lUBsAyn2wod+0lM0qmEz7ZEJ5ncwMIjZqmdn1V6M5TMymZTvh4PSUDwFofrHXnvC8lcxRldS1au7POdAwALvfR+jL5/e67D4+SOYpPNpTLhd93ADZTXt+qFdtrTccIOkrmKJiPARAqi1y4lJmSOYLG1g59sTU8znwAYJ4bz5ihZI7g8693q63DncsKAdjPiu01rnvNoWSOgBswAYRSS7tPBSW1pmMEFSVzBF9udd/1UQD29pXLLplRMoexs7ZZO+s4ARNAaC0upGTCwtIid959C8DeVhTXumpehpI5DLcfJATAnprbvVpdWms6RtBQMoextIiSAWCGm5YyUzKH0NDaoU1l9aZjAAhTi7e553I9JXMIq0tr5WMrGQCGLC+uUbtLtv6nZA5hTSkbYgIwp7ndq3U73XE1hZI5hNU7KBkAZn1dvsd0hKCgZA7BTSs7ADjTlooG0xGCgpI5QE1jm0qqOWoZgFmbGcm4k1uugwJwts2MZNypsNId31gAzrajtlnNbV7TMbqMkjnA9uom0xEAQH6/O+ZlKJkDUDIA7GJzhfPnZSiZAxRXUTIA7MEN8zKUzAFKGMkAsInN5ZSMq1Q2tKrRBRNtANxhC5fL3IX5GAB2UlLTrJZ2Z7/xpWS+YzvzMQBsxOvzq7TG2TeHB1QyZ5xxhmpraw/6fH19vc4444yuZjKGkQwAu6lubDMdoUsCKpl58+apre3g//GWlhYtWLCgy6FMoWQA2E11Y6vpCF0SdSx/ePXq1ft+v379epWVle372Ov16oMPPlBubm7w0oUYl8sA2E2Vw0cyx1Qyo0ePlsfjkcfjOeRlsfj4eD3++ONBCxdqO2qdfe0TgPtUN4RRyRQWFsrv9ysvL09LlixRRkbGvq/FxMQoMzNTkZGRQQ8ZKvXN7aYjAMB+wmok06dPH0mSz+eOY0EP1NjWYToCAOynpimMSua7Nm/erLlz56qiouKg0rnvvvu6HCzUGls75PObTgEA+3P66rKASubpp5/WTTfdpPT0dPXs2VMej2ff1zwej2NLBgDspiqc5mS+9fDDD+s3v/mNZsyYEew8xuyhZADYkNNHMgHdJ1NTU6MpU6YEO4tRjGQA2FFYlsyUKVP00UcfBTuLUQ0tlAwA+2nz+rSnxbkrXwO6XJ
afn697771Xixcv1ogRIxQdHb3f12+99daghAulBkYyAGxqT0uHkuKij/4Hbcjj9/uPeU1Vv379Dv+AHo+2bdvWpVAmvLmiVHe+usp0DAA4yILpk3Rcj26mYwQkoJFMYWFhsHMYx5wMALvyOvj+Crb6/warywDYVYeDSyagkcz1119/xK/Pnj07oDAmtbS7cxcDAM7nO/ZZDdsIqGRqamr2+7i9vV1r165VbW2tY8+TiY7wHP0PAYABHd4wK5m33nrroM/5fD7ddNNN6t+/f5dDmRATxZVDBM/9/TboiraD/50AgYiImC2pu+kYAQl477IDRURE6M4779TEiRM1ffr0YD1syFAyCKa/7sjXtMRyRTTtNh0FbuBx7pxxUF9Zt27dqo4OZ/5lxEY594gC2E9VW7Te6zHVdAy4hce5b4IDGsnceeed+33s9/u1a9cuvfvuu5o2bVpQgoUaIxkE292FY3R2el9F1xWZjgKn8zj3TXBAJbNy5cr9Po6IiFBGRob+8Ic/HHXlmV1RMgi2Zm+k/tXtGk2re9h0FDhdRJiVzNy5c4Odw7hYSgYW+HXREF2WM0zxVetMR4GTRcaYThCwLr2y7t69WwsXLtTChQu1e7ezJzgZycAKfr9HT0ZcZToGnC4+xXSCgAX0ytrY2Kjrr79e2dnZmjBhgiZMmKCcnBz95Cc/UVNTU7AzhkRsJCUDazxe0k/1WSeZjgGnioiSYpNMpwhYQK+sd955p+bPn69///vfqq2tVW1trebMmaP58+frrrvuCnbGkIiNpmRgnYdaLzcdAU4Vl2I6QZcEtAtzenq6Xn/9dU2cOHG/z8+dO1eXXXaZIy+drSmt0/l/XWg6Blzsq7zZytr5iekYcJq0fOmW5aZTBCygt+9NTU3Kyso66POZmZmOvVwWH8NIBta6p+5H8jt4KSoMiU81naBLAnplHT9+vO6//361tLTs+1xzc7MeeOABjR8/PmjhQik9MdZ0BLjcZ1WpKso933QMOI3DL5cFtIT5scce0+TJk9WrVy+NGjVKkrRq1SrFxsY69ljmlG4xiomKUFsHuzHDOndU/FBvRX0gT0fL0f8wIDl+JBNQyYwYMUKbN2/Wiy++qI0bN0qSrrzySl199dWKj48PasBQykiM1Y7aZtMx4GIF9YlaM2CKRpY8bzoKnMLBy5elAEvmt7/9rbKysnTDDTfs9/nZs2dr9+7dmjFjRlDChVpmd0oG1ru5dJLmx86Rp7XedBQ4gcNHMgHNyfztb3/T4MGDD/r8sGHDNGvWrC6HMiUziXkZWG97c5wWZnKDJjopIcN0gi4JqGTKysqUnZ190OczMjK0a9euLocyJat7nOkICBO3F4+XNyHTdAw4QWo/0wm6JKCSOe6447Ro0aKDPr9o0SLl5OR0OZQpuSnOnU+Cs1S1Reu9VI4CQCf0cHbJBDQnc8MNN+j2229Xe3v7vuOWP/30U02fPt2xd/xLUm4qJYPQmV44WpM5CgBH4omUUnqbTtElAZXM3XffraqqKv385z9XW1ubJCkuLk4zZszQPffcE9SAodQrtZvpCAgjzd5Ivdhtqn5c95DpKLCr5FwpMtp0ii4JaFuZbzU0NGjDhg2Kj4/XgAEDFBvr7InzyoZWHf8w234gdDwev9bn/FbxVWtNR4Ed9TtdmvaO6RRd0qW9VBITE3XCCSdo+PDhji8Yae9d//HRbPuB0PH7PforRwHgcBw+HyN1sWTcqE8al8wQWk+U9FVd1smmY8COHL6yTKJkDjIku7vpCAhDD7deZjoC7IiRjPsMy6FkEHqvlfVUWc6ZpmPAblL7mk7QZZTMAYblJJuOgDD1i9qLOAoA/+GJlNIGmE7RZZTMAYbldpfHYzoFwtG86lQV5l5gOgbsImOwFOP8OWJK5gDd46J1HPfLwJDbK86RP4rtjSApd4zpBEFByRzC8FzmZWDG6vpErc6ZYjoG7CBnrOkEQUHJHALzMjDplpJJ8sfyRifs5TCScS
1WmMGk7c1xWpB5tekYMCkyRsoabjpFUFAyhzA8l5EMzLqtiKMAwlrWcCkqxnSKoKBkDiE9MVZZ3Z2/TQ6cq6Y9Su9yFED4csmlMomSOazhzMvAsLu3jVF7svPv+EYAct0x6S9RMoc1to+zz9WG87X6IvRC/DWmY8AEl6wskyiZw5owwNnnasMdHiwerOZ0d0wAo5O6pUuZQ0ynCBpK5jCG53ZXWoI7Jt7gXH6/R4+LowDCSv9JctO2I5TMYXg8Hn1vQLrpGICeLOUogLDS//umEwQVJXMEEwZyyQz28FDr5aYjICQ8Uv8zTIcIKkrmCCYMzHDTqBUO9npZFkcBhIOs4VJSlukUQUXJHEF6Yix3/8M2OAogDOS7axQjUTJHxSoz2MW86lRty73QdAxYyWXzMRIlc1SnMy8DG7mjYjJHAbhVdILUe7zpFEFHyRzF2D6pSoqNMh0DkLT3KIBV2ZeZjgEr9D3VNfuVfRclcxTRkREa3z/NdAxgn5tLJskfy7ZHrjPgLNMJLEHJdMLpg7hkBvsobYnV55ncoOkqnkhp6EWmU1iCkumEc4ZnKzqStcywj9uLxsub4K6lrmEt73Qp0Z1vZimZTuiREKOJgzjbA/ZR0x6l/0vhKADXGOHeI7cpmU66ZGyu6QjAfqYXjlZ7cp7pGOiqqDhp8HmmU1iGkumkMwZnKaVbtOkYwD6tvgg9141jmh1vwFlSnHtv+qZkOikmKkLnj8wxHQPYz8NFHAXgeC6+VCZRMsfkknG9TEcA9uP3e/QXDyvNHCs2WRp4tukUlqJkjsHo41LUPyPBdAxgP0+V9FVtT/fdKR4WhpwnRcWaTmEpSuYYXTyW0Qzs56FmdgFwpBGXmk5gOUrmGP1oTK4iuGUGNvNGeZZ25brzjnHX6pEn5U0yncJylMwxykmJZ5sZ2NIvai/kKAAnOeEGVx2zfDiUTAAu4ZIZbGh+Vaq29rrIdAx0RkyiNOYa0ylCgpIJwA9HZKtHgvt2S4Xz3VE2Wf6oeNMxcDSjrnD1vTHfRckEIC46Ulef1Nt0DOAga/YkqCCHRQD25pFO/JnpECFDyQTo2vF9FRPFXx/s55btEzkKwM76T5IyBppOETK8SgYoIylWF41mBwDYT2lLrOZnst2MbYXRKEaiZLrk/32PzQlhT7cVnSxvQk/TMXCgHnmuv8P/QJRMFwzMStLpA915BgScra49Sv9ODY/VS44SJsuWv4uS6aKbz8g3HQE4pBnbRqsthdG2bXRLk8ZeazpFyFEyXXRC3x46sV8P0zGAg7T6IvR8HKMZ2zj1Nik20XSKkKNkguDmSYxmYE8PFw9SU/oI0zGQkLn3UlkYomSCYMLADI3qxZJR2I/f79FfxFEAxn3vTimmm+kURlAyQfJfjGZgU7NK+6i25ymmY4SvpBzp+OtNpzCGkgmSs4b11NjeKaZjAIf0YPNl8iu8VjXZxoS7XH9mzJFQMkF073lDw211IhzizfJMlXEUQOgl95bGhN+Ksu+iZIJoTO9UXTCKXQBgTzNqLpQ/Isp0jPAy4b+lqPDeTJeSCbIZkwcrLpq/VtjP59Up2pp7oekY4SO1nzSa7X14NQyynJR4/ZTtZmBTt3EUQOic+YAUyciRkrHAjRP7K6t7+E70wb7W7UnQyuzLTcdwv34TpKGMGiVKxhLdYqJ099mDTccADumWkgkcBWCliChp8kzTKWyDkrHIJWNzNSKXf8iwnx0tcZqXyXYzljn+J1LWUNMpbIOSsYjH49F95/ODBnu6vegkjgKwQkKGNOmXplPYCiVjoRP69tAPR/APGfZT1x6lOSlTTcdwn7N+I8WnmE5hK5SMxX75wyFKiIk0HQM4yD2FozgKIJj6fk8aFbxFFW+++abOOusspaWlyePxqKCgIGiPHUqUjMV6pXbTPT8cYjoGcJBWX4Se5SiA4IiMkc79Y1AfsrGxUaeddppmznT2IgIWcYfANSf30YfryrRgc6XpKMB+HikepKtzR6
pb5WrTUZzt1NuljIFBfcipU/deziwqKgrq44YaI5kQ+d2lI9U9jk6Hvfj9Hj2mK03HcLbsUdLp002nsC1KJkSyk+N1//nDTMcADvL30j6q6Xmq6RjOFBUnXfy0FBltOoltUTIhdMm4XjpraJbpGMBBHmiewlEAgfjBr6WMQV1+mBdffFGJiYn7fi1YsKDr2WyC6zch9sjFI7S8uEZVjW2mowD7vF2eqen9z1bOjg9MR3GOvInSSTcG5aEuuOACnXTSSfs+zs3NDcrj2gEjmRBLT4zVwxcNNx0DOMh0jgLovLgU6aKnFKwDpJKSkpSfn7/vV3y8ezYxpWQMOGdEti4czbkzsJeF1cnaknuR6RjOcN4fpe7W/huurq5WQUGB1q9fL0natGmTCgoKVFZWZunzBhslY8iDFwxnp2bYzu1lZ3MUwNEMv1QafonlT/POO+9ozJgxOvfccyVJV1xxhcaMGaNZs2ZZ/tzB5PH7/X7TIcLVoi2Vunb2Enl9fAtgH28O/Ehjtz9jOoY9de8l3bSIrWOOASMZg07NT9eMyV1fmQIE0y3bT5cvLsV0DPuJipMuf46COUaUjGE/ndCf+RnYyo6WWM3L4Njgg5z7Ryl3nOkUjkPJ2MDMS0ZqeG530zGAfe4oOknexGzTMezjxJ9JYyjeQFAyNhAXHam/Tz1e6YkxpqMAkr45CiCZzTMlSX1Ok85+xHQKx6JkbCInJV5PXj1O0ZHcdQ17mFE4Wm0p/U3HMKt7L+myZ6VI7h8KFCVjIyf266H7zuM0TdhDu8+jZ8L5KICoOOmKF6SEdNNJHI2SsZmp4/vqihOOMx0DkCQ9UjRITemjTMcw4/w/SzljTKdwPErGhh68cLjG9k4xHQOQJP1RV5mOEHrjb5ZGXWE6hStQMjYUExWhWVPHKTs5znQUQP8oPS68jgIYebl01sOmU7gGJWNTmUlxev4nJyq1G+dUwLxfh8tRAAPOki58MmgbX4KSsbX8zCQ9c92JSoiJNB0FYW5OeaZ25p5tOoa1jjtJmsJKsmCjZGxu1HEp+vu1xysmkm8VzJrh5qMAModJV70ixXQzncR1eOVygFPz0/XnK0YrMoIhPMxZWJ2szbk/Mh0j+FL6SFPflOJTTSdxJUrGIc4Zka2Zl4zkUjGMum3X2fJHu+jdfkKmNPUtKamn6SSuRck4yKXjeumRH42gaGDMhoZuWtHzctMxgiM2WbrmDSktzHc1sBgl4zBXnthbD14wzHQMhLGbS06XL87hl5bie0jXvi1ljzSdxPUoGQeaOr4v28/AmF0tMc4+CiAxS7ruPSl3rOkkYYGScajrT+un+84byqUzGHFb4YnOPAog+TjpuvelzCGmk4QNSsbBrj+tn/58xRiWNyPk9nRE6e3kqaZjHJse/fcWDHMwIeXx+/0cMO9wX2yp1M+eX649rR2moyCMREf4tS7zPsXUbjUd5egyh0pT35aSskwnCTu8BXaBU/LT9crPxiszKdZ0FISRvUcBOGA0kz1a+vG7FIwhjGRcpLSmSdNmL9HW3Y2moyCMrDvud0rYXWA6xqH1PmXvnfxxHG9uCiMZF+mV2k1v3HSKxvVx+PJSOMoffVeajnBoo6+Wrp1DwRjGSMaFWtq9uuWllfp4fbnpKAgTK/o+qR5lC03H2MsTIZ35oHTKLaaTQIxkXCkuOlKzrhmnq07qbToKwsQDzZfa4yiAmCTpypcpGBthJONys+Zv1e8/3CSvj28zrLWo//PK3fG+uQCpffcWDPfA2AolEwYWb6vSrS+tVMWeVtNR4GKnptbphdZb5fG1h/7J+5wmXf681K1H6J8bR8TlsjBwcl6a3rvtezotP910FLjYoppkfZ17UeifeOy1e/cho2BsiZFMGPH5/Przp5v1+GebxdUzWGFIYpPe89wqT3uT9U8W3U06Z+bekoFtUTJhaOHmSt3+ykpVNrSZjgIXemPAxxpX8k9rn6TnSOnS2VL6AGufB11GyYSpivoW3f
zSSi0prDYdBS6THdemRXF3KKKlxoJH90gn/1z6wa+lqBgLHh/BxpxMmMrsHqeXbjhZN03sz07OCKpdLTGam3FN8B84IVO6+nVp8iMUjIMwkoHmbqzQ3a+vVmUDq88QHElRHSpI+YUiG3YG5wHzfyBd9JSUmBmcx0PIMJKBJg3O1Kd3nq7Ljz/OdBS4xJ6OKL3VPQibZ0bGSGc/sncEQ8E4EiMZ7OfLrVX65VtrVFjJJpvomr1HAdyvmNotgT1An9Ok8/4kZQwMbjCEFCMZ7Gd8/zS9f9v39F+T+is6kskaBK7d59Hs2ADmZuJ7SBc+If34/ygYF2Akg8PaWFavGW+s0aqSWtNR4GDHdBTAqCuls34jJaRZmgmhQ8ngiHw+v579skiPfrhJjW1e03HgQNfnlui+qhlH/kNp+XsvjfWbEJpQCBlKBp2ys7ZZ9769Vp9urDAdBQ60ot9T6rFrwcFfiIyRTrtD+t5dUhQnu7oRJYNj8sn6cs38YKM2VzSYjgIHOT9zt/5Sf7s8+s7LzaBzpTMf4K59l6NkcMy8Pr9eW1aiP33ytcrrubcGnbMw/wX1Kn1P6nWCdOZDUp/xpiMhBCgZBKy5zat/LNimv32+TQ2tHabjwOam5HXo96f6pWEXmY6CEKJk0GU1jW2a9flWPfdFsZrbWRyA/fVKjdet3x+gS8b2UmQEy+LDDSWDoKlsaNVT87bqhcXFau3wmY4Dw7K6x+rmSfm6/ITeionilrxwRckg6CrqW/TkvK16ZWkJI5sw1LtHN113al9deWJvxUVHmo4DwygZWKauqV2vLivRc4uLVFLdbDoOLHZqfpquO6WfzhicqQgui+EblAws5/P59enGCj37RZEWbqk0HQdBFBcdoR+NydWPT+mnQT2TTMeBDVEyCKnN5Xv07JdFenPFDjWxg4BjZSfHaer4PrryhN5KTeBsFxweJQMj6lva9erSEj2/uFjFVSE4Dx5BMa5Pqq47ta8mD+upqEgm83F0lAyM8vn8mvd1hd5YsUNzN1YwurGhXqnxOndkti4YlaNhOcmm48BhKBnYRnObV3M3Vejd1bv02cYKVqYZlJ0cp3NHZOu8UTkafVyK6ThwMEoGttTc5tVnGyv03hoKJ1QykmL3FsvIbI3rkyqPhxVi6DpKBrbX3ObVpxvL9d6aXZq7cTeFE0RpCTGaPLynzhuZo5P69WDpMYKOkoGjNLV1aOHmSn1VWK2vCqu0fme9fPwEd1pibJSO75uqk/PSND4vTcNzk9nqBZaiZOBo9S3tWlZUra+2VWtxYbXW7ahTB62zT0JMpMb17aHxeWk6Oa+HRuQmsyoMIUXJwFUaWzu0rLhGX22r0leF1VpdWqt2b/j8iHeLidS4PntHKifnpWlUL0oFZlEycLXmNq82ltVrU9kebSzbs+/3NU3tpqN1WW5KvIZkJ2lIdncN7tldg7OT1C8tgXkV2Aolg7BUUd+iLbsbVFjZqKLKRhVWNmpbZaNKqptsNfJJiIlU77QE5WUkqH96gvIyEpWXsfe/ibFRpuMBR0XJAN/h9flV09SmmsY2VX/7a9/H7appalNV43++XtvUpnafX9+OHTweyfPNR9+uAP7P1/Z+pXt8tNISY5SWEKMeCbHf+X2M0hNj1SMh5pvPxSo+hl2M4WyUDADAMswIAgAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAsQ8kAACxDyQAALEPJAAAs8/8BfeZ4v0Vo2vkAAAAASUVORK
5CYII=\n"
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"test[\"review_score\"].value_counts().plot(kind=\"pie\")"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 144,
"outputs": [
{
"data": {
"text/plain": "<AxesSubplot:ylabel='count'>"
},
"execution_count": 144,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": "<Figure size 640x480 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZkAAAGFCAYAAAAvsY4uAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAo9UlEQVR4nO3dd5yU1aH/8e9s77ssW9hdepciIBBEEYGoiIqiqMF+YzRXvTeJ2AjmZzTFmJjojfFGTUjQqBgLNoyxB5Qi3aWDtGWXsgW295md+f2B4UqRMjvPnOd55vN+vXi5O8DsV1jmO+c85znHEwgEAgIAwAJRpgMAANyLkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYhpIBAFiGkgEAWIaSAQBYJsZ0AMDOWn1+HWhoUW2TT7XNXtU0elXb7FVtk1cNrW1q9h780eRtU7PXr2ZvmyQpPiZa8bFRio+JUlxM1MHPYw5+Hh/7tY9jopWZHKe89AR1Sk9QbDTv++AulAwiXkVdi4orG7W7qlHFBxpVXHnwR0llo0prm+UPhCdHlEfqmBKv/PQE5aUnKi8jQfnpieqUnqD8jIOP5aYlKDrKE55AQAh4AoFAmP4JAeYEAgFtK6/XFyXV2lJap10HDpZISVWjGlvbTMc7adFRHvXMStaggnQNzE879N/UhFjT0YBjomTgSvvrW1RYXK3Ckmp9UVKltSU1qmvxmY5lCY9H6paZpIEF6RqUn65BBWkalJ+uDslxpqMBlAycr9nbpg17aw8WSnGVCkuqtbuqyXQs4woyEnV653Sd2bOjzu6dpd45KaYjIQJRMnCkvdVN+nhTmT7eVK6lOw6o1ec3Hcn28tITdHbvLI3pnaUxfbKUlRJvOhIiACUDRwgEAlqzu0affFUsm/bVmo7kaB6PdHpBuib0z9WE/jkaVJAmj4cFBQg9Sga21dTapkXb9uuTTWX61+Zylde1mI7kWrlp8RrfL0eTBudpTO8sVrAhZCgZ2Epjq0/vrSvVP9ft0+Lt+9XsZRos3HLT4jVlaIGmDu+svrmppuPA4SgZ2MKKokq9trJE/1xXqnqXrgJzosEF6Zp6RoEuHVqgTFarIQiUDIwpr23Wa6t2a+6q3dq5v8F0HBxHbLRH4/vlaOrwzprQP4edCXDSKBmEVSAQ0JLtB/Ti0l36aGOZfOG6nR4h0yEpVpcOydd3RnbVgPw003Fgc5QMwqKm0avXVpXopWXF2sGoxTXG9s3Wbef21Fm9skxHgU1RMrBUZUOrZi3coRc+38W1Fhcb0jldt53bSxMHdlIUK9PwNZQMLFFR16I/f7Zdc5YVO2pvMLRPz6xkfX9sT11xRmfFxXDdBpQMQqystllPL9iul1cUs/w4guWmxevms3vo2lFd2bwzwlEyCIk91U16esE2vbpyN1u84JDUhBhdf2Y33TKmhzqyjU1EomTQLsUHGvXUgm16ffVuedv4VsKxpcbH6L8m9NbNZ/dgGi3CUDIISk2TV49/uEVzlhWzDBknrWtmkmZO6q9Jg/NMR0GYUDI4JYFAQK+sKNFvP9iiAw2tpuPAoUb1yNQDlwzQoIJ001FgMUoGJ23t7mo98PYGrSmpNh0FLhDlkaae0Vn3XthPOakJpuPAIpQMTqiyoVWPvr9Zr64sCdt594gcyXHRun1cL91yTk8lxEabjoMQo2Twjdr8Ac1ZtkuPffilapq8puPA5QoyEjXzov665PR801EQQpQMjmnVrko98N
YGbeRwMITZ+QNy9fDlg5hCcwlKBodpbPXpF//YqJdXlIjvDJiSnhirBycP0BVndDYdBe1EyeCQNSXVuvOVQrbdh21M6J+jX10+WJ3SGdU4FSUD+f0BPbVgm37/8VbueYHtpCXE6OHLB2vyEK7VOBElE+H2VDdp+suFWl5UaToKcFxThubr51MGKY290ByFkolgbxfu0f97a73qmtmCH85QkJGox64eojN7djQdBSeJkolAdc1ePfDWer1VuNd0FOCURXmk287tpXsu6MfZNQ5AyUSYlUWVuvOVQu2uajIdBWiXsX2z9eS0YUpPYvrMziiZCPLMp9v12w+2qI2L+3CJ7h2T9OcbR6hvbqrpKPgGlEwEaPa2acbra/U202NwoeS4aD3+naGaOLCT6Sg4BkrG5UprmvX9F1Zq7e4a01EAy3g80g8m9NH08/rI4+E6jZ1QMi62aleVbntxlSrqWkxHAcLi/AG5+p/vDFVKfIzpKPgKJeNSbxfu0b1z13IUMiJOn5wU/fnGEeqRlWw6CkTJuNL//murHvvoS/YeQ8RKS4jRE9cM0/h+OaajRDxKxkV8bX7d/+Y6vbpyt+kogHHRUR79+orBumpEF9NRIhol4xJ1zV7d/uJqLdq233QUwDY8HumhyQN101ndTUeJWJSMC9Q0enX9X5dp3R5WkAHHct+F/XTHuN6mY0QkSsbhqhtbdd1flmnDXg4XA47nv8b30r0T+5uOEXEoGQerajhYMJxeCZyc757dXT+9ZAD30oQRJeNQlQ2tunbWUm0urTMdBXCUq0d01q+vOJ3NNcOEknGg/fUtum7WMm0po2CAYEwekq//uXqIYqKjTEdxPUrGYSrqWnTtrKXaWl5vOgrgaOedlqs/XjdM8THRpqO4GiXjIOV1zbp21jJto2CAkBjXL1t/uXEEIxoL8SfrEOW1zZr256UUDBBCC7ZUaMbr60zHcDVKxgGqGlo1bdZS7ahoMB0FcJ3XV+/Wo+9vNh3DtSgZm2vxten7L6ykYAALPbVgu57/vMh0DFeiZGwsEAjo3tfWakVRlekogOs9NG+D3lu3z3QM16FkbOzxj77UvDWcZgmEgz8g3flKoZbvrDQdxVUoGZuau2q3nvzXNtMxgIjS4vPrlr+t0JfcgxYylIwNLdm+XzPfWGs6BhCRapt9umn2cu2raTIdxRUoGZvZVl6v219cLW8bty8BpuyradZ/zF6hmiav6SiOR8nYyIH6Ft38HN/YgB1sKavTHXNWqc3PG772oGRsotnbplufX6niykbTUQB8ZfG2A3rswy2mYzgaJWMTP359rVYXV5uOAeAIT3+6XR9uKDUdw7EoGRt4dUWJ3ipkqTJgR4GAdPdra1S0nxuig0HJHOGzzz7T5MmTlZ+fL4/Ho7feesvSr7etvF4Pzttg6dcA0D51zT7d9uIqNbW2mY7iOJTMERoaGjRkyBD98Y9/tPxrtfja9MO/f6EmL9+4gN1tLq3TT99ebzqG48SYDmA3kyZN0qRJk8LytR7552aOTgYc5LVVuzWmT5YuG1pgOopjMJIx5JNNZXpuSZHpGABO0U/eXM/1mVNAyRhQVtuse+dyRz/gRPUtPv3331er1ec3HcURKJkw8/sDmv5KoSobWk1HARCk9Xtq9ch7m0zHcARKJsye/nS7lmw/YDoGgHZ6bkmRVhSxY/OJUDJhtLq4Sv/z0ZemYwAIgUBAmvnGOqbNToCSOUJ9fb0KCwtVWFgoSdq5c6cKCwtVXFzcrudt9rbprlcK5WMfJMA1tpXX64/zOZLjeDyBQIBXva9ZsGCBxo8ff9TjN910k5577rmgn/e3H2zWH+dvb0cyAHYUFx2ld384Rn1yU01HsSVKJgy2lNbpkicXsn0/4FIjunXQa7eNlsfjMR3Fdpgus1ggENDMN9ZSMICLrdxVpReXtW9K3a0oGYu9uKyY3ZWBCPDoe5tVWtNsOobtUDIW2l/fokff32w6BoAwqGvx6QH2NjsKJW
OhX7+3WXXNPtMxAITJRxvL9N66faZj2AolY5HVxVV6ffVu0zEAhNmD8zaotpkj1P+NkrGA3x/QQ/M2iHV7QOQpr2vR0wu4XeHfKBkLvLKyRGt315iOAcCQ5xYXqbyORQASJRNy9S0+/e6DLaZjADCoydumP3yy1XQMW6BkQuxvS4p0gB2WgYj3yooSFR9oNB3DOEomhOqavZq1cIfpGABswNsW0OMfMatByYTQc4uLVN3IqhIAB81bs1ebIvyIdUomRGqbvfrLop2mYwCwEX9AEX+NlpIJkWcXFammiVEMgMN9srlcq3ZF7uFmlEwI1DR59ddFXIsBcGy/eS9yRzOUTAjMXrRTtWwfA+AbLC+q1Pwt5aZjGEHJtFNNk1ezF3MtBsDxPf5hZB69Tsm0018X7mATTAAntG5PjZbvjLxrM5RMO9Q0evXs4iLTMQA4xN+WFJmOEHaUTDu8uGyX6loYxQA4OR9sKNW+mibTMcKKkgmS3x/Q35dz3CqAk+fzB/TC57tMxwgrSiZIn35Zod1VkfWOBED7vbyiRM3eNtMxwoaSCdKcZZH1bgRAaFQ2tGpe4V7TMcKGkgnC3uomzd9SYToGAId6LoIWAFAyQXh5ebHa/Bx7CSA4G/fVRsxyZkrmFPna/HplZYnpGAAc7rklkXETNyVzij7eVKay2hbTMQA43IcbyrS32v2LhyiZUzRnGcuWAbSfzx/QSxHwekLJnIJdBxq0aNt+0zEAuMRbhXtMR7AcJXMKXlperADX+wGEyO6qJq3aVWU6hqUomZMUCAT05mr3v+sAEF7zXD6aoWRO0uriapXXccEfQGi9u26fq2+JoGRO0ocbSk1HAOBC++tbXX2tl5I5SR9uLDMdAYBL/WONe7eZoWROwpdlddq5v8F0DAAu9fGmMtdOmVEyJ4GpMgBWqmr0atmOA6ZjWIKSOQkfbGCqDIC13lvvzjezlMwJ7K1u0ro9NaZjAHC5DzeWKuDCG/EomRNgqgxAOJTVtmh1cbXpGCFHyZwAq8oAhMvCre47p4qSOY7qxtaIOfMBgHnLdrjv9YaSOY75W8rlc+myQgD280VJlVp8baZjhBQlcxxLtrlzSSEAe2r2+rWmxF0LjSiZ41he5L6hKwB7c9v9MpTMNyirbdauA42mYwCIMEt3UjIRgQv+AExYvata3ja/6RghQ8l8A0oGgAlN3jatKak2HSNkKJlvsILrMQAMWeaiN7mUzDE0tPj0ZVmd6RgAItRSF138p2SOYe3uGnF7DABTVu2qks8l12UomWNYs7vadAQAEayxtU0b9taajhESlMwxFLpwkzoAzrKl1B1T9pTMMRS6aGUHAGfaWk7JuNKB+haV1jabjgEgwm0rrzcdISQomSPs3N9gOgIAaCsl406UDAA72FPdpKZW5+/ITMkcgf3KANhBIOCOKTNK5gg7DzCSAWAPbrj4T8kcoYjpMgA24YbrMpTMEZguA2AXW8soGVepqGtRfYvPdAwAkCRtY7rMXYq4HgPARkqqmtTsdfYKM0rma7geA8BO2vwB7a5y9hR+UCUzYcIEVVdXH/V4bW2tJkyY0N5MxjCSAWA3B+pbTUdol6BKZsGCBWptPfp/vLm5WQsXLmx3KFOK9jv7HQMA96lscHbJxJzKL167du2hjzdu3KjS0tJDn7e1ten9999XQUFB6NKF2b6aJtMRAOAwlY0RVDJDhw6Vx+ORx+M55rRYYmKinnzyyZCFC7eaJq/pCABwmEqHT5edUsns3LlTgUBAPXv21PLly5WdnX3o5+Li4pSTk6Po6OiQhwyX2maWLwOwlwORNF3WrVs3SZLf745jQY9Uy0gGgM1E1DWZr9u6davmz5+v8vLyo0rnpz/9abuDhVurz68WnzvLE4BzRWTJzJo1S7fffruysrLUqVMneTyeQz/n8XgcWTK1zYxiANhPRJbML3/5Sz388MOaMWNGqPMYw1QZADtyeskEdZ9MVVWVrrrqqlBnMYqL/gDsKC
JL5qqrrtKHH34Y6ixGMZIBYEetbX7VOXg6P6jpst69e+uBBx7Q0qVLNXjwYMXGxh728z/84Q9DEi6cuCYDwK7qmn1KTYg98S+0IU8gEAic6m/q0aPHNz+hx6MdO3a0K5QJf19erJlvrDMdAwCOsvC+8eqSmWQ6RlCCGsns3Lkz1DmMY7oMgF21+U95LGAbbPX/lVbukQFgUz4Hl0xQI5mbb775uD8/e/bsoMKYFBNN3wKwJ/+pX9WwjaBKpqqq6rDPvV6v1q9fr+rqaseeJxMb7TnxLwIAA3xtEVYyb7755lGP+f1+3X777erVq1e7Q5kQE0XJIHR+3mOjrmp9y3QMuERU1GxJaaZjBCXovcuOFBUVpbvuukvjxo3TfffdF6qnDRumyxBK/7u3j65P2quopkrTUeAGHufeLB7SV9bt27fL53PmHwbTZQil8pZYLci6xnQMuIXHuUeoBDWSueuuuw77PBAIaN++fXr33Xd10003hSRYuMVEMZJBaN1dNEqrUl9VVGOF6ShwuqgIK5kvvvjisM+joqKUnZ2txx577IQrz+wqhpEMQqzKG6MPMq/VpMYnTEeB03mc+yY4qJKZP39+qHMYF8s1GVjgnqIRuiAjX9H1e01HgZNFx5lOELR2vbJWVFRo0aJFWrRokSoqnD0lwOoyWKHBF6130q81HQNOl5hhOkHQgiqZhoYG3XzzzcrLy9PYsWM1duxY5efn63vf+54aGxtDnTEsGMnAKjOLhsiX1tV0DDhVVIwUn2o6RdCCemW966679Omnn+qdd95RdXW1qqur9fbbb+vTTz/V3XffHeqMYcE1GVilqS1ac1OuMx0DTpWQYTpBuwS1C3NWVpbmzp2rcePGHfb4/PnzdfXVVzty6mx1cZWueGqJ6RhwqdiogDbkPKC4auftUA7DOvaRfrDSdIqgBTWSaWxsVG5u7lGP5+TkOHa6LCs53nQEuJjX79FLiYxmEAQHX4+RgiyZ0aNH68EHH1Rzc/Ohx5qamvSzn/1Mo0ePDlm4cOqY4tzVG3CGnxX1V0tmP9Mx4DSJHUwnaJegljD//ve/14UXXqjOnTtryJAhkqQ1a9YoPj7esccyJ8fHKDE2Wk3eNtNR4FKBgEfPxl6j2/SQ6ShwkkgsmcGDB2vr1q2aM2eONm/eLEm65pprdN111ykxMTGkAcMpMzlOe6qbTMeAi/16V1/d1HmQEvevNx0FTuHwC/9Blcwjjzyi3Nxc3XrrrYc9Pnv2bFVUVGjGjBkhCRduWanxlAws90zUNE3X/zMdA07h8JFMUNdk/vSnP6l///5HPT5w4EA988wz7Q5lSm4qF/9hvSeKe6o++wzTMeAUSR1NJ2iXoEqmtLRUeXl5Rz2enZ2tffv2tTuUKfkZzp3qg7M84b/adAQ4RYfuphO0S1Al06VLFy1evPioxxcvXqz8/Px2hzIlLz3BdAREiFl7uqom90zTMeAEmT1NJ2iXoK7J3Hrrrbrzzjvl9XoPHbf8ySef6L777nPsHf+SlMdIBmH0O+9U/UJLTceAnXmipQxnb0kUVMnce++9OnDggO644w61trZKkhISEjRjxgzNnDkzpAHDqSCDkQzC54W9BZre/Rxlli40HQV2lV4gxTj7Hr6gtpX5t/r6em3atEmJiYnq06eP4uOdfeF8b3WTzvr1v0zHQASZmlumx2qmm44Bu+pxrnTTPNMp2qVdWw+npKRo5MiRGjRokOMLRpI6pSUoKc65J9DBeV4vy1V5/gTTMWBXmT1MJ2g39rf/mqgoj/p1cu6W2nCmB+umKCB2AccxOPyiv0TJHOW0vDTTERBh3qvI0r6CiaZjwI46MJJxHUoGJtxfNVkBD1O1OAIjGfcZkMd0GcJvQWUHFRdcbDoG7MQTxTUZN+rfKU0epsdhwI8PTFIgKqi7CuBGWf2kuGTTKdqNkjlCcnyMumYmmY6BCPR5Vbq2F1xmOgbsosAd+9tRMsdwWieuy8CM+8onKhDt7JvvEC
KUjHtx8R+mrK5J0Zb8y03HgB3kUzKudRoX/2HQ9H3nKxDDPnoRLTpeyh1kOkVIUDLHwEgGJm2qT9K6vCtNx4BJnQY5fs+yf6NkjqFLZpJSE1jlA3Om7xmngAtWFiFILpkqkyiZb/St7pmmIyCCbW9M1KpcDjaLWAXDTScIGUrmG4zpk2U6AiLcnSVjFYhn6jYiuWRlmUTJfKNzKBkYtrs5Xp/nTDMdA+GW2EHq2Md0ipChZL5B75xUdUrjEDOYdWfx2fInMnUbUXqOk6Lc89Lsnv8TCzBlBtPKW2K1IOsa0zEQTr3PM50gpCiZ4xjTm5KBeXcXjZI/Kdt0DIQLJRM5zu6dxWaZMK7KG6MPMq81HQPhkDtISu1kOkVIUTLHkZ0ar3653P0P8+4pGqG2lHzTMWC13t82nSDkKJkTYJUZ7KDBF6130hnNuJ7LpsokSuaExvRhLhz2MLNoiHxpXU3HgFXiUqSuo02nCDlK5gRG9chUXAx/TDCvqS1ac1OuMx0DVukxVoqONZ0i5Hj1PIGE2Gid2bOj6RiAJOmBokFqzXD+ue84Bhdej5EomZNy2RAuuMIevH6PXkpkNOM6niip30WmU1iCkjkJEwd1UmJstOkYgCTpZ0X91ZLZz3QMhFL3MVKaO9/MUjInISU+RucNyDUdA5AkBQIePRvLLgCuMti9O25TMidpylB3vsuAM/16V181Zbnj5MSIF5MgDbjMdArLUDIn6dy+2cpMdsdJdXCHZ6LYodkV+k6UEtx7pAMlc5JioqN08eA80zGAQ54o7qn6bPecOxKxXDxVJlEyp2TKsALTEYDDPOF39wuU6yVkSH0uMJ3CUpTMKRjerYO6dUwyHQM4ZNaerqrJPdN0DARr4BQpxt3T8JTMKeKeGdjN77xTTUdAsFw+VSZRMqeMKTPYzQt7C1TZ6RzTMXCqMrpK3c4yncJylMwp6pmdoiFdMkzHAA7zcNMVpiPgVI28VZFwYBUlE4Qbz+xmOgJwmNfLclWeP8F0DJys2GTpjBtNpwgLSiYIk4fkKzs13nQM4DAP1k1RQO5/Z+wKQ6ZJiRmmU4QFJROEuJgoXT+K0Qzs5b2KLO0rmGg6Bk7II515u+kQYUPJBOm6M7tyzgxs5/6qyQp42MzV1np/W8rqYzpF2PAqGaSslHhdynJm2MyCyg4qLrjYdAwcz6jIGcVIlEy73HJOD9MRgKP8+MAkBaJiTMfAsWT1de3hZN+EkmmH/p3SNL5ftukYwGE+r0rX9gL37urraKP+MyKWLX8dJdNOt4/rbToCcJR7yiYqEM0KSFtJyJCGRN45QJRMO32rR6ZGdOtgOgZwmMLaFG3Ov9x0DHzd6P+W4pJNpwg7SiYEbju3l+kIwFHu2neeAjGJpmNAkpKyImrZ8tdRMiHw7dNy1L9TqukYwGE21Sdpbd6VpmNAksZMl+JTTKcwgpIJAY/Hox9P6m86BnCU6XvGKxCBUzS2kpovjbzFdApjKJkQGdcvR+f0yTIdAzjMjsYErcp1/3bytnbuvVJsgukUxlAyIXT/RacpKrJWJ8IBflQyVoH4dNMxIlOH7tKwG0ynMIqSCaHT8tJ0xRmdTccADrOnOV5LcqaZjhGZxs2UomNNpzCKkgmxey7op8RY9o6CvUwvPkv+xEzTMSJLdv+IOPnyRCiZEOuUnsB2M7Cd8pZYLciKvBsBjRp/vxTFSyx/Aha47dxeykrhbmvYy91Fo+RPYhuksOh2tjSArX0kSsYSyfExmn5+5GzlDWeo8sbog8xrTcdwv6gY6aLfmk5hG5SMRaaN7KreOZF58xXs656iEWpL4YgKS428RcodaDqFbVAyFomO8mgmN2jCZhp80Xong9GMZZKzD16LwSGUjIW+fVquzh+QazoGcJiZO4fIl9bVdAx3Ov8XUgL3JH0dJWOxhy8fpPTEyF4nD3tpaovW3JTrTMdwnx5jpaGs4DsSJWOxnNQEPXTpANMxgMM8UDRIrRnsHh4y0fHSJb
83ncKWKJkwuHxYZ513GtNmsA+v36OXErk2EzLn3C11DG1pv/HGG7rgggvUsWNHeTweFRYWhvT5w4WSCZNfMW0Gm/lZUX81Z7I4pd2y+x/cyj/EGhoaNGbMGP3mN78J+XOHU4zpAJEiJy1BD04eoLteXWM6CiBJCgQ8ejb2Gt2uB01Hca7oOGnqX6SYuJA/9Q03HNxYs6ioKOTPHU6MZMLoijM667zTckzHAA75za4+asoaZDqGc014QOo02HQKW6NkwuxXlw9m2gy28nQUOzQHpce50lk/MJ3C9iiZMMtJS9BPL2G1GezjD8U9VZ99hukYzpLYQbr8GckTmgOk5syZo5SUlEM/Fi5cGJLntQOuyRgwdXhn/XPdPn2yudx0FECS9IT/av1Eq03HcI7JT0hpodue59JLL9WoUaMOfV5QUBCy5zaNkYwhv7nydOWlR+6RrLCXWXu6qib3TNMxnGHo9SHfYTk1NVW9e/c+9CMxMTGkz28SJWNIVkq8nrruDMVF81cAe3i09UrTEewvs6c0KTxLiisrK1VYWKiNGzdKkrZs2aLCwkKVlpaG5euHCq9wBg3r2kEPshsAbGLOvnxVdjrHdAz7ioqRrpglxYdnd/V58+Zp2LBhuvjiiyVJ06ZN07Bhw/TMM8+E5euHiicQCARMh4h0M+au1SsrS0zHAHRFbrker7nTdAx7uuh30rduNZ3CcRjJ2MDPpwzUkM7s3Arz3ijLUXn+t03HsJ/h36VggkTJ2EB8TLSevn64OiaH/q5h4FQ9WHeZAgrN0lxX6HY2J122AyVjE/kZiXrymmGKjuIfN8x6ryJLewsmmo5hD+ldpaufl6K5gTpYlIyNnNU7S/dN7Gc6BqCfVE1WwBNtOoZZscnSNS9JyVmmkzgaJWMz/3luL108OM90DES4BZUdVFxwsekYBnmkKU+xL1kIUDI29OiVp2tAXprpGIhw9+2/SIGoCN0UZOy90sApplO4AiVjQ8nxMXru5pHqkumeu37hPMuq07S9ILR3tjvCaZOl8febTuEalIxN5aQm6PmbR7HiDEbdUzZRgeh40zHCp8dYaepfQ7bxJSgZW+uRlaxnvztSyXERfgEWxhTWpmhz/uWmY4RHwQhp2t+lmAgq1TCgZGzu9M4ZeuaG4YqN5p0VzJi+7zwFYlw+dZszULp+bti2jIkklIwDnNMnW09M4x4amLG5Pklr81y8eWZmT+mGNw+eEYOQo2Qc4qLBeXp06ulMFcOI6XvGKxCXbDpG6KUVSDe+LaXmmk7iWpSMg0wd3lm/uIzz2BF+OxoTtCr3atMxQiupo3TDW1JGV9NJXI2ScZjrz+ymn1x0mukYiEA/KhmrQLxLNnKNT5Ouf0PK7ms6ietRMg5069ieuv+i/kydIaz2NMdrSc400zHaLzFTuvEtKX+o6SQRgfNkHOy1lSWa+cY6+fz8FSI8cuK9Wpo0XVFNlaajBCc1/2DBZLNHYLgwknGwq0Z00TPXD1dCLH+NCI/yllgtyLrGdIzgZPaSvvcBBRNmjGRcYEVRpb733ArVNvtMR0EE6BDr06rUuxXVWGE6ysnLHSzd8IaUkmM6ScThLbALjOyeqVdvG62cVO5UhvWqvDH6IPNa0zFOXpczpe++S8EYwkjGRUoqG3Xj7OXaub/BdBS4XHJMm9ZmzFB0/V7TUY6v9/kHDx2LSzKdJGIxknGRLplJeu220RpUwDEBsFaDL1rvZNh8NDNoqnTN3ykYwxjJuFB9i0/ff36llmw/YDoKXCwxuk3rOv5EMbXFpqMcwSOdO0Ma92N2U7YBRjIulBIfo2e/O1JXDu9sOgpcrKktWnNTrjMd43BxKdJ3XpDGz6RgbIKRjMu9tKxYD72zQa0+v+kocKHYqIA25PxUcdXbTUeROvQ4OD2Ww44YdsJIxuWuHdVVc28brYIMl2/VDiO8fo9eSrTBtZleE6Tvz6dgbIiRTISobmzVj14u1KdfOujeBjiCxxPQprxfKKFys5
kAo/9bOv/nUhSH+9kRI5kIkZEUp2f/Y6R+9O0+TFUjpAIBj56NNbALQEyCdPmfpYkPUzA2xkgmAs3fUq7prxSqutFrOgpcZFPnXylx//rwfLGsvtLUv0h5Q8Lz9RA0RjIRaHy/HP3jB2N0emeXbNsOW3g6Kkw7NI+8RfrPzygYh2AkE8FafG36xT826sWldrvPAU61vsvvlFKx2ponT86Rpjwl9TnfmueHJRjJRLD4mGj9cspgzblllLpksvoM7feE36LTM/tfIt2xlIJxIEYykCQ1tvr06Ptb9PznReJ4GrTHmm5/UHrZ0tA8WWyydOEj0vCbQvN8CDtKBodZtatS981dq+0VbLKJ4FyXt1cPV93T/ifqPFK6/E9Sx17tfy4YQ8ngKC2+Nv3+462a9dkOTt1EUFZ3f1qZpQuD+83xadL4+6VvfZ+lyS5AyeAbrd9To3vnrtWmfbWmo8Bhrsgt1+M1d576bxx8tXTBL6XU3JBnghmUDI7L2+bXMwu268l/bVNrG/uf4eQt7/lX5ez95OR+cXZ/6aLfST3OsTYUwo6SwUnZVl6vh9/dqPlb2JYGJ2dS9n49VfcjeXScl5jYZGncDOnMO6To2PCFQ9hQMjgli7bu1y/f3ajNpXWmo8ABFvd6XgV73j/2Tw64TJr4iJReEN5QCCtKBqfM7w/otVUleuzDL1Ve12I6DmxsXGaVnm36oTyBtv97MG+odN5DUq/xpmIhjCgZBK2x1afZi3bqT5/tUF2zz3Qc2NSnvV9Wt93zpI59pAk/kQZM4UCxCELJoN2qG1v19Kfb9bclRWr2sjgAh7u0a4v+MKpOGnY9S5IjECWDkCmvbdYf/rVVr6wokbeNb6tIV5CRqDvG99JVw7soLoYdrCIVJYOQK6tt1vOfF+mlZcWq4jiBiNO5Q6L+a3xvXTm8s2KjKZdIR8nAMs3eNs1dtVuzF+/UDrapcb3BBem6cXQ3TRlWQLngEEoGlgsEAlqwpUJ/XbRTi7btNx0HIRQfE6WLT8/TjaO7a2iXDNNxYEOUDMJq075azV60U2+v2atWH4sEnKpLZqKuH9VNV4/oog7JcabjwMYoGRhRUdeiF5fu0ssrilVWy702ThDlkc7tm60bR3fXuX2zFRXFMmScGCUDo/z+gJbtrNS8NXv03vpSVbNQwHY6JMXq6pFddP2obuqSmWQ6DhyGkoFteNv8+uzLCr1duFcfbypTY2vbiX8TLJGVEqcLBnbShQM7aXSvjlzIR9AoGdhSU2ubPtpUpnmFe/XZlxXsAB0GeekJmjiwkyYN6qSR3TOZDkNIUDKwvZpGr95bv0/vrtunFUWV7CoQQt06JunCQZ00aVCehnROl4ftXhBilAwcpdXn1+riKi3Ztl9Lth9QYUk1p3eegthojwYXpGtM7yxdOChPA/LTTEeCy1EycLSGFp+W76zUku37tXjbAW0qrRXf0f8nJT5GZ3TroJHdOmhkj0wN7ZKhhFj2D0P4UDJwlaqGVn2+44CWbN+v1buqtbW8LqL2UctJjdfI7pka0b2DRnbP1Gl5aYrm2goMomTgaq0+v7aW12nD3lpt3Furjftq9WVZneOXSns8Bzeg7JOTor65qerXKVXDu3VQt47JpqMBh6FkEJH217doW3m9tpbXa3t5vbaV12tvTZP217Wo1kZn43RIilW3jsnq1jFJ3TKT1K1jsvrkpqh3ToqS4mJMxwNOiJIBjtDia1NFXYv217eqoq7l0I/99V99XH/w41afX23+gPyBgNr8ga8+1qHP/++/B583MTZaGUmxSk+MVUZSrDIS4w59nP7V5wcfj1VGUpw6ZyYqLYFz7+FslAwQBn5/gPtOEJEoGQCAZdgrAgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYB
lKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYBlKBgBgGUoGAGAZSgYAYJn/D4jHkD8/xq2rAAAAAElFTkSuQmCC\n"
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"valid[\"review_score\"].value_counts().plot(kind=\"pie\")"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 145,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 43230 entries, 0 to 43229\n",
"Data columns (total 2 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 review_text 43230 non-null object\n",
" 1 review_score 43230 non-null int64 \n",
"dtypes: int64(1), object(1)\n",
"memory usage: 675.6+ KB\n"
]
}
],
"source": [
"train.info()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 146,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Index: 15716 entries, 1265039 to 5454569\n",
"Data columns (total 2 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 review_text 15716 non-null object\n",
" 1 review_score 15716 non-null int64 \n",
"dtypes: int64(1), object(1)\n",
"memory usage: 368.3+ KB\n"
]
}
],
"source": [
"test.info()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 147,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Index: 15717 entries, 5012153 to 4962820\n",
"Data columns (total 2 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 review_text 15717 non-null object\n",
" 1 review_score 15717 non-null int64 \n",
"dtypes: int64(1), object(1)\n",
"memory usage: 368.4+ KB\n"
]
}
],
"source": [
"valid.info()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"### Examples from each subset"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 148,
"outputs": [
{
"data": {
"text/plain": " review_text review_score\n0 I'm the biggest fan you will ever meet of tile... -1\n1 Really an improvement on the old game (Which w... 1\n2 celebrating the four year birthday of payday w... -1\n3 Only fun when playing with friends. Can't join... -1\n4 While smashing planets together can be fun, th... -1",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>review_text</th>\n <th>review_score</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>I'm the biggest fan you will ever meet of tile...</td>\n <td>-1</td>\n </tr>\n <tr>\n <th>1</th>\n <td>Really an improvement on the old game (Which w...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>2</th>\n <td>celebrating the four year birthday of payday w...</td>\n <td>-1</td>\n </tr>\n <tr>\n <th>3</th>\n <td>Only fun when playing with friends. Can't join...</td>\n <td>-1</td>\n </tr>\n <tr>\n <th>4</th>\n <td>While smashing planets together can be fun, th...</td>\n <td>-1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 148,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train.head()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 149,
"outputs": [
{
"data": {
"text/plain": " review_text review_score\n1265039 I love the Fact you can do what EVER you want ... 1\n3132003 Tony Hawk's without the Pro Skater. Finding ou... 1\n880195 It's pretty good. 1\n717128 This the best dungeon game I have played since... 1\n5221356 Totally awesome game alone or with a friend. I... 1",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>review_text</th>\n <th>review_score</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>1265039</th>\n <td>I love the Fact you can do what EVER you want ...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>3132003</th>\n <td>Tony Hawk's without the Pro Skater. Finding ou...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>880195</th>\n <td>It's pretty good.</td>\n <td>1</td>\n </tr>\n <tr>\n <th>717128</th>\n <td>This the best dungeon game I have played since...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>5221356</th>\n <td>Totally awesome game alone or with a friend. I...</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 149,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test.head()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 150,
"outputs": [
{
"data": {
"text/plain": " review_text review_score\n5012153 ..it's like nights into dreams and treasures o... 1\n5818758 As someone who mostly just likes making cool s... 1\n4582102 What can I say about this game the story is sh... 1\n5242842 A very unique and enjoyable puzzle solving str... 1\n5400923 A very adorable, charming game. 1",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>review_text</th>\n <th>review_score</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>5012153</th>\n <td>..it's like nights into dreams and treasures o...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>5818758</th>\n <td>As someone who mostly just likes making cool s...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>4582102</th>\n <td>What can I say about this game the story is sh...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>5242842</th>\n <td>A very unique and enjoyable puzzle solving str...</td>\n <td>1</td>\n </tr>\n <tr>\n <th>5400923</th>\n <td>A very adorable, charming game.</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 150,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"valid.head()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"### Saving to CSV"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 151,
"outputs": [],
"source": [
"train.to_csv(\"train.csv\")\n",
"test.to_csv(\"test.csv\")\n",
"valid.to_csv(\"valid.csv\")"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [],
"metadata": {
"collapsed": false
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}