Initial commit

dzikafoczka 2025-01-19 12:19:38 +01:00
commit bf8ce05930
2 changed files with 59093 additions and 0 deletions


@@ -0,0 +1,215 @@
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "### Imports",
"id": "7a1e7d26143b0471"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T11:01:47.450Z",
"start_time": "2025-01-19T11:01:46.692135Z"
}
},
"cell_type": "code",
"source": "import pandas as pd",
"id": "1691b104789b8a60",
"outputs": [],
"execution_count": 1
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Dataset",
"id": "bf04fbdb243e5742"
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"The dataset contains tweets related to the *FIFA World Cup* in Qatar, which was held in *2022*. \n",
"\n",
"The tweets come from the opening day of the tournament, *20 November 2022*, when the hosts, Qatar, played the opening match against Ecuador. The match kicked off at *16:00 UTC*, and the referee's final whistle sounded at *18:00 UTC*. \n",
"\n",
"The dataset contains *22524* tweets. Each record has the following columns:\n",
"\n",
"- `Date Created` - date and time the tweet was written\n",
"- `Number of Likes` - number of likes the tweet received\n",
"- `Source of Tweet` - source of the tweet (e.g. Twitter for iPhone)\n",
"- `Tweet` - text of the tweet\n",
"- `Sentiment` - sentiment of the tweet (`positive`, `negative`, `neutral`)"
],
"id": "139728685fbfc7cc"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T11:13:14.897578Z",
"start_time": "2025-01-19T11:13:14.637111Z"
}
},
"cell_type": "code",
"outputs": [
{
"data": {
"text/plain": [
" Date Created Number of Likes Source of Tweet \\\n",
"0 2022-11-20 23:59:21+00:00 4 Twitter Web App \n",
"1 2022-11-20 23:59:01+00:00 3 Twitter for iPhone \n",
"2 2022-11-20 23:58:41+00:00 1 Twitter for iPhone \n",
"3 2022-11-20 23:58:33+00:00 1 Twitter Web App \n",
"4 2022-11-20 23:58:28+00:00 0 Twitter for Android \n",
"\n",
" Tweet Sentiment \n",
"0 What are we drinking today @TucanTribe \\n@MadB... neutral \n",
"1 Amazing @CanadaSoccerEN #WorldCup2022 launch ... positive \n",
"2 Worth reading while watching #WorldCup2022 htt... positive \n",
"3 Golden Maknae shinning bright\\n\\nhttps://t.co/... positive \n",
"4 If the BBC cares so much about human rights, h... negative "
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date Created</th>\n",
" <th>Number of Likes</th>\n",
" <th>Source of Tweet</th>\n",
" <th>Tweet</th>\n",
" <th>Sentiment</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2022-11-20 23:59:21+00:00</td>\n",
" <td>4</td>\n",
" <td>Twitter Web App</td>\n",
" <td>What are we drinking today @TucanTribe \\n@MadB...</td>\n",
" <td>neutral</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2022-11-20 23:59:01+00:00</td>\n",
" <td>3</td>\n",
" <td>Twitter for iPhone</td>\n",
" <td>Amazing @CanadaSoccerEN #WorldCup2022 launch ...</td>\n",
" <td>positive</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2022-11-20 23:58:41+00:00</td>\n",
" <td>1</td>\n",
" <td>Twitter for iPhone</td>\n",
" <td>Worth reading while watching #WorldCup2022 htt...</td>\n",
" <td>positive</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2022-11-20 23:58:33+00:00</td>\n",
" <td>1</td>\n",
" <td>Twitter Web App</td>\n",
" <td>Golden Maknae shinning bright\\n\\nhttps://t.co/...</td>\n",
" <td>positive</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2022-11-20 23:58:28+00:00</td>\n",
" <td>0</td>\n",
" <td>Twitter for Android</td>\n",
" <td>If the BBC cares so much about human rights, h...</td>\n",
" <td>negative</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 10,
"source": [
"df = pd.read_csv('dataset/fifa_world_cup_2022_tweets.csv', index_col=0)\n",
"df['Date Created'] = pd.to_datetime(df['Date Created'])\n",
"df.head()"
],
"id": "e760d702af7f88"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T11:19:14.426885Z",
"start_time": "2025-01-19T11:19:14.415327Z"
}
},
"cell_type": "code",
"source": [
"df_before_match = df[df['Date Created'].dt.hour < 16]\n",
"df_during_match = df[(df['Date Created'].dt.hour >= 16) & (df['Date Created'].dt.hour < 18)]\n",
"df_after_match = df[df['Date Created'].dt.hour >= 18]"
],
"id": "c7424dd463db584e",
"outputs": [],
"execution_count": 12
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Sentiment analysis",
"id": "6b277da32554cb66"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T11:19:17.763717Z",
"start_time": "2025-01-19T11:19:17.759353Z"
}
},
"cell_type": "code",
"source": "# TODO",
"id": "e7201b91b120968e",
"outputs": [],
"execution_count": 13
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
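
The sentiment analysis cell in the notebook above is still a `# TODO`. One possible first step, given the before/during/after split the notebook already makes, is to compare the share of each `Sentiment` label across the three match phases. The sketch below is only an illustration under stated assumptions: the `sample` frame contains invented rows (not taken from the dataset), and `match_phase` and `sentiment_share_by_phase` are hypothetical helper names that do not appear in the commit. It relies on the same UTC hour boundaries (16 and 18) used in the notebook.

```python
import pandas as pd

# Hypothetical miniature frame: same column names as the notebook's dataset,
# but the rows are invented purely for illustration.
sample = pd.DataFrame({
    "Date Created": pd.to_datetime([
        "2022-11-20 09:15:00+00:00",
        "2022-11-20 15:59:59+00:00",
        "2022-11-20 16:00:00+00:00",
        "2022-11-20 17:30:00+00:00",
        "2022-11-20 18:00:00+00:00",
        "2022-11-20 23:58:28+00:00",
    ]),
    "Sentiment": ["positive", "neutral", "positive",
                  "negative", "negative", "neutral"],
})


def match_phase(hour: int) -> str:
    """Map a UTC hour to a match phase (kickoff 16:00, final whistle 18:00)."""
    if hour < 16:
        return "before"
    if hour < 18:
        return "during"
    return "after"


def sentiment_share_by_phase(df: pd.DataFrame) -> pd.DataFrame:
    """Return, for each match phase, the fraction of tweets per sentiment label."""
    phase = df["Date Created"].dt.hour.map(match_phase).rename("Phase")
    # normalize="index" turns raw counts into per-row (per-phase) shares.
    shares = pd.crosstab(phase, df["Sentiment"], normalize="index")
    # Put the phases in chronological rather than alphabetical order.
    return shares.reindex(["before", "during", "after"])


shares = sentiment_share_by_phase(sample)
print(shares)
```

In the notebook itself the same function could be called on `df` once the CSV is loaded. Like the original filters, it assumes that every tweet falls on the single match day, so the hour alone identifies the phase.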

File diff suppressed because it is too large