Adding machine learning to analyze email content. Updating documentation
This commit is contained in:
parent
ce104f49b1
commit
f084188680
Binary file not shown.
Binary file not shown.
80
README.md
@ -1,13 +1,12 @@
|
|||||||
# PhishGuardian
|
# PhishGuardian
|
||||||
|
|
||||||
PhishGuardian is a browser extension designed to detect and manage phishing emails. It uses machine learning to identify potential phishing emails and provides options to mark them as safe or move them to the trash. The extension is built using Flask for the backend and JavaScript for the frontend.
|
PhishGuardian is a browser extension designed to detect and manage suspicious emails. It uses machine learning to identify suspicious emails and provides options to mark them as safe or move them to the trash. The extension is built using Flask for the backend and JavaScript for the frontend.
|
||||||
|
|
||||||
## Features
|
## Features
|
||||||
|
|
||||||
- Detects phishing emails using machine learning
|
- Detects suspicious emails using machine learning
|
||||||
- Allows users to mark emails as safe
|
- Allows users to mark emails as safe
|
||||||
- Allows users to move phishing emails to the trash
|
- Allows users to move suspicious emails to the trash
|
||||||
- Provides notifications for detected phishing emails
|
|
||||||
|
|
||||||
## Installation
|
## Installation
|
||||||
|
|
||||||
@ -18,11 +17,11 @@ PhishGuardian is a browser extension designed to detect and manage phishing emai
|
|||||||
- scikit-learn
|
- scikit-learn
|
||||||
- Chrome browser
|
- Chrome browser
|
||||||
|
|
||||||
### Backend Setup
|
### Backend setup
|
||||||
|
|
||||||
1. Clone the repository:
|
1. Clone the repository:
|
||||||
```sh
|
```sh
|
||||||
git clone https://your-repository-url
|
git clone https://git.wmi.amu.edu.pl/s452649/PhishGuardian.git
|
||||||
cd PhishGuardian/backend
|
cd PhishGuardian/backend
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -33,44 +32,75 @@ PhishGuardian is a browser extension designed to detect and manage phishing emai
|
|||||||
|
|
||||||
3. Run the Flask backend:
|
3. Run the Flask backend:
|
||||||
```sh
|
```sh
|
||||||
python backend.py
|
python app.py
|
||||||
```
|
```
|
||||||
|
|
||||||
### Extension Setup
|
### Extension setup
|
||||||
|
|
||||||
1. Open Chrome and go to `chrome://extensions/`
|
1. Open Chrome and go to `chrome://extensions/`
|
||||||
2. Enable "Developer mode" by toggling the switch in the top right corner.
|
2. Enable "Developer mode" by toggling the switch in the top right corner.
|
||||||
3. Click on "Load unpacked" and select the `PhishGuardian` directory.
|
3. Click on "Load unpacked" and select the `extension` directory within the `PhishGuardian` directory.
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
1. Click on the PhishGuardian extension icon in the Chrome toolbar.
|
1. Click on the PhishGuardian extension icon in the Chrome toolbar.
|
||||||
2. Log in with your email credentials. For now, only credentials for the wp.pl mail service are supported (this will change in the future).
|
2. Log in with your email credentials. For now, only credentials for Outlook are supported (this will change in the future).
|
||||||
3. Use the "Check Mail" button to scan for phishing emails.
|
3. Use the "Fetch Emails" button to retrieve your emails.
|
||||||
4. If a phishing email is detected, a notification will appear with options to mark the email as safe or move it to the trash.
|
4. Select an email from the list and click the "Classify Email" button to scan the email.
|
||||||
|
5. The classification result will be displayed.
|
||||||
|
6. Use the "Mark as Safe" button to mark the email as safe or the "Delete Email" button to delete a suspicious email.
|
||||||
|
|
||||||
## Code Overview
|
## Code overview
|
||||||
|
|
||||||
### Backend (`backend.py`)
|
### Backend (`app.py`)
|
||||||
|
|
||||||
- Uses Flask to handle HTTP requests.
|
- Uses Flask to handle HTTP requests.
|
||||||
- Uses IMAP to connect to the email server and fetch emails.
|
- Uses IMAP to connect to the email server and fetch emails.
|
||||||
- Uses scikit-learn to classify emails as phishing or not based on the subject and sender.
|
- Uses scikit-learn to classify emails as suspicious or not based on the content.
|
||||||
- Provides endpoints for login, checking mail, marking emails as safe, and moving emails to trash.
|
- Provides endpoints for fetching emails, classifying emails, marking emails as safe, and deleting emails.
|
||||||
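To illustrate what "classify based on the content" involves, here is a minimal standard-library sketch of the bag-of-words idea behind the vectorizer. It is not the project's code: the notebook uses NLTK and scikit-learn's `CountVectorizer`, while the `preprocess` and `bag_of_words` helpers and the tiny `STOP_WORDS` set below are hypothetical stand-ins, and stemming is left out.

```python
import re
import string
from collections import Counter

# Tiny illustrative subset; the notebook uses NLTK's English stopword list.
STOP_WORDS = {"the", "a", "to", "and", "is", "your"}


def preprocess(text):
    """Strip digits and punctuation, lowercase, and drop stopwords."""
    text = re.sub(r"\d+", "", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [w.lower() for w in text.split() if w.lower() not in STOP_WORDS]


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


tokens = preprocess("Claim your 1000 prize now! Claim the prize.")
vocabulary = sorted(set(tokens))
vector = bag_of_words(tokens, vocabulary)
print(vocabulary, vector)
```

A classifier such as Random Forest is then trained on vectors like this one, with one row per email.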
|
|
||||||
### Frontend
|
### Frontend
|
||||||
|
|
||||||
- `popup.html`: The main interface of the extension.
|
- `popup.html`: The main interface of the extension.
|
||||||
- `popup.js`: Handles interactions in the popup, such as login, checking mail, and handling responses.
|
- `popup.js`: Handles interactions in the popup, such as login, fetching emails, and handling responses.
|
||||||
- `background.js`: Listens for messages from the popup and handles notifications.
|
- `background.js`: Manages the background tasks of the extension, such as opening the popup.
|
||||||
- `notification.html` & `notification.js`: The interface and logic for the notification popup.
|
- `styles.css`: Contains the styles for the extension's UI.
|
||||||
|
- `manifest.json`: Configuration file for the Chrome extension.
|
||||||
|
- `images/icon16.png`, `images/icon48.png`, `images/icon128.png`: Icons used for the extension.
|
||||||
|
|
||||||
## API Endpoints
|
## API endpoints
|
||||||
|
|
||||||
- `POST /login`: Login with email credentials.
|
- `POST /fetch-emails`: Fetch emails from the email server.
|
||||||
- `GET /check_mail`: Check for new emails and classify them.
|
- `POST /classify-email`: Classify an email as suspicious or not.
|
||||||
- `POST /logout`: Logout from the email account.
|
- `POST /mark-safe`: Mark an email as safe.
|
||||||
- `POST /mark_safe/<email_id>`: Mark an email as safe.
|
- `POST /delete-email`: Delete an email from the email server.
|
||||||
- `POST /move_trash/<email_id>`: Move an email to the trash.
|
|
||||||
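The endpoints above could be called from a script roughly as follows. This is a sketch under assumptions: the base URL is Flask's default development address, and the JSON field names (`username`, `password`, `email_id`) are hypothetical, since the README does not document the request bodies. The requests are only built here, not sent.

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # assumption: Flask's default development address


def build_post(endpoint, payload):
    """Build (but do not send) a JSON POST request for one backend endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url=f"{BASE_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payloads; check app.py for the real field names.
fetch_req = build_post("/fetch-emails", {"username": "user@example.com", "password": "secret"})
classify_req = build_post("/classify-email", {"email_id": "42"})
mark_safe_req = build_post("/mark-safe", {"email_id": "42"})
delete_req = build_post("/delete-email", {"email_id": "42"})

print(fetch_req.get_method(), fetch_req.full_url)
```

With the backend running, a request can be sent with `request.urlopen(fetch_req)` and the response decoded with `json.load`.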
|
|
||||||
|
## Files and directories
|
||||||
|
|
||||||
|
### Backend directory
|
||||||
|
|
||||||
|
- `app.py`: Main Flask application file.
|
||||||
|
- `spam_classifier_model.pkl`: Pre-trained machine learning model for classifying emails.
|
||||||
|
- `vectorizer.pkl`: Pre-trained vectorizer for transforming email content into a format suitable for the classifier.
|
||||||
|
- `source.txt`: Contains a link from which the datasets were downloaded.
|
||||||
|
- `lingSpam.csv`, `enronSpamSubset.csv`, `completeSpamAssasin.csv`: These are the datasets used to train the model (Random Forest is the chosen model).
|
||||||
|
- `data_join.py`: Script which merges the three datasets into one CSV file called `joined_data.csv`.
|
||||||
|
- `joined_data.csv`: The combined dataset resulting from `data_join.py`.
|
||||||
|
- `ML.ipynb`: Jupyter notebook containing the data exploration, text preprocessing, vectorization, and model training and evaluation.
|
||||||
|
- `requirements.txt`: File containing the list of required Python packages.
|
||||||
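The merge step performed by `data_join.py` might look like the sketch below. It assumes each source file already has `Body` and `Label` columns (the names the notebook reads from `joined_data.csv`); the real script and the real source column names may differ, and `join_datasets` is a hypothetical helper. In-memory strings stand in for the three dataset files.

```python
import csv
import io


def join_datasets(sources, out):
    """Concatenate CSV sources, keeping only the Body and Label columns."""
    writer = csv.DictWriter(out, fieldnames=["Body", "Label"])
    writer.writeheader()
    rows_written = 0
    for src in sources:
        for row in csv.DictReader(src):
            writer.writerow({"Body": row["Body"], "Label": row["Label"]})
            rows_written += 1
    return rows_written


# In-memory stand-ins for the three dataset files.
ling = io.StringIO("Body,Label\nhello there,0\n")
enron = io.StringIO("Body,Label\nwin a prize now,1\n")
assassin = io.StringIO("Body,Label\nmeeting at noon,0\n")

joined = io.StringIO()
count = join_datasets([ling, enron, assassin], joined)
print(count)
```

For real files, each source would be opened with `open(path, newline="", encoding="utf-8")` and the output written to `joined_data.csv` the same way.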
|
|
||||||
|
### Extension directory
|
||||||
|
|
||||||
|
- `popup.html`: The main HTML file for the extension's UI.
|
||||||
|
- `popup.js`: JavaScript for handling UI interactions and communicating with the backend.
|
||||||
|
- `background.js`: JavaScript for background tasks and managing the extension's lifecycle.
|
||||||
|
- `styles.css`: CSS styles for the extension's UI.
|
||||||
|
- `manifest.json`: Configuration file for the Chrome extension.
|
||||||
|
- `images/icon16.png`, `images/icon48.png`, `images/icon128.png`: Icons used for the extension.
|
||||||
|
|
||||||
|
## How it works
|
||||||
|
|
||||||
|
1. **Login**: Users log in with their email credentials using the extension.
|
||||||
|
2. **Fetch emails**: The extension fetches emails from the server and displays them in the popup.
|
||||||
|
3. **Classify emails**: Emails are classified as suspicious or not. The classification results are stored and associated with each email.
|
||||||
|
4. **Mark as safe/Delete**: Users can mark suspicious emails as safe or delete them. The actions are reflected in the backend and the UI is updated accordingly.
|
||||||
|
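The flow above can be modelled with a small in-memory structure that keeps a classification result per email. This is only an illustrative sketch: the `Inbox` class and its method names are hypothetical and do not come from `app.py` or `popup.js`.

```python
from dataclasses import dataclass, field


@dataclass
class Inbox:
    """Tracks fetched emails and the classification stored for each one."""
    subjects: dict = field(default_factory=dict)  # email_id -> subject
    results: dict = field(default_factory=dict)   # email_id -> "suspicious" or "safe"

    def fetch(self, emails):
        # Step 2: store the emails returned by the server.
        self.subjects.update(emails)

    def classify(self, email_id, is_suspicious):
        # Step 3: associate the classification result with the email.
        self.results[email_id] = "suspicious" if is_suspicious else "safe"
        return self.results[email_id]

    def mark_safe(self, email_id):
        # Step 4a: the user overrides the classification.
        self.results[email_id] = "safe"

    def delete(self, email_id):
        # Step 4b: remove the email and its stored result.
        self.subjects.pop(email_id, None)
        self.results.pop(email_id, None)


inbox = Inbox()
inbox.fetch({"1": "Team lunch", "2": "Verify your account", "3": "You won"})
inbox.classify("2", True)
inbox.classify("3", True)
inbox.mark_safe("2")
inbox.delete("3")
print(inbox.results)
```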
5
backend/.idea/.gitignore
vendored
@ -1,3 +1,8 @@
|
|||||||
# Default ignored files
|
# Default ignored files
|
||||||
/shelf/
|
/shelf/
|
||||||
/workspace.xml
|
/workspace.xml
|
||||||
|
# Editor-based HTTP Client requests
|
||||||
|
/httpRequests/
|
||||||
|
# Datasource local storage ignored files
|
||||||
|
/dataSources/
|
||||||
|
/dataSources.local.xml
|
||||||
|
@ -1 +1 @@
|
|||||||
backend.py
|
app.py
|
8
backend/.idea/PhishGuardian.iml
Normal file
@ -0,0 +1,8 @@
|
|||||||
|
<?xml version="1.0" encoding="UTF-8"?>
|
||||||
|
<module type="PYTHON_MODULE" version="4">
|
||||||
|
<component name="NewModuleRootManager">
|
||||||
|
<content url="file://$MODULE_DIR$" />
|
||||||
|
<orderEntry type="inheritedJdk" />
|
||||||
|
<orderEntry type="sourceFolder" forTests="false" />
|
||||||
|
</component>
|
||||||
|
</module>
|
@ -1,7 +1,7 @@
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
<?xml version="1.0" encoding="UTF-8"?>
|
||||||
<project version="4">
|
<project version="4">
|
||||||
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.11" project-jdk-type="Python SDK" />
|
<component name="Black">
|
||||||
<component name="PyCharmProfessionalAdvertiser">
|
<option name="sdkName" value="Python 3.12" />
|
||||||
<option name="shown" value="true" />
|
|
||||||
</component>
|
</component>
|
||||||
|
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.12" project-jdk-type="Python SDK" />
|
||||||
</project>
|
</project>
|
@ -2,7 +2,7 @@
|
|||||||
<project version="4">
|
<project version="4">
|
||||||
<component name="ProjectModuleManager">
|
<component name="ProjectModuleManager">
|
||||||
<modules>
|
<modules>
|
||||||
<module fileurl="file://$PROJECT_DIR$/.idea/backend.iml" filepath="$PROJECT_DIR$/.idea/backend.iml" />
|
<module fileurl="file://$PROJECT_DIR$/.idea/PhishGuardian.iml" filepath="$PROJECT_DIR$/.idea/PhishGuardian.iml" />
|
||||||
</modules>
|
</modules>
|
||||||
</component>
|
</component>
|
||||||
</project>
|
</project>
|
6
backend/.idea/vcs.xml
Normal file
@ -0,0 +1,6 @@
|
|||||||
|
<?xml version="1.0" encoding="UTF-8"?>
|
||||||
|
<project version="4">
|
||||||
|
<component name="VcsDirectoryMappings">
|
||||||
|
<mapping directory="$PROJECT_DIR$/.." vcs="Git" />
|
||||||
|
</component>
|
||||||
|
</project>
|
755
backend/.ipynb_checkpoints/ML-checkpoint.ipynb
Normal file
@ -0,0 +1,755 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"metadata": {
|
||||||
|
"jupyter": {
|
||||||
|
"is_executing": true
|
||||||
|
},
|
||||||
|
"ExecuteTime": {
|
||||||
|
"start_time": "2024-06-05T20:03:23.481431Z"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"cell_type": "code",
|
||||||
|
"source": [
|
||||||
|
"%pip install pandas\n",
|
||||||
|
"%pip install matplotlib\n",
|
||||||
|
"%pip install nltk\n",
|
||||||
|
"%pip install wordcloud\n",
|
||||||
|
"%pip install scikit-learn==1.3.2\n",
|
||||||
|
"%pip install scikit-fuzzy==0.4.2\n",
|
||||||
|
"# Import packages\n",
|
||||||
|
"import nltk\n",
|
||||||
|
"nltk.download('punkt')\n",
|
||||||
|
"nltk.download('stopwords')\n",
|
||||||
|
"import pandas as pd\n",
|
||||||
|
"import matplotlib.pyplot as plt\n",
|
||||||
|
"import re\n",
|
||||||
|
"import string\n",
|
||||||
|
"from wordcloud import WordCloud\n",
|
||||||
|
"from sklearn.feature_extraction.text import CountVectorizer\n",
|
||||||
|
"from sklearn.model_selection import train_test_split\n",
|
||||||
|
"from sklearn.naive_bayes import MultinomialNB\n",
|
||||||
|
"from sklearn.ensemble import RandomForestClassifier\n",
|
||||||
|
"from sklearn.tree import DecisionTreeClassifier\n",
|
||||||
|
"from sklearn.metrics import accuracy_score, classification_report, confusion_matrix\n",
|
||||||
|
"from nltk.corpus import stopwords\n",
|
||||||
|
"from nltk.stem import PorterStemmer\n",
|
||||||
|
"from nltk.tokenize import word_tokenize\n",
|
||||||
|
"import joblib\n",
|
||||||
|
"import pickle"
|
||||||
|
],
|
||||||
|
"id": "b313cab7d5cc49c0",
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"Requirement already satisfied: pandas in c:\\users\\alicj\\appdata\\local\\programs\\python\\python312\\lib\\site-packages (2.2.2)\n",
|
||||||
|
"Requirement already satisfied: numpy>=1.26.0 in c:\\users\\alicj\\appdata\\local\\programs\\python\\python312\\lib\\site-packages (from pandas) (1.26.4)\n",
|
||||||
|
"Requirement already satisfied: python-dateutil>=2.8.2 in c:\\users\\alicj\\appdata\\roaming\\python\\python312\\site-packages (from pandas) (2.9.0.post0)\n",
|
||||||
|
"Requirement already satisfied: pytz>=2020.1 in c:\\users\\alicj\\appdata\\local\\programs\\python\\python312\\lib\\site-packages (from pandas) (2024.1)\n",
|
||||||
|
"Requirement already satisfied: tzdata>=2022.7 in c:\\users\\alicj\\appdata\\local\\programs\\python\\python312\\lib\\site-packages (from pandas) (2024.1)\n",
|
||||||
|
"Requirement already satisfied: six>=1.5 in c:\\users\\alicj\\appdata\\roaming\\python\\python312\\site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)\n",
|
||||||
|
"Note: you may need to restart the kernel to use updated packages.\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"execution_count": null
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Load the data\n",
|
||||||
|
"data_path = \"joined_data.csv\"\n",
|
||||||
|
"data = pd.read_csv(data_path)"
|
||||||
|
],
|
||||||
|
"id": "768266dbb79c5e9d"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(data.head())",
|
||||||
|
"id": "ee08266d5c30627b"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(data.info())",
|
||||||
|
"id": "1798f605e33fe5e5"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data",
|
||||||
|
"id": "b4f43d913b92485b"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Remove NaN values",
|
||||||
|
"id": "e3bf0f04a2be4e1a"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data.dropna(inplace=True)",
|
||||||
|
"id": "71a6bbebdb0dccd4"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Remove empty messages and messages containing only \"\\n\"",
|
||||||
|
"id": "b7fca25d67381cdd"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data = data[data['Body'] != '\\n']",
|
||||||
|
"id": "72d84bf6c1e7023a"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data = data[data['Body'] != 'empty']",
|
||||||
|
"id": "7c94c4dca6c4cdae"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data.reset_index(drop=True, inplace=True)",
|
||||||
|
"id": "7e6fd3f8014498f3"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data",
|
||||||
|
"id": "a0c33f82a936c59"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Check the distribution of the target labels\n",
|
||||||
|
"print(data['Label'].value_counts())"
|
||||||
|
],
|
||||||
|
"id": "19af5936d0cfeba2"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Message length analysis",
|
||||||
|
"id": "96c861e2655312cb"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"def get_len(row):\n",
|
||||||
|
"    try:\n",
|
||||||
|
"        return len(row)\n",
|
||||||
|
"    except TypeError:\n",
|
||||||
|
"        return row"
|
||||||
|
],
|
||||||
|
"id": "e1ec1ed8aa7c856d"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data['message_length'] = data['Body'].apply(get_len)",
|
||||||
|
"id": "63c023f34d234f3e"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data.sort_values(by='message_length')",
|
||||||
|
"id": "d4fd0e2dcc2bfee9"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# One message is very long: 17085626 characters",
|
||||||
|
"id": "e62112260ebc17f0"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data['message_length'].value_counts()",
|
||||||
|
"id": "7c369131e3c91ce3"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Histogram of message lengths for each category - limited to 200,000 characters so the histograms are readable\n",
|
||||||
|
"hist_data = data[data['message_length'] < 200000]\n",
|
||||||
|
"plt.figure(figsize=(10, 6))\n",
|
||||||
|
"hist_data[hist_data['Label'] == 0]['message_length'].hist(bins=100, alpha=0.6, label='Not Spam')\n",
|
||||||
|
"hist_data[hist_data['Label'] == 1]['message_length'].hist(bins=100, alpha=0.6, label='Spam')\n",
|
||||||
|
"plt.legend()\n",
|
||||||
|
"plt.xlabel('Message length')\n",
|
||||||
|
"plt.ylabel('Number of messages')\n",
|
||||||
|
"plt.title('Message length distribution')\n",
|
||||||
|
"plt.show()"
|
||||||
|
],
|
||||||
|
"id": "b6b509692fd7c541"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Narrow the range even further",
|
||||||
|
"id": "7182d6a1d6600c2"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Histogram of message lengths for each category - limited to 10,000 characters so the histograms are readable\n",
|
||||||
|
"hist_data = data[data['message_length'] < 10000]\n",
|
||||||
|
"plt.figure(figsize=(10, 6))\n",
|
||||||
|
"hist_data[hist_data['Label'] == 0]['message_length'].hist(bins=100, alpha=0.6, label='Not Spam')\n",
|
||||||
|
"hist_data[hist_data['Label'] == 1]['message_length'].hist(bins=100, alpha=0.6, label='Spam')\n",
|
||||||
|
"plt.legend()\n",
|
||||||
|
"plt.xlabel('Message length')\n",
|
||||||
|
"plt.ylabel('Number of messages')\n",
|
||||||
|
"plt.title('Message length distribution')\n",
|
||||||
|
"plt.show()"
|
||||||
|
],
|
||||||
|
"id": "962efe0bd652ecdb"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# The messages are hard to tell apart by length alone, so more advanced methods are needed.",
|
||||||
|
"id": "eaa483deb9c81942"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Text preprocessing",
|
||||||
|
"id": "6e0ee5fccf308cd1"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data",
|
||||||
|
"id": "50c0131db25859cb"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"stop_words = set(stopwords.words('english'))\n",
|
||||||
|
"ps = PorterStemmer()\n",
|
||||||
|
"\n",
|
||||||
|
"def preprocess_text(text):\n",
|
||||||
|
"    # Remove digits and punctuation, then tokenize\n",
|
||||||
|
"    text = re.sub(r'\\d+', '', text)\n",
|
||||||
|
"    text = text.translate(str.maketrans('', '', string.punctuation))\n",
|
||||||
|
"    words = word_tokenize(text)\n",
|
||||||
|
"    # Remove stopwords and apply stemming\n",
|
||||||
|
"    words = [ps.stem(word) for word in words if word.lower() not in stop_words]\n",
|
||||||
|
"    return \" \".join(words)"
|
||||||
|
],
|
||||||
|
"id": "c32c52a7b2575a3b"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Ten proces jest czasochłonny",
|
||||||
|
"id": "5953cb974349cb33"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data['processed_message'] = data['Body'].apply(preprocess_text)",
|
||||||
|
"id": "89b8cdeaa9da5c2d"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data.head()",
|
||||||
|
"id": "ccce395ac94c39a1"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "data['processed_message']",
|
||||||
|
"id": "7ce382be7bcdff2c"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Word analysis using WordCloud\n",
|
||||||
|
"spam_words = ' '.join(list(data[data['Label'] == 1]['processed_message']))\n",
|
||||||
|
"not_spam_words = ' '.join(list(data[data['Label'] == 0]['processed_message']))"
|
||||||
|
],
|
||||||
|
"id": "dc456d793b576f7"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"plt.figure(figsize=(10, 6))\n",
|
||||||
|
"wordcloud_spam = WordCloud(width=800, height=400).generate(spam_words)\n",
|
||||||
|
"plt.imshow(wordcloud_spam, interpolation='bilinear')\n",
|
||||||
|
"plt.axis('off')\n",
|
||||||
|
"plt.title('Word Cloud for Spam')\n",
|
||||||
|
"plt.show()"
|
||||||
|
],
|
||||||
|
"id": "c9d7d9c9f4ae91ed"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"plt.figure(figsize=(10, 6))\n",
|
||||||
|
"wordcloud_not_spam = WordCloud(width=800, height=400).generate(not_spam_words)\n",
|
||||||
|
"plt.imshow(wordcloud_not_spam, interpolation='bilinear')\n",
|
||||||
|
"plt.axis('off')\n",
|
||||||
|
"plt.title('Word Cloud for Not Spam')\n",
|
||||||
|
"plt.show()"
|
||||||
|
],
|
||||||
|
"id": "d954e01a1d0b3a97"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Building the classification model",
|
||||||
|
"id": "743000c7d99b8a85"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Convert the text to vectors\n",
|
||||||
|
"vectorizer = CountVectorizer()\n",
|
||||||
|
"X = vectorizer.fit_transform(data['processed_message'])\n",
|
||||||
|
"y = data['Label']"
|
||||||
|
],
|
||||||
|
"id": "7b3ba8e5b035cdc0"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Split into training and test sets\n",
|
||||||
|
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)"
|
||||||
|
],
|
||||||
|
"id": "5d66dcf506f4f399"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Train the Naive Bayes model\n",
|
||||||
|
"model_NB = MultinomialNB()\n",
|
||||||
|
"model_NB.fit(X_train, y_train)"
|
||||||
|
],
|
||||||
|
"id": "b3c2a6673c718301"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Prediction and evaluation: Naive Bayes\n",
|
||||||
|
"y_pred_NB = model_NB.predict(X_test)\n",
|
||||||
|
"accuracy_NB = accuracy_score(y_test, y_pred_NB)\n",
|
||||||
|
"classification_rep_NB = classification_report(y_test, y_pred_NB)\n",
|
||||||
|
"confusion_matrix_NB = confusion_matrix(y_test, y_pred_NB)"
|
||||||
|
],
|
||||||
|
"id": "82f18edc9161422a"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "accuracy_NB",
|
||||||
|
"id": "a629b6b89d5cdf34"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(classification_rep_NB)",
|
||||||
|
"id": "53c0cf3dc8aa02bc"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(confusion_matrix_NB)",
|
||||||
|
"id": "9b915d02828de60"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Train the Decision Tree (DT)",
|
||||||
|
"id": "160da18f95c142a0"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Default parameters\n",
|
||||||
|
"model_DT = DecisionTreeClassifier(criterion= 'gini',\n",
|
||||||
|
" max_depth= None,\n",
|
||||||
|
" min_samples_leaf= 1,\n",
|
||||||
|
" min_samples_split= 2,\n",
|
||||||
|
" splitter= 'best')\n",
|
||||||
|
"model_DT.fit(X_train, y_train)"
|
||||||
|
],
|
||||||
|
"id": "8720ed4fd0ed5c72"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Prediction and evaluation: DT\n",
|
||||||
|
"y_pred_DT = model_DT.predict(X_test)\n",
|
||||||
|
"accuracy_DT = accuracy_score(y_test, y_pred_DT)\n",
|
||||||
|
"classification_rep_DT = classification_report(y_test, y_pred_DT)\n",
|
||||||
|
"confusion_matrix_DT = confusion_matrix(y_test, y_pred_DT)"
|
||||||
|
],
|
||||||
|
"id": "7aee079d59bdd4eb"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "accuracy_DT",
|
||||||
|
"id": "57ac5a3ffe724fd5"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(classification_rep_DT)",
|
||||||
|
"id": "ed8955dc5d5cdeaf"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(confusion_matrix_DT)",
|
||||||
|
"id": "3ebfee20eb06e8cc"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Random Forest",
|
||||||
|
"id": "85d3dc4e44a2a4b3"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"model_RF = RandomForestClassifier(n_estimators= 100,\n",
|
||||||
|
" bootstrap= True,\n",
|
||||||
|
" ccp_alpha= 0.0,\n",
|
||||||
|
" criterion= 'gini',\n",
|
||||||
|
" max_depth= None,\n",
|
||||||
|
" min_samples_leaf= 1,\n",
|
||||||
|
" min_samples_split= 2,\n",
|
||||||
|
" random_state=123)\n",
|
||||||
|
"model_RF.fit(X_train, y_train)"
|
||||||
|
],
|
||||||
|
"id": "6f454235f54aa9cc"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Prediction and evaluation: RF\n",
|
||||||
|
"y_pred_RF = model_RF.predict(X_test)\n",
|
||||||
|
"accuracy_RF = accuracy_score(y_test, y_pred_RF)\n",
|
||||||
|
"classification_rep_RF = classification_report(y_test, y_pred_RF)\n",
|
||||||
|
"confusion_matrix_RF = confusion_matrix(y_test, y_pred_RF)"
|
||||||
|
],
|
||||||
|
"id": "23d68d066dc47f9"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "accuracy_RF",
|
||||||
|
"id": "55789560bb43f9b8"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(classification_rep_RF)",
|
||||||
|
"id": "d15d57c467b94bad"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(confusion_matrix_RF)",
|
||||||
|
"id": "477ea9a19dbe7389"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Random Forest turned out to be the best model - it is better to classify spam as a non-spam message than the other way around.\n",
|
||||||
|
"# That is why we choose RF rather than NB."
|
||||||
|
],
|
||||||
|
"id": "9c3308c811b9d014"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Now we train on the full data and save the model so it can be used on real data later in the\n",
|
||||||
|
"# application."
|
||||||
|
],
|
||||||
|
"id": "81f08fa14ba4daf5"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"model_RF_full = RandomForestClassifier(n_estimators= 100,\n",
|
||||||
|
" bootstrap= True,\n",
|
||||||
|
" ccp_alpha= 0.0,\n",
|
||||||
|
" criterion= 'gini',\n",
|
||||||
|
" max_depth= None,\n",
|
||||||
|
" min_samples_leaf= 1,\n",
|
||||||
|
" min_samples_split= 2,\n",
|
||||||
|
" random_state=123)"
|
||||||
|
],
|
||||||
|
"id": "7f580653f470d7af"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "model_RF_full.fit(X, y)",
|
||||||
|
"id": "f75fc9a4d4746e5a"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# RF prediction and evaluation\n",
|
||||||
|
"y_pred_RF_full = model_RF_full.predict(X)\n",
|
||||||
|
"accuracy_RF_full = accuracy_score(y, y_pred_RF_full)\n",
|
||||||
|
"classification_rep_RF_full = classification_report(y, y_pred_RF_full)\n",
|
||||||
|
"confusion_matrix_RF_full = confusion_matrix(y, y_pred_RF_full)"
|
||||||
|
],
|
||||||
|
"id": "3d77bed327ac2fa1"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "accuracy_RF_full",
|
||||||
|
"id": "a76a53da77128562"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(classification_rep_RF_full)",
|
||||||
|
"id": "9a66104fd13572f8"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "print(confusion_matrix_RF_full)",
|
||||||
|
"id": "823635f2315ecf05"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "model_RF_full",
|
||||||
|
"id": "d0136f7b9f6344c4"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": [
|
||||||
|
"# Save the model and the vectorizer\n",
|
||||||
|
"joblib.dump(model_RF_full, 'spam_classifier_model.pkl')\n",
|
||||||
|
"joblib.dump(vectorizer, 'vectorizer.pkl')"
|
||||||
|
],
|
||||||
|
"id": "e02e9031d10617f6"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# Note: the scikit-learn and joblib versions used here must match those in the application environment",
|
||||||
|
"id": "2ac5943e18571301"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "pip freeze | findstr scikit",
|
||||||
|
"id": "a238743e07978f4"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"metadata": {},
|
||||||
|
"cell_type": "code",
|
||||||
|
"outputs": [],
|
||||||
|
"execution_count": null,
|
||||||
|
"source": "# How to install?",
|
||||||
|
"id": "a64099b8c61a884"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 140,
|
||||||
|
"id": "d99c1dbe",
|
||||||
|
"metadata": {
|
||||||
|
"ExecuteTime": {
|
||||||
|
"end_time": "2024-06-05T16:57:22.800834Z",
|
||||||
|
"start_time": "2024-06-05T16:57:22.798725Z"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# For example:\n",
|
||||||
|
"# pip install scikit-learn==1.3.2"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.12.3"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 5
|
||||||
|
}
|
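Note on the evaluation cells above: the notebook compares models through `accuracy_score`, `classification_report` and `confusion_matrix`, and justifies the choice of RF by which kind of error is more costly. The following is a minimal, dependency-free sketch of those quantities, not the notebook's own code. It assumes that label `1` means spam and `0` means legitimate mail; the helper names `confusion_counts` and `summarize` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tn, fp, fn, tp) for binary labels; `positive` is assumed to mark spam."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # legitimate mail flagged as spam
        else:
            if truth == positive:
                fn += 1  # spam that slipped through
            else:
                tn += 1
    return tn, fp, fn, tp


def summarize(y_true, y_pred):
    """Accuracy plus the two error rates the model choice trades off."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

The `(tn, fp, fn, tp)` order matches what `confusion_matrix(y_true, y_pred).ravel()` returns in scikit-learn for labels `0` and `1`. Note also that the `model_RF_full` metrics are computed on the same `X, y` the model was fitted on, so they are likely optimistic; the held-out `X_test` figures are the more meaningful comparison.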
1886
backend/ML.ipynb
Normal file
File diff suppressed because one or more lines are too long
95
backend/app.py
Normal file
@ -0,0 +1,95 @@
|
|||||||
|
from flask import Flask, request, jsonify
|
||||||
|
from flask_cors import CORS
|
||||||
|
import imaplib
|
||||||
|
import email
|
||||||
|
from email.header import decode_header
|
||||||
|
import joblib
|
||||||
|
|
||||||
|
app = Flask(__name__)
|
||||||
|
CORS(app)
|
||||||
|
|
||||||
|
model = joblib.load('spam_classifier_model.pkl')
|
||||||
|
vectorizer = joblib.load('vectorizer.pkl')
|
||||||
|
|
||||||
|
|
||||||
|
@app.route('/fetch-emails', methods=['POST'])
|
||||||
|
def fetch_emails():
|
||||||
|
data = request.json
|
||||||
|
username = data['username']
|
||||||
|
password = data['password']
|
||||||
|
|
||||||
|
try:
|
||||||
|
mail = imaplib.IMAP4_SSL("outlook.office365.com")
|
||||||
|
mail.login(username, password)
|
||||||
|
mail.select("inbox")
|
||||||
|
except imaplib.IMAP4.error:
|
||||||
|
return jsonify({"error": "Login failed. Check your email and password."}), 401
|
||||||
|
|
||||||
|
status, messages = mail.search(None, "ALL")
|
||||||
|
email_ids = messages[0].split()
|
||||||
|
|
||||||
|
emails = []
|
||||||
|
|
||||||
|
for email_id in email_ids:
|
||||||
|
res, msg = mail.fetch(email_id, "(RFC822)")
|
||||||
|
for response_part in msg:
|
||||||
|
if isinstance(response_part, tuple):
|
||||||
|
msg = email.message_from_bytes(response_part[1])
|
||||||
|
subject, encoding = decode_header(msg["Subject"])[0]
|
||||||
|
if isinstance(subject, bytes):
|
||||||
|
subject = subject.decode(encoding if encoding else "utf-8")
|
||||||
|
from_ = msg.get("From")
|
||||||
|
name, email_address = email.utils.parseaddr(from_)
|
||||||
|
body = ""
|
||||||
|
if msg.is_multipart():
|
||||||
|
for part in msg.walk():
|
||||||
|
if part.get_content_type() == "text/plain" and part.get("Content-Disposition") is None:
|
||||||
|
body += part.get_payload(decode=True).decode(part.get_content_charset() or "utf-8")
|
||||||
|
else:
|
||||||
|
body = msg.get_payload(decode=True).decode(msg.get_content_charset() or "utf-8")
|
||||||
|
|
||||||
|
emails.append({"id": email_id.decode(), "from": from_, "name": name, "email_address": email_address,
|
||||||
|
"subject": subject, "body": body})
|
||||||
|
|
||||||
|
return jsonify(emails)
|
||||||
|
|
||||||
|
|
||||||
|
@app.route('/classify-email', methods=['POST'])
|
||||||
|
def classify_email():
|
||||||
|
data = request.json
|
||||||
|
email_body = data['body']
|
||||||
|
email_vectorized = vectorizer.transform([email_body])
|
||||||
|
prediction = model.predict(email_vectorized)
|
||||||
|
result = "Suspicious" if prediction[0] == 1 else "Not suspicious"
|
||||||
|
return jsonify({"result": result})
|
||||||
|
|
||||||
|
|
||||||
|
@app.route('/mark-safe', methods=['POST'])
|
||||||
|
def mark_safe():
|
||||||
|
data = request.json
|
||||||
|
email_id = data['email_id']
|
||||||
|
# Logic to mark email as safe
|
||||||
|
return jsonify({"message": f"Email {email_id} marked as safe"})
|
||||||
|
|
||||||
|
|
||||||
|
@app.route('/delete-email', methods=['POST'])
|
||||||
|
def delete_email():
|
||||||
|
data = request.json
|
||||||
|
email_id = data['email_id']
|
||||||
|
|
||||||
|
# Connect to the mail server and delete the email
|
||||||
|
username = data['username']
|
||||||
|
password = data['password']
|
||||||
|
try:
|
||||||
|
mail = imaplib.IMAP4_SSL("outlook.office365.com")
|
||||||
|
mail.login(username, password)
|
||||||
|
mail.select("inbox")
|
||||||
|
mail.store(email_id, '+FLAGS', '\\Deleted')
|
||||||
|
mail.expunge()
|
||||||
|
return jsonify({"message": f"Email {email_id} deleted"})
|
||||||
|
except imaplib.IMAP4.error:
|
||||||
|
return jsonify({"error": "Failed to delete email"}), 500
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
app.run(debug=True)
|
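Note on header decoding in the new `fetch_emails`: the subject is taken from `decode_header(msg["Subject"])[0]`, so only the first decoded part is kept, and a missing `Subject` header would make `decode_header` fail. A minimal sketch of a more tolerant helper, using only the standard library, might look like the following. The function name `decode_header_value` mirrors the helper in the removed file, but this body is an illustration rather than the project's code.

```python
from email.header import decode_header, make_header


def decode_header_value(raw):
    """Decode an RFC 2047 header into one string, joining every encoded part."""
    if raw is None:
        # Header absent from the message.
        return ""
    try:
        return str(make_header(decode_header(raw)))
    except (LookupError, UnicodeDecodeError):
        # Unknown or mislabelled charset: fall back to a lossy UTF-8 decode per part.
        parts = []
        for part, _charset in decode_header(raw):
            if isinstance(part, bytes):
                parts.append(part.decode("utf-8", errors="replace"))
            else:
                parts.append(part)
        return "".join(parts)
```

Under this assumption the call site would read `subject = decode_header_value(msg["Subject"])`. The same lossy `errors="replace"` idea could be applied when decoding message bodies, where a wrong charset currently raises.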
@ -1,202 +0,0 @@
|
|||||||
from flask import Flask, request, jsonify, session
|
|
||||||
from flask_cors import CORS
|
|
||||||
import imaplib
|
|
||||||
import email
|
|
||||||
from email.header import decode_header
|
|
||||||
from sklearn.feature_extraction.text import TfidfVectorizer
|
|
||||||
from sklearn.naive_bayes import MultinomialNB
|
|
||||||
import traceback
|
|
||||||
import json
|
|
||||||
import os
|
|
||||||
|
|
||||||
app = Flask(__name__)
|
|
||||||
CORS(app)
|
|
||||||
app.secret_key = 'your_secret_key'
|
|
||||||
|
|
||||||
SAFE_EMAILS_FILE = 'safe_emails.json'
|
|
||||||
|
|
||||||
# Load safe emails from file
|
|
||||||
def load_safe_emails():
|
|
||||||
if os.path.exists(SAFE_EMAILS_FILE):
|
|
||||||
with open(SAFE_EMAILS_FILE, 'r') as file:
|
|
||||||
return json.load(file)
|
|
||||||
return []
|
|
||||||
|
|
||||||
# Save safe emails to file
|
|
||||||
def save_safe_emails(safe_emails):
|
|
||||||
with open(SAFE_EMAILS_FILE, 'w') as file:
|
|
||||||
json.dump(safe_emails, file)
|
|
||||||
|
|
||||||
safe_emails = load_safe_emails()
|
|
||||||
|
|
||||||
# Training data
|
|
||||||
training_data = [
|
|
||||||
("Urgent account verification", "support@example.com", 1),
|
|
||||||
("Meeting agenda", "boss@example.com", 0),
|
|
||||||
("Password reset request", "no-reply@example.com", 1),
|
|
||||||
("Team lunch schedule", "hr@example.com", 0),
|
|
||||||
("Suspicious login attempt", "security@example.com", 1),
|
|
||||||
("Project update", "colleague@example.com", 0),
|
|
||||||
("Verify your email address", "verification@example.com", 1),
|
|
||||||
("Weekly report", "manager@example.com", 0),
|
|
||||||
("Your account has been suspended", "no-reply@example.com", 1),
|
|
||||||
("Company policy update", "admin@example.com", 0),
|
|
||||||
("Immediate action required", "alert@example.com", 1),
|
|
||||||
("Holiday party invitation", "events@example.com", 0),
|
|
||||||
("Important security update", "security@example.com", 1),
|
|
||||||
("Monthly performance review", "boss@example.com", 0),
|
|
||||||
("Claim your prize now", "lottery@example.com", 1),
|
|
||||||
("Training session details", "training@example.com", 0),
|
|
||||||
("Unauthorized access detected", "alert@example.com", 1),
|
|
||||||
("Office relocation notice", "admin@example.com", 0),
|
|
||||||
("Confirm your subscription", "newsletter@example.com", 1),
|
|
||||||
("Sales team meeting", "sales@example.com", 0),
|
|
||||||
("Your payment is overdue", "billing@example.com", 1),
|
|
||||||
("Client feedback", "client@example.com", 0),
|
|
||||||
("Update your account details", "update@example.com", 1),
|
|
||||||
("Social event invitation", "social@example.com", 0),
|
|
||||||
("Action required: Update password", "security@example.com", 1),
|
|
||||||
("New project assignment", "manager@example.com", 0),
|
|
||||||
("Notice of data breach", "security@example.com", 1),
|
|
||||||
("Weekly newsletter", "newsletter@example.com", 0),
|
|
||||||
("Re: Your recent purchase", "support@example.com", 1),
|
|
||||||
("Performance appraisal meeting", "hr@example.com", 0),
|
|
||||||
("Important account notice", "no-reply@example.com", 1),
|
|
||||||
("Quarterly earnings report", "finance@example.com", 0),
|
|
||||||
("Urgent: Verify your identity", "security@example.com", 1),
|
|
||||||
("Birthday celebration", "events@example.com", 0),
|
|
||||||
]
|
|
||||||
|
|
||||||
subjects = [x[0] for x in training_data]
|
|
||||||
senders = [x[1] for x in training_data]
|
|
||||||
labels = [x[2] for x in training_data]
|
|
||||||
|
|
||||||
# Combine subjects and senders
|
|
||||||
combined_features = [s + ' ' + senders[i] for i, s in enumerate(subjects)]
|
|
||||||
vectorizer = TfidfVectorizer()
|
|
||||||
X = vectorizer.fit_transform(combined_features)
|
|
||||||
y = labels
|
|
||||||
|
|
||||||
model = MultinomialNB()
|
|
||||||
model.fit(X, y)
|
|
||||||
|
|
||||||
@app.route('/login', methods=['POST'])
|
|
||||||
def login():
|
|
||||||
data = request.get_json()
|
|
||||||
username = data.get('username')
|
|
||||||
password = data.get('password')
|
|
||||||
|
|
||||||
try:
|
|
||||||
mail = imaplib.IMAP4_SSL('imap.wp.pl')
|
|
||||||
mail.login(username, password)
|
|
||||||
session['username'] = username
|
|
||||||
session['password'] = password
|
|
||||||
return jsonify({'message': 'Login successful'}), 200
|
|
||||||
except imaplib.IMAP4.error as e:
|
|
||||||
print(f'Login failed: {e}')
|
|
||||||
return jsonify({'message': 'Login failed'}), 401
|
|
||||||
except Exception as e:
|
|
||||||
print('Error during login:', e)
|
|
||||||
traceback.print_exc()
|
|
||||||
return jsonify({'message': 'Internal server error'}), 500
|
|
||||||
|
|
||||||
@app.route('/check_mail', methods=['GET'])
|
|
||||||
def check_mail():
|
|
||||||
if 'username' not in session or 'password' not in session:
|
|
||||||
return jsonify({'message': 'Not logged in'}), 401
|
|
||||||
|
|
||||||
username = session['username']
|
|
||||||
password = session['password']
|
|
||||||
|
|
||||||
try:
|
|
||||||
mail = imaplib.IMAP4_SSL('imap.wp.pl')
|
|
||||||
mail.login(username, password)
|
|
||||||
mail.select('INBOX')
|
|
||||||
result, data = mail.search(None, 'ALL')
|
|
||||||
email_ids = data[0].split()[-10:] # Fetch the last 10 emails
|
|
||||||
emails = []
|
|
||||||
|
|
||||||
for e_id in email_ids:
|
|
||||||
result, email_data = mail.fetch(e_id, '(RFC822)')
|
|
||||||
raw_email = email_data[0][1]
|
|
||||||
msg = email.message_from_bytes(raw_email)
|
|
||||||
subject = decode_header_value(msg['subject'])
|
|
||||||
sender = decode_header_value(msg['from'])
|
|
||||||
is_phishing = detect_phishing(subject, sender, e_id.decode())
|
|
||||||
emails.append({'subject': subject, 'from': sender, 'is_phishing': is_phishing, 'id': e_id.decode()})
|
|
||||||
|
|
||||||
return jsonify(emails), 200
|
|
||||||
except Exception as e:
|
|
||||||
print('Error during email check:', e)
|
|
||||||
traceback.print_exc()
|
|
||||||
return jsonify({'message': 'Internal server error'}), 500
|
|
||||||
|
|
||||||
@app.route('/logout', methods=['POST'])
|
|
||||||
def logout():
|
|
||||||
try:
|
|
||||||
session.pop('username', None)
|
|
||||||
session.pop('password', None)
|
|
||||||
return jsonify({'message': 'Logged out'}), 200
|
|
||||||
except Exception as e:
|
|
||||||
print('Error during logout:', e)
|
|
||||||
traceback.print_exc()
|
|
||||||
return jsonify({'message': 'Internal server error'}), 500
|
|
||||||
|
|
||||||
@app.route('/mark_safe/<email_id>', methods=['POST'])
|
|
||||||
def mark_safe(email_id):
|
|
||||||
global safe_emails
|
|
||||||
safe_emails.append(email_id)
|
|
||||||
save_safe_emails(safe_emails)
|
|
||||||
print(f'Email {email_id} marked as safe')
|
|
||||||
return jsonify({"message": f"Email {email_id} marked as safe"}), 200
|
|
||||||
|
|
||||||
@app.route('/move_trash/<email_id>', methods=['POST'])
|
|
||||||
def move_trash(email_id):
|
|
||||||
if 'username' not in session or 'password' not in session:
|
|
||||||
return jsonify({'message': 'Not logged in'}), 401
|
|
||||||
|
|
||||||
username = session['username']
|
|
||||||
password = session['password']
|
|
||||||
|
|
||||||
try:
|
|
||||||
mail = imaplib.IMAP4_SSL('imap.wp.pl')
|
|
||||||
mail.login(username, password)
|
|
||||||
mail.select('INBOX')
|
|
||||||
print(f'Trying to move email ID {email_id} to Trash') # Logging email ID
|
|
||||||
mail.store(email_id, '+FLAGS', '\\Deleted')
|
|
||||||
mail.expunge()
|
|
||||||
print(f'Email {email_id} deleted') # Logging deletion
|
|
||||||
return jsonify({"message": f"Email {email_id} deleted"}), 200
|
|
||||||
except Exception as e:
|
|
||||||
print(f'Error during moving email to trash: {e}')
|
|
||||||
traceback.print_exc()
|
|
||||||
return jsonify({'message': 'Internal server error'}), 500
|
|
||||||
|
|
||||||
def decode_header_value(value):
|
|
||||||
parts = decode_header(value)
|
|
||||||
header_parts = []
|
|
||||||
for part, encoding in parts:
|
|
||||||
if isinstance(part, bytes):
|
|
||||||
try:
|
|
||||||
if encoding:
|
|
||||||
header_parts.append(part.decode(encoding))
|
|
||||||
else:
|
|
||||||
header_parts.append(part.decode('utf-8'))
|
|
||||||
except (LookupError, UnicodeDecodeError):
|
|
||||||
header_parts.append(part.decode('utf-8', errors='ignore'))
|
|
||||||
else:
|
|
||||||
header_parts.append(part)
|
|
||||||
return ''.join(header_parts)
|
|
||||||
|
|
||||||
def detect_phishing(subject, sender, email_id):
|
|
||||||
if email_id in safe_emails:
|
|
||||||
return False # If email is marked as safe, it's not phishing
|
|
||||||
|
|
||||||
phishing_keywords = ['urgent', 'verify', 'account', 'suspend', 'login']
|
|
||||||
phishing_senders = ['support@example.com', 'no-reply@example.com']
|
|
||||||
if any(keyword in subject.lower() for keyword in phishing_keywords) or sender.lower() in phishing_senders:
|
|
||||||
return True
|
|
||||||
return False
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
app.run(port=5000)
|
|
183469
backend/completeSpamAssassin.csv
Normal file
File diff suppressed because one or more lines are too long
20
backend/data_join.py
Normal file
@ -0,0 +1,20 @@
|
|||||||
|
import pandas as pd
|
||||||
|
|
||||||
|
df_1 = pd.read_csv("completeSpamAssassin.csv")
|
||||||
|
df_2 = pd.read_csv("enronSpamSubset.csv")
|
||||||
|
df_3 = pd.read_csv("lingSpam.csv")
|
||||||
|
|
||||||
|
df = pd.concat([df_1, df_2, df_3])
|
||||||
|
df = df[['Body', 'Label']]
|
||||||
|
|
||||||
|
df = df.sample(len(df))
|
||||||
|
|
||||||
|
df.reset_index(drop=True, inplace=True)
|
||||||
|
|
||||||
|
df.to_csv('joined_data.csv')
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
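Note on `data_join.py`: the script shuffles with `df.sample(len(df))` without a fixed seed, so every run produces a different row order, and `df.to_csv('joined_data.csv')` writes the index as an extra column by default. The sketch below shows the same join, column selection and shuffle made reproducible. It deliberately uses the standard `csv` and `random` modules instead of pandas so it runs without dependencies; the function names and the seed value are assumptions.

```python
import csv
import random


def join_and_shuffle(sources, columns=("Body", "Label"), seed=123):
    """Concatenate CSV sources, keep only `columns`, and shuffle reproducibly."""
    rows = []
    for source in sources:
        for record in csv.DictReader(source):
            rows.append({name: record[name] for name in columns})
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(rows)
    return rows


def write_rows(rows, target, columns=("Body", "Label")):
    """Write the rows with a header and no index column."""
    writer = csv.DictWriter(target, fieldnames=list(columns))
    writer.writeheader()
    writer.writerows(rows)
```

In pandas itself, the equivalent adjustments would be `df.sample(frac=1, random_state=...)` and `df.to_csv(..., index=False)`.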
252259
backend/enronSpamSubset.csv
Normal file
File diff suppressed because one or more lines are too long
467279
backend/joined_data.csv
Normal file
File diff suppressed because one or more lines are too long
31553
backend/lingSpam.csv
Normal file
File diff suppressed because one or more lines are too long
4
backend/requirements.txt
Normal file
@ -0,0 +1,4 @@
|
|||||||
|
Flask==3.0.3
|
||||||
|
Flask-Cors==4.0.1
|
||||||
|
scikit-learn==1.3.2
|
||||||
|
joblib==1.4.2
|
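Note on the pinned versions: the notebook warns that the scikit-learn and joblib versions must match between the training environment and the application, because the pickled model and vectorizer are loaded with `joblib.load`. A minimal sketch of a startup check against `requirements.txt` could look like this. It relies on the standard `importlib.metadata` module; the helper names `parse_pins` and `find_mismatches` are hypothetical and not part of the commit.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, lookup=metadata.version):
    """Return {name: (pinned, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

One possible use is to read `requirements.txt` before `joblib.load(...)` and log or abort when `find_mismatches(parse_pins(text))` is non-empty.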
@ -1 +0,0 @@
|
|||||||
["3", "13", "14", "15", "16"]
|
|
BIN
backend/spam_classifier_model.pkl
Normal file
Binary file not shown.
BIN
backend/vectorizer.pkl
Normal file
Binary file not shown.
@ -1,51 +1,8 @@
|
|||||||
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
|
chrome.action.onClicked.addListener((tab) => {
|
||||||
if (message.type === 'phishing-detected') {
|
chrome.windows.create({
|
||||||
const emails = message.emails;
|
url: chrome.runtime.getURL("popup.html"),
|
||||||
let notificationTitle = 'You are safe!';
|
type: "popup",
|
||||||
let notificationMessage = 'No phishing emails detected.';
|
width: 850,
|
||||||
|
height: 700
|
||||||
const phishingEmails = emails.filter(email => email.is_phishing);
|
});
|
||||||
if (phishingEmails.length > 0) {
|
|
||||||
notificationTitle = 'You are in danger!';
|
|
||||||
notificationMessage = `Email from ${phishingEmails[0].from} titled "${phishingEmails[0].subject}" has been identified as phishing.`;
|
|
||||||
|
|
||||||
chrome.windows.create({
|
|
||||||
url: 'notification.html',
|
|
||||||
type: 'popup',
|
|
||||||
width: 300,
|
|
||||||
height: 200
|
|
||||||
}, function(window) {
|
|
||||||
chrome.storage.local.set({
|
|
||||||
notificationTitle: notificationTitle,
|
|
||||||
notificationMessage: notificationMessage,
|
|
||||||
emailId: phishingEmails[0].id
|
|
||||||
});
|
|
||||||
});
|
|
||||||
} else {
|
|
||||||
chrome.windows.create({
|
|
||||||
url: 'notification.html',
|
|
||||||
type: 'popup',
|
|
||||||
width: 300,
|
|
||||||
height: 200
|
|
||||||
}, function(window) {
|
|
||||||
chrome.storage.local.set({
|
|
||||||
notificationTitle: notificationTitle,
|
|
||||||
notificationMessage: notificationMessage,
|
|
||||||
emailId: null
|
|
||||||
});
|
|
||||||
});
|
|
||||||
}
|
|
||||||
} else if (message.type === 'mark-safe') {
|
|
||||||
fetch(`http://localhost:5000/mark_safe/${message.emailId}`, {
|
|
||||||
method: 'POST'
|
|
||||||
}).then(response => response.json())
|
|
||||||
.then(data => console.log(data))
|
|
||||||
.catch(error => console.error('Error:', error));
|
|
||||||
} else if (message.type === 'move-trash') {
|
|
||||||
fetch(`http://localhost:5000/move_trash/${message.emailId}`, {
|
|
||||||
method: 'POST'
|
|
||||||
}).then(response => response.json())
|
|
||||||
.then(data => console.log(data))
|
|
||||||
.catch(error => console.error('Error:', error));
|
|
||||||
}
|
|
||||||
});
|
});
|
||||||
|
@ -2,27 +2,16 @@
|
|||||||
"manifest_version": 3,
|
"manifest_version": 3,
|
||||||
"name": "PhishGuardian",
|
"name": "PhishGuardian",
|
||||||
"version": "1.0",
|
"version": "1.0",
|
||||||
"permissions": ["storage", "activeTab", "scripting", "notifications"],
|
"description": "Classify emails as spam or not spam.",
|
||||||
"host_permissions": [
|
"permissions": ["storage", "activeTab", "scripting", "windows"],
|
||||||
"http://localhost:5000/*"
|
|
||||||
],
|
|
||||||
"background": {
|
"background": {
|
||||||
"service_worker": "background.js"
|
"service_worker": "background.js"
|
||||||
},
|
},
|
||||||
"action": {
|
"action": {
|
||||||
"default_popup": "popup.html",
|
|
||||||
"default_icon": {
|
"default_icon": {
|
||||||
"16": "icon16.png",
|
"16": "images/icon16.png",
|
||||||
"48": "icon48.png",
|
"48": "images/icon48.png",
|
||||||
"128": "icon128.png"
|
"128": "images/icon128.png"
|
||||||
}
|
}
|
||||||
},
|
|
||||||
"icons": {
|
|
||||||
"16": "icon16.png",
|
|
||||||
"48": "icon48.png",
|
|
||||||
"128": "icon128.png"
|
|
||||||
},
|
|
||||||
"content_security_policy": {
|
|
||||||
"extension_pages": "script-src 'self'; object-src 'self'"
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -1,17 +0,0 @@
|
|||||||
<!DOCTYPE html>
|
|
||||||
<html lang="en">
|
|
||||||
<head>
|
|
||||||
<meta charset="UTF-8">
|
|
||||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
|
||||||
<title>Notification</title>
|
|
||||||
<script src="notification.js" defer></script>
|
|
||||||
</head>
|
|
||||||
<body>
|
|
||||||
<div id="notification-content">
|
|
||||||
<p id="notification-message"></p>
|
|
||||||
<button id="mark-safe">Mark as Safe</button>
|
|
||||||
<button id="move-trash">Move to Trash</button>
|
|
||||||
<button id="close">Close</button>
|
|
||||||
</div>
|
|
||||||
</body>
|
|
||||||
</html>
|
|
@ -1,47 +0,0 @@
|
|||||||
document.addEventListener('DOMContentLoaded', function() {
|
|
||||||
const markSafeButton = document.getElementById('mark-safe');
|
|
||||||
const moveTrashButton = document.getElementById('move-trash');
|
|
||||||
const closeButton = document.getElementById('close');
|
|
||||||
const notificationMessage = document.getElementById('notification-message');
|
|
||||||
|
|
||||||
chrome.storage.local.get(['notificationTitle', 'notificationMessage', 'emailId'], function(items) {
|
|
||||||
notificationMessage.textContent = items.notificationMessage;
|
|
||||||
|
|
||||||
if (items.emailId) {
|
|
||||||
// Show action buttons if there's a phishing email
|
|
||||||
markSafeButton.style.display = 'inline-block';
|
|
||||||
moveTrashButton.style.display = 'inline-block';
|
|
||||||
} else {
|
|
||||||
// Hide action buttons if no phishing emails
|
|
||||||
markSafeButton.style.display = 'none';
|
|
||||||
moveTrashButton.style.display = 'none';
|
|
||||||
}
|
|
||||||
|
|
||||||
markSafeButton.addEventListener('click', function() {
|
|
||||||
console.log('Mark Safe button clicked for emailId:', items.emailId);
|
|
||||||
chrome.runtime.sendMessage({
|
|
||||||
type: 'mark-safe',
|
|
||||||
emailId: items.emailId
|
|
||||||
}, function(response) {
|
|
||||||
console.log('Mark safe response:', response);
|
|
||||||
});
|
|
||||||
setTimeout(() => window.close(), 50); // Wait for 50ms before closing the window
|
|
||||||
});
|
|
||||||
|
|
||||||
moveTrashButton.addEventListener('click', function() {
|
|
||||||
console.log('Move to Trash button clicked for emailId:', items.emailId);
|
|
||||||
chrome.runtime.sendMessage({
|
|
||||||
type: 'move-trash',
|
|
||||||
emailId: items.emailId
|
|
||||||
}, function(response) {
|
|
||||||
console.log('Move to trash response:', response);
|
|
||||||
});
|
|
||||||
setTimeout(() => window.close(), 50); // Wait for 50ms before closing the window
|
|
||||||
});
|
|
||||||
|
|
||||||
closeButton.addEventListener('click', function() {
|
|
||||||
console.log('Close button clicked');
|
|
||||||
window.close();
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
@ -1,21 +1,40 @@
|
|||||||
<!DOCTYPE html>
|
<!DOCTYPE html>
|
||||||
<html lang="en">
|
<html>
|
||||||
<head>
|
<head>
|
||||||
<meta charset="UTF-8">
|
<title>PhishGuardian</title>
|
||||||
<title>Phishing Email Detector</title>
|
<link rel="stylesheet" href="styles.css">
|
||||||
<script src="popup.js" defer></script>
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<div id="login-section">
|
<h1>PhishGuardian</h1>
|
||||||
<h2>Login</h2>
|
<div id="loginSection">
|
||||||
<input type="text" id="username" placeholder="Username">
|
<div>
|
||||||
<input type="password" id="password" placeholder="Password">
|
<label>Email:</label>
|
||||||
<button id="login">Login</button>
|
<input type="text" id="email">
|
||||||
|
</div>
|
||||||
|
<div>
|
||||||
|
<label>Password:</label>
|
||||||
|
<input type="password" id="password">
|
||||||
|
</div>
|
||||||
|
<button id="loginButton">Log In</button>
|
||||||
</div>
|
</div>
|
||||||
<div id="control-section" style="display: none;">
|
<div id="loggedInSection" style="display:none;">
|
||||||
<button id="check-mail">Check Mail</button>
|
<div id="loggedInInfo">
|
||||||
|
Logged in as: <span id="loggedInEmail"></span>
|
||||||
|
</div>
|
||||||
<button id="logout">Logout</button>
|
<button id="logout">Logout</button>
|
||||||
|
<button id="fetchEmails">Fetch Emails</button>
|
||||||
</div>
|
</div>
|
||||||
<div id="results"></div>
|
<ul id="emailList"></ul>
|
||||||
|
<div>
|
||||||
|
<h2>Email Body</h2>
|
||||||
|
<textarea id="emailBody" rows="10" cols="50"></textarea>
|
||||||
|
<button id="classifyEmail">Classify Email</button>
|
||||||
|
</div>
|
||||||
|
<div id="result"></div>
|
||||||
|
<div id="actions" style="display:none;">
|
||||||
|
<button id="markSafe">Mark as Safe</button>
|
||||||
|
<button id="deleteEmail">Delete Email</button>
|
||||||
|
</div>
|
||||||
|
<script src="popup.js"></script>
|
||||||
</body>
|
</body>
|
||||||
</html>
|
</html>
|
||||||
|
@ -1,92 +1,254 @@
|
|||||||
document.addEventListener('DOMContentLoaded', function() {
|
let classificationResults = {}; // Dictionary to store classification results
|
||||||
const loginButton = document.getElementById('login');
|
|
||||||
const checkMailButton = document.getElementById('check-mail');
|
|
||||||
const logoutButton = document.getElementById('logout');
|
|
||||||
const loginSection = document.getElementById('login-section');
|
|
||||||
const controlSection = document.getElementById('control-section');
|
|
||||||
|
|
||||||
// Check if already logged in
|
document.addEventListener('DOMContentLoaded', () => {
|
||||||
chrome.storage.local.get(['username', 'password'], function(items) {
|
loadClassificationResults(); // Load classification results on start
|
||||||
if (items.username && items.password) {
|
checkLoginState();
|
||||||
loginSection.style.display = 'none';
|
|
||||||
controlSection.style.display = 'block';
|
document.getElementById('loginButton').addEventListener('click', () => {
|
||||||
} else {
|
const email = document.getElementById('email').value;
|
||||||
loginSection.style.display = 'block';
|
const password = document.getElementById('password').value;
|
||||||
controlSection.style.display = 'none';
|
|
||||||
|
login(email, password);
|
||||||
|
});
|
||||||
|
|
||||||
|
document.getElementById('fetchEmails').addEventListener('click', () => {
|
||||||
|
fetchEmails();
|
||||||
|
});
|
||||||
|
|
||||||
|
document.getElementById('classifyEmail').addEventListener('click', () => {
|
||||||
|
const emailBody = document.getElementById('emailBody').value;
|
||||||
|
const emailId = getCurrentEmailId();
|
||||||
|
|
||||||
|
fetch('http://localhost:5000/classify-email', {
|
||||||
|
method: 'POST',
|
||||||
|
headers: {
|
||||||
|
'Content-Type': 'application/json'
|
||||||
|
},
|
||||||
|
body: JSON.stringify({ body: emailBody })
|
||||||
|
})
|
||||||
|
.then(response => response.json())
|
||||||
|
.then(data => {
|
||||||
|
classificationResults[emailId] = data.result; // Store the result
|
||||||
|
saveClassificationResults(); // Save results to chrome storage
|
||||||
|
updateClassificationResult(emailId); // Update the displayed result
|
||||||
|
if (data.result === "Suspicious") {
|
||||||
|
showActions();
|
||||||
|
} else {
|
||||||
|
hideActions();
|
||||||
|
}
|
||||||
|
})
|
||||||
|
.catch(error => console.error('Error classifying email:', error));
|
||||||
|
});
|
||||||
|
|
||||||
|
document.getElementById('logout').addEventListener('click', () => {
|
||||||
|
chrome.storage.local.remove(['email', 'password', 'classificationResults'], () => {
|
||||||
|
console.log('Logged out');
|
||||||
|
clearEmailList(); // Clear the email list
|
||||||
|
clearCredentials(); // Clear the credentials
|
||||||
|
showLoginSection();
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
document.getElementById('markSafe').addEventListener('click', () => {
|
||||||
|
const emailId = getCurrentEmailId();
|
||||||
|
if (emailId) {
|
||||||
|
markEmailAsSafe(emailId);
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
loginButton.addEventListener('click', function() {
|
document.getElementById('deleteEmail').addEventListener('click', () => {
|
||||||
const username = document.getElementById('username').value;
|
const emailId = getCurrentEmailId();
|
||||||
const password = document.getElementById('password').value;
|
if (emailId) {
|
||||||
|
deleteEmail(emailId);
|
||||||
fetch('http://localhost:5000/login', {
|
}
|
||||||
method: 'POST',
|
|
||||||
headers: {
|
|
||||||
'Content-Type': 'application/json'
|
|
||||||
},
|
|
||||||
body: JSON.stringify({ username, password }),
|
|
||||||
credentials: 'include'
|
|
||||||
})
|
|
||||||
.then(response => response.json())
|
|
||||||
.then(data => {
|
|
||||||
alert(data.message);
|
|
||||||
if (data.message === 'Login successful') {
|
|
||||||
chrome.storage.local.set({ 'username': username, 'password': password }, function() {
|
|
||||||
loginSection.style.display = 'none';
|
|
||||||
controlSection.style.display = 'block';
|
|
||||||
});
|
|
||||||
}
|
|
||||||
})
|
|
||||||
.catch(error => {
|
|
||||||
console.error('Error during login request:', error);
|
|
||||||
alert('An error occurred while logging in');
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
checkMailButton.addEventListener('click', function() {
|
|
||||||
fetch('http://localhost:5000/check_mail', {
|
|
||||||
method: 'GET',
|
|
||||||
headers: {
|
|
||||||
'Content-Type': 'application/json'
|
|
||||||
},
|
|
||||||
credentials: 'include'
|
|
||||||
})
|
|
||||||
.then(response => response.json())
|
|
||||||
.then(data => {
|
|
||||||
console.log('Check mail response:', data);
|
|
||||||
chrome.runtime.sendMessage({
|
|
||||||
type: 'phishing-detected',
|
|
||||||
emails: data
|
|
||||||
});
|
|
||||||
})
|
|
||||||
.catch(error => {
|
|
||||||
console.error('Error during check mail request:', error);
|
|
||||||
alert('An error occurred while checking mail');
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
logoutButton.addEventListener('click', function() {
|
|
||||||
fetch('http://localhost:5000/logout', {
|
|
||||||
method: 'POST',
|
|
||||||
headers: {
|
|
||||||
'Content-Type': 'application/json'
|
|
||||||
},
|
|
||||||
credentials: 'include'
|
|
||||||
})
|
|
||||||
.then(response => response.json())
|
|
||||||
.then(data => {
|
|
||||||
alert(data.message);
|
|
||||||
if (data.message === 'Logged out') {
|
|
||||||
chrome.storage.local.remove(['username', 'password'], function() {
|
|
||||||
loginSection.style.display = 'block';
|
|
||||||
controlSection.style.display = 'none';
|
|
||||||
});
|
|
||||||
}
|
|
||||||
})
|
|
||||||
.catch(error => {
|
|
||||||
console.error('Error during logout request:', error);
|
|
||||||
alert('An error occurred while logging out');
|
|
||||||
});
|
|
||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
|
function login(email, password) {
    chrome.storage.local.set({ email: email, password: password }, () => {
        console.log('Credentials saved');
        showLoggedInSection(email);
    });
}

function fetchEmails() {
    chrome.storage.local.get(['email', 'password'], (result) => {
        if (result.email && result.password) {
            fetch('http://localhost:5000/fetch-emails', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({ username: result.email, password: result.password })
            })
            .then(response => {
                if (!response.ok) {
                    throw new Error('Login failed. Check your email and password.');
                }
                return response.json();
            })
            .then(data => {
                updateEmailList(data);
            })
            .catch(error => console.error('Error fetching emails:', error));
        } else {
            alert("Login credentials not found.");
        }
    });
}

function checkLoginState() {
    chrome.storage.local.get(['email', 'password'], (result) => {
        if (result.email && result.password) {
            showLoggedInSection(result.email);
        } else {
            showLoginSection();
        }
    });
}

function showLoginSection() {
    document.getElementById('loginSection').style.display = 'block';
    document.getElementById('loggedInSection').style.display = 'none';
    document.getElementById('loggedInEmail').textContent = '';
    document.getElementById('actions').style.display = 'none';
    clearEmailList(); // Clear the email list when showing the login section
}

function showLoggedInSection(email) {
    document.getElementById('loginSection').style.display = 'none';
    document.getElementById('loggedInSection').style.display = 'block';
    document.getElementById('loggedInEmail').textContent = email;
}

function showActions() {
    document.getElementById('actions').style.display = 'block';
}

function hideActions() {
    document.getElementById('actions').style.display = 'none';
}

function getCurrentEmailId() {
    return document.getElementById('emailBody').dataset.emailId;
}

function markEmailAsSafe(emailId) {
    fetch('http://localhost:5000/mark-safe', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({ email_id: emailId })
    })
    .then(response => {
        // Do not record the email as safe locally if the backend rejected the request
        if (!response.ok) {
            throw new Error('Failed to mark email as safe');
        }
        return response.json();
    })
    .then(data => {
        classificationResults[emailId] = "Not suspicious (marked safe by user)"; // Update the classification result
        saveClassificationResults(); // Save results to chrome storage
        updateClassificationResult(emailId); // Update the displayed result
        hideActions(); // Hide actions since it's now marked as safe
    })
    .catch(error => console.error('Error marking email as safe:', error));
}

function deleteEmail(emailId) {
    chrome.storage.local.get(['email', 'password'], (result) => {
        if (result.email && result.password) {
            fetch('http://localhost:5000/delete-email', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({ email_id: emailId, username: result.email, password: result.password })
            })
            .then(response => response.json())
            .then(data => {
                alert(data.message);
                removeEmailFromList(emailId); // Update the list and reindex
            })
            .catch(error => console.error('Error deleting email:', error));
        } else {
            alert("Login credentials not found.");
        }
    });
}

function removeEmailFromList(emailId) {
    const emailList = document.getElementById('emailList');
    const emailItems = emailList.getElementsByTagName('li');
    for (let i = 0; i < emailItems.length; i++) {
        if (emailItems[i].dataset.emailId === emailId) {
            emailList.removeChild(emailItems[i]);
            break;
        }
    }
    // Reindex the remaining emails
    reindexEmailList();
    document.getElementById('emailBody').value = ''; // Clear email body display
    document.getElementById('result').textContent = ''; // Clear classification result message
    hideActions(); // Hide actions
}

function reindexEmailList() {
    const emailList = document.getElementById('emailList');
    const emailItems = emailList.getElementsByTagName('li');
    for (let i = 0; i < emailItems.length; i++) {
        // Replace only the leading "N: " prefix so subjects containing ': ' are not truncated
        emailItems[i].textContent = emailItems[i].textContent.replace(/^\d+: /, `${i + 1}: `);
    }
}

function updateClassificationResult(emailId) {
    const resultDiv = document.getElementById('result');
    if (classificationResults[emailId]) {
        resultDiv.textContent = `This email is: ${classificationResults[emailId]}`;
        if (classificationResults[emailId] === "Suspicious") {
            showActions();
        } else {
            hideActions();
        }
    } else {
        resultDiv.textContent = ''; // Clear message if not classified
        hideActions();
    }
}

function saveClassificationResults() {
    chrome.storage.local.set({ classificationResults: classificationResults }, () => {
        console.log('Classification results saved');
    });
}

function loadClassificationResults() {
    chrome.storage.local.get(['classificationResults'], (result) => {
        if (result.classificationResults) {
            classificationResults = result.classificationResults;
        }
    });
}

function clearEmailList() {
    document.getElementById('emailList').innerHTML = ''; // Clear the email list
    document.getElementById('emailBody').value = ''; // Clear email body display
    document.getElementById('result').textContent = ''; // Clear classification result message
}

function clearCredentials() {
    document.getElementById('email').value = ''; // Clear email input field
    document.getElementById('password').value = ''; // Clear password input field
}

function updateEmailList(emails) {
    const emailList = document.getElementById('emailList');
    emailList.innerHTML = '';
    emails.forEach((email, index) => {
        const li = document.createElement('li');
        li.textContent = `${index + 1}: ${email.subject} (from ${email.name} <${email.email_address}>)`;
        li.dataset.emailId = email.id; // Assuming email objects have an id property
        li.addEventListener('click', () => {
            document.getElementById('emailBody').value = email.body;
            document.getElementById('emailBody').dataset.emailId = email.id; // Store the email ID
            updateClassificationResult(email.id); // Update the displayed result
        });
        emailList.appendChild(li);
    });
}

55
extension/styles.css
Normal file
@ -0,0 +1,55 @@
body {
    font-family: Arial, sans-serif;
}

h1 {
    font-size: 20px;
}

div {
    margin-bottom: 10px;
}

label {
    display: block;
    margin-bottom: 5px;
}

input, textarea {
    width: 100%;
    padding: 8px;
    box-sizing: border-box;
}

button {
    padding: 10px 20px;
    background-color: #007bff;
    color: white;
    border: none;
    cursor: pointer;
}

button:hover {
    background-color: #0056b3;
}

ul {
    list-style: none;
    padding: 0;
}

li {
    padding: 10px;
    border: 1px solid #ccc;
    margin-bottom: 5px;
    cursor: pointer;
}

li:hover {
    background-color: #f0f0f0;
}

#result {
    margin-top: 10px;
    font-weight: bold;
}