Update README.md

s452624 2024-06-09 18:17:24 +03:00
parent 6e55acd064
commit fd65b2d964

README.md

@@ -1,12 +1,13 @@
# PhishGuardian # PhishGuardian
PhishGuardian is a browser extension designed to detect and manage suspicious emails. It uses machine learning to identify suspicious emails and provides options to mark them as safe or move them to the trash. The extension is built using Flask for the backend and JavaScript for the frontend. PhishGuardian is a browser extension designed to detect and manage phishing emails. It uses machine learning to identify potential phishing emails and provides options to mark them as safe or move them to the trash. The extension is built using Flask for the backend and JavaScript for the frontend.
## Features ## Features
- Detects suspicious emails using machine learning - Detects phishing emails using machine learning
- Allows users to mark emails as safe - Allows users to mark emails as safe
- Allows users to move suspicious emails to the trash - Allows users to move phishing emails to the trash
- Provides notifications for detected phishing emails
## Installation ## Installation
@@ -17,90 +18,105 @@ PhishGuardian is a browser extension designed to detect and manage suspicious em
- scikit-learn - scikit-learn
- Chrome browser - Chrome browser
### Backend setup ### Backend Setup
1. Clone the repository: 1. Clone the repository:
```sh ```sh
git clone https://git.wmi.amu.edu.pl/s452649/PhishGuardian.git git clone https://your-repository-url
cd PhishGuardian/backend cd PhishGuardian/backend
``` ```
2. Install the required Python packages: 2. Install the required Python packages:
```sh ```sh
pip install -r requirements.txt pip install -r requirements.txt
``` ```
3. Run the Flask backend: 3. Run the Flask backend:
```sh ```sh
python app.py python backend.py
``` ```
### Extension setup ### Extension Setup
1. Open Chrome and go to `chrome://extensions/` 1. Open Chrome and go to `chrome://extensions/`
2. Enable "Developer mode" by toggling the switch in the top right corner. 2. Enable "Developer mode" by toggling the switch in the top right corner.
3. Click on "Load unpacked" and select the `extension` directory within the `PhishGuardian` directory. 3. Click on "Load unpacked" and select the `PhishGuardian` directory.
### Setting up Gmail API
To integrate the Gmail API, follow these steps:
1. Visit the Google Cloud Console at https://console.cloud.google.com/apis/credentials.
2. Click on "Create Project" and fill in the required fields.
3. Navigate to the library, find the Gmail API, and enable it.
4. Go to the Credentials tab and click on "Create Credentials". Then, select "OAuth client ID", choose "Web application", and fill in the following fields:
- redirect_uris: ["http://localhost:5000/oauth2callback"]
- javascript_origins: ["http://localhost:5000"]
5. A popup window will appear showing the client_id, which must be entered in manifest.json. The popup also has a button to download a JSON file. Place this file in the backend folder and name it "client_secret.json", or change the expected file name in backend.py.
6. These setup instructions will assist you in integrating and using the Gmail API effectively in your project.
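A quick way to confirm that the downloaded file matches step 4 is to read it back and compare the fields. The sketch below is only an illustration: the helper name `check_client_secret` is hypothetical, and it assumes the file uses the layout Google produces for "Web application" clients, with settings nested under a top-level `web` key.

```python
import json
from pathlib import Path

# Values taken from step 4 of the setup instructions.
EXPECTED_REDIRECT_URI = "http://localhost:5000/oauth2callback"
EXPECTED_ORIGIN = "http://localhost:5000"


def check_client_secret(path):
    """Return a list of problems found in the OAuth client file (empty if none).

    Hypothetical helper, not part of backend.py.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: "Web application" client files keep their settings under "web".
    web = data.get("web")
    if web is None:
        return ['missing top-level "web" section']
    problems = []
    if not web.get("client_id"):
        problems.append("client_id is missing")
    if EXPECTED_REDIRECT_URI not in web.get("redirect_uris", []):
        problems.append(f"redirect_uris lacks {EXPECTED_REDIRECT_URI}")
    if EXPECTED_ORIGIN not in web.get("javascript_origins", []):
        problems.append(f"javascript_origins lacks {EXPECTED_ORIGIN}")
    return problems
```

Calling `check_client_secret("client_secret.json")` from the backend folder should return an empty list when the file was created as described.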
## Usage ## Usage
1. Click on the PhishGuardian extension icon in the Chrome toolbar. 1. Click on the PhishGuardian extension icon in the Chrome toolbar.
2. Login with your email credentials. For now, only credentials for Outlook are supported (this will change in the future). 2. Login with your email credentials. Currently, the extension supports credentials for the wp.pl mailing service and Outlook.
3. Use the "Fetch Emails" button to retrieve your emails. 3. Use the "Check Mail" button to scan for phishing emails.
4. Select an email from the list and click the "Classify Email" button to scan the email. 4. If a phishing email is detected, a notification will appear with options to mark the email as safe or move it to the trash.
5. Classification result will be displayed.
6. Use the "Mark as Safe" button to mark the email as safe or the "Delete Email" button to delete a suspicious email.
## Code overview ## Code Overview
### Backend (`app.py`) ### Backend Architecture Description
- Uses Flask to handle HTTP requests. #### Flask Application:
- Uses IMAP to connect to the email server and fetch emails.
- Uses scikit-learn to classify emails as suspicious or not based on the content. - Flask is used as the main web framework.
- Provides endpoints for fetching emails, classifying emails, marking emails as safe, and deleting emails.
#### CORS:
- Enabled using Flask-CORS to handle cross-origin requests.
#### Configuration and Environment:
- Secret Key: Defined for managing sessions.
- Environment Variable: OAUTHLIB_INSECURE_TRANSPORT is set to '1' to allow insecure HTTP connections during development. This can be removed once a certificate is obtained from Google.
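One way to keep the insecure-transport override from leaking into production is to set it only when a debug flag is on. This is a minimal sketch, not the code in backend.py; the function name and the `debug` parameter are assumptions made for illustration.

```python
import os


def allow_insecure_oauth_transport(debug):
    """Enable plain-HTTP OAuth callbacks only while developing locally.

    Hypothetical helper: `debug` is an assumed flag, not a documented setting.
    """
    if debug:
        # oauthlib reads this variable and then accepts http:// redirect URIs.
        os.environ["OAUTHLIB_INSECURE_TRANSPORT"] = "1"
    else:
        # Make sure the override is absent outside development.
        os.environ.pop("OAUTHLIB_INSECURE_TRANSPORT", None)
```

The call would need to run before the OAuth flow starts, for example at the top of the module that defines the `/authorize` route.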
#### OAuth 2.0:
- Client Secrets: Loaded from the client_secret.json file.
- Scopes: Defined for permissions to read and modify Gmail data.
- Routes: /authorize to initiate the OAuth process and /oauth2callback to handle the OAuth callback.
#### Session Management:
- Saving and Loading Credentials: Functions for storing and loading OAuth credentials in the session.
#### Email Handling:
- Safe Emails: Stored in a JSON file safe_emails.json.
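The safe-email store could look roughly like the sketch below. The README only states that safe emails live in `safe_emails.json`; the layout (a JSON list of email ID strings) and the function names `load_safe_emails` and `mark_safe` are assumptions for illustration.

```python
import json
from pathlib import Path

SAFE_EMAILS_FILE = Path("safe_emails.json")  # file name from the README


def load_safe_emails(path=SAFE_EMAILS_FILE):
    """Return the set of email IDs marked safe; empty if the file does not exist."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def mark_safe(email_id, path=SAFE_EMAILS_FILE):
    """Add one email ID to the store and return the updated set."""
    safe = load_safe_emails(path)
    safe.add(email_id)
    # Assumed layout: a sorted JSON list of ID strings.
    Path(path).write_text(json.dumps(sorted(safe)), encoding="utf-8")
    return safe
```

Keeping the IDs in a set makes marking the same email twice harmless, and sorting before writing keeps the file stable between runs.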
### Frontend ### Frontend
- `popup.html`: The main interface of the extension. - `popup.html`: The main interface of the extension.
- `popup.js`: Handles interactions in the popup, such as login, fetching emails, and handling responses. - `popup.js`: Handles interactions in the popup, such as login, checking mail, and handling responses.
- `background.js`: Manages the background tasks of the extension, such as opening the popup. - `background.js`: Listens for messages from the popup and handles notifications.
- `styles.css`: Contains the styles for the extension's UI. - `notification.html` & `notification.js`: The interface and logic for the notification popup.
- `manifest.json`: Configuration file for the Chrome extension.
- `images/icon16.png`, `images/icon48.png`, `images/icon128.png`: Icons used for the extension.
## API endpoints ### Routes:
- `POST /fetch-emails`: Fetch emails from the email server. - /validate-outlook-login: For validating Outlook credentials.
- `POST /classify-email`: Classify an email as phishing or not. - /fetch-emails: To fetch and classify emails from Outlook.
- `POST /mark-safe`: Mark an email as safe. - /check_mail: To check emails from Gmail.
- `POST /delete-email`: Delete an email from the email server. - /mark_safe/<email_id>: To mark emails as safe.
- /move_trash/<email_id>: To move emails to the trash in Gmail.
- /delete-email: To delete an email from Outlook.
- /logout: To clear the session.
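Two of the routes above carry an `<email_id>` placeholder. The sketch below shows one way a client might build full URLs for them. It assumes the backend is reachable at `http://localhost:5000` (the origin used in the Gmail API setup); the helper name `route_url` is hypothetical. HTTP methods are not listed here, so the sketch builds URLs only.

```python
from urllib.parse import quote, urljoin

BASE_URL = "http://localhost:5000"  # assumed backend origin


def route_url(template, email_id=None, base=BASE_URL):
    """Fill the <email_id> placeholder (if any) and return an absolute URL."""
    path = template
    if "<email_id>" in template:
        if email_id is None:
            raise ValueError(f"{template} requires an email_id")
        # Percent-encode the ID so characters such as "/" stay inside one path segment.
        path = template.replace("<email_id>", quote(str(email_id), safe=""))
    return urljoin(base, path)
```

For example, `route_url("/mark_safe/<email_id>", "18c2f")` yields `http://localhost:5000/mark_safe/18c2f`.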
## Files and directories ### Backend Checks
### Backend directory The backend inspects the email content itself for links and for buttons used for confirmation or verification. The model has been trained on a dataset containing a high volume of invalid emails, including emails with added HTML and with incorrect or suspicious text. Emails that are too short are also considered suspicious.
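The content checks described here (links, buttons, very short bodies) can be approximated with a few regular expressions. This is a rough sketch under stated assumptions, not the project's actual logic: the function name `content_flags`, the patterns, and the `MIN_LENGTH` threshold are all hypothetical, since the README gives no concrete values.

```python
import re

MIN_LENGTH = 40  # hypothetical threshold; the README does not state one

# Bare URLs or HTML anchors count as links.
LINK_RE = re.compile(r"https?://\S+|<a\s[^>]*href=", re.IGNORECASE)
# HTML buttons or submit/button inputs count as buttons.
BUTTON_RE = re.compile(
    r"<button\b|<input\b[^>]*type=[\"']?(?:submit|button)", re.IGNORECASE
)


def content_flags(body):
    """Return simple indicators that could be combined with the model's verdict."""
    text = body.strip()
    return {
        "has_link": bool(LINK_RE.search(text)),
        "has_button": bool(BUTTON_RE.search(text)),
        "too_short": len(text) < MIN_LENGTH,
    }
```

Flags like these are cheap to compute before running the classifier, and they make it easier to explain to the user why an email was marked as suspicious.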
- `app.py`: Main Flask application file.
- `spam_classifier_model.pkl`: Pre-trained machine learning model for classifying emails.
- `vectorizer.pkl`: Pre-trained vectorizer for transforming email content into a format suitable for the classifier.
- `source.txt`: Contains a link from which the datasets were downloaded.
- `lingSpam.csv`, `enronSpamSubset.csv`, `completeSpamAssasin.csv`: These are the datasets used to train the model (Random Forest is the chosen model).
- `data_join.py`: Script which merges the three datasets into one CSV file called `joined_data.csv`.
- `joined_data.csv`: The combined dataset resulting from `data_join.py`.
- `ML.ipynb`: Jupyter notebook containing all the machine learning and vectorizer information.
- `requirements.txt`: File containing the list of required Python packages.
### Extension directory
- `popup.html`: The main HTML file for the extension's UI.
- `popup.js`: JavaScript for handling UI interactions and communicating with the backend.
- `background.js`: JavaScript for background tasks and managing the extension's lifecycle.
- `styles.css`: CSS styles for the extension's UI.
- `manifest.json`: Configuration file for the Chrome extension.
- `images/icon16.png`, `images/icon48.png`, `images/icon128.png`: Icons used for the extension.
## How it works
1. **Login**: Users log in with their email credentials using the extension.
2. **Fetch emails**: The extension fetches emails from the server and displays them in the popup.
3. **Classify emails**: Emails are classified as suspicious or not. The classification results are stored and associated with each email.
4. **Mark as safe/Delete**: Users can mark suspicious emails as safe or delete them. The actions are reflected in the backend and the UI is updated accordingly.