PhishGuardian

PhishGuardian is a browser extension that detects phishing emails and helps you manage them. It uses a machine learning model to flag likely phishing emails and lets you mark them as safe or move them to the trash. The backend is built with Flask and the frontend with JavaScript.

Features

  • Detects phishing emails using machine learning
  • Allows users to mark emails as safe
  • Allows users to move phishing emails to the trash
  • Provides notifications for detected phishing emails

Installation

Prerequisites

  • Python 3.6+
  • Flask
  • scikit-learn
  • Chrome browser

Backend Setup

  1. Clone the repository:

    git clone https://your-repository-url
    cd PhishGuardian/backend
    
  2. Install the required Python packages:

    pip install -r requirements.txt
    
  3. Run the Flask backend:

    python backend.py
    

Extension Setup

  1. Open Chrome and go to chrome://extensions/
  2. Enable "Developer mode" by toggling the switch in the top right corner.
  3. Click on "Load unpacked" and select the PhishGuardian directory.

Setting up Gmail API

To integrate the Gmail API, follow these steps:

  1. Visit the Google Cloud Console at https://console.cloud.google.com/apis/credentials.
  2. Click on "Create Project" and fill in the required fields.
  3. Navigate to the library, find the Gmail API, and enable it.
  4. Go to the Credentials tab and click on "Create Credentials". Then select "OAuth client ID", choose "Web application", and fill in the fields the form asks for.
  5. A dialog will appear showing the client_id, which must be entered in manifest.json. The dialog also has a button to download a JSON file. Place this file in the backend folder and name it "client_secret.json", or change the expected filename in backend.py.

Once these steps are complete, the project is set up to use the Gmail API.
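
To copy the right value into manifest.json, the client ID can be read directly from the downloaded file. The sketch below assumes the file uses the layout Google issues for OAuth clients, with the fields nested under a single top-level key ("web" for "Web application" clients). The helper name read_client_id is hypothetical and is not part of the project.

```python
import json
from pathlib import Path


def read_client_id(path="client_secret.json"):
    """Return the OAuth client ID stored in a downloaded client secrets file.

    Assumes the fields sit under one top-level key: "web" for web
    application clients, or "installed" for desktop clients.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    for section in ("web", "installed"):
        if section in data:
            return data[section]["client_id"]
    raise KeyError("no 'web' or 'installed' section found in " + str(path))
```

Calling read_client_id("backend/client_secret.json") from the repository root would return the value to paste into manifest.json, assuming the file was saved under that name.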

Usage

  1. Click on the PhishGuardian extension icon in the Chrome toolbar.
  2. Log in with your email credentials. The extension currently supports credentials for the wp.pl mail service and Outlook.
  3. Use the "Check Mail" button to scan for phishing emails.
  4. If a phishing email is detected, a notification will appear with options to mark the email as safe or move it to the trash.

Code Overview

Backend Architecture Description

Flask Application:

  • Flask is used as the main web framework.

CORS:

  • Enabled using Flask-CORS to handle cross-origin requests.

Configuration and Environment:

  • Secret Key: Defined for managing sessions.

  • Environment Variable: OAUTHLIB_INSECURE_TRANSPORT is set to '1' so that the OAuth flow works over plain HTTP during development. Remove it once the backend is served over HTTPS with a valid certificate.
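
One way to keep the insecure-transport override out of production is to set it only behind an explicit development switch. The following is a minimal sketch, not the code in backend.py: the development flag, the function name configure_environment, and the PHISHGUARDIAN_SECRET_KEY variable are all assumptions made for illustration.

```python
import os
import secrets


def configure_environment(development):
    """Set the OAuth transport override and return a Flask-style config dict.

    `development` is a hypothetical flag; the project may decide this
    differently.
    """
    if development:
        # oauthlib rejects plain-HTTP OAuth URLs unless this is set.
        os.environ["OAUTHLIB_INSECURE_TRANSPORT"] = "1"
    else:
        # Make sure the override never leaks into a production process.
        os.environ.pop("OAUTHLIB_INSECURE_TRANSPORT", None)

    # Sessions are signed with the secret key. Read it from the environment
    # (hypothetical variable name) and fall back to a random value that is
    # valid only for the lifetime of this process.
    secret_key = os.environ.get("PHISHGUARDIAN_SECRET_KEY") or secrets.token_hex(32)
    return {"SECRET_KEY": secret_key}
```

On a Flask application object the result could be applied with app.config.update(configure_environment(development=True)). With the random fallback, existing sessions stop validating after a restart, which is usually acceptable in development only.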

OAuth 2.0:

  • Client Secrets: Loaded from the client_secret.json file.

  • Scopes: Defined for permissions to read and modify Gmail data.

  • Routes: /authorize to initiate the OAuth process and /oauth2callback to handle the OAuth callback.
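
To make the two routes concrete, the sketch below builds by hand the kind of URL that /authorize would redirect the user to. It is an illustration under stated assumptions, not the project's code: the backend most likely relies on a Google client library for this step, the gmail.modify scope is an assumed choice for "read and modify", and build_authorization_url is a hypothetical name.

```python
import secrets
from urllib.parse import urlencode

# Google's OAuth 2.0 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

# Assumed scope: gmail.modify allows reading messages and moving them to trash.
SCOPES = ["https://www.googleapis.com/auth/gmail.modify"]


def build_authorization_url(client_id, redirect_uri, scopes=SCOPES):
    """Return (url, state) for starting the authorization code flow.

    The caller should keep `state` in the session and compare it with the
    `state` parameter received on the callback route.
    """
    state = secrets.token_urlsafe(16)
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match a URI registered in the console
        "response_type": "code",       # authorization code flow
        "scope": " ".join(scopes),
        "access_type": "offline",      # ask for a refresh token as well
        "state": state,
    }
    return GOOGLE_AUTH_ENDPOINT + "?" + urlencode(params), state
```

In this picture, /authorize stores the returned state and redirects to the URL, and /oauth2callback checks the state before exchanging the code query parameter for tokens.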

Session Management:

  • Saving and Loading Credentials: Functions for storing and loading OAuth credentials in the session.
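
The save and load functions could look roughly like the sketch below. It assumes the session behaves like a dictionary (as Flask's session object does) and uses a stand-in StoredCredentials class, because the fields actually stored by backend.py are not listed here. All names in the block are hypothetical.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class StoredCredentials:
    """Stand-in for an OAuth credentials object (assumed fields)."""
    token: str
    refresh_token: str
    token_uri: str
    client_id: str
    scopes: list = field(default_factory=list)


def save_credentials(session, credentials):
    """Store the credentials in the session as a plain, JSON-friendly dict."""
    session["credentials"] = asdict(credentials)


def load_credentials(session):
    """Rebuild the credentials from the session, or return None if absent."""
    data = session.get("credentials")
    return StoredCredentials(**data) if data else None
```

Converting to a plain dictionary first matters because session data has to be serializable. A route such as /logout can then drop the stored credentials by clearing the session.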

Email Handling:

  • Safe Emails: Stored in a JSON file safe_emails.json.
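
A file-backed list of safe emails can be handled with a pair of small helpers. This sketch assumes safe_emails.json holds a JSON array of email ID strings; the real file layout and the function names load_safe_emails and mark_safe are assumptions.

```python
import json
from pathlib import Path

SAFE_EMAILS_FILE = Path("safe_emails.json")


def load_safe_emails(path=SAFE_EMAILS_FILE):
    """Return the set of email IDs marked as safe (empty if no file yet)."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def mark_safe(email_id, path=SAFE_EMAILS_FILE):
    """Add an email ID to the safe list and write the file back to disk."""
    safe = load_safe_emails(path)
    safe.add(email_id)
    # Sorting keeps the file stable between writes, which makes diffs readable.
    Path(path).write_text(json.dumps(sorted(safe), indent=2), encoding="utf-8")
    return safe
```

With helpers like these, a mail check can skip any message whose ID is already in load_safe_emails() before running the classifier.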

Frontend

  • popup.html: The main interface of the extension.
  • popup.js: Handles interactions in the popup, such as login, checking mail, and handling responses.
  • background.js: Listens for messages from the popup and handles notifications.
  • notification.html & notification.js: The interface and logic for the notification popup.

Routes:

  • /validate-outlook-login: For validating Outlook credentials.
  • /fetch-emails: To fetch and classify emails from Outlook.
  • /check_mail: To check emails from Gmail.
  • /mark_safe/<email_id>: To mark emails as safe.
  • /move_trash/<email_id>: To move emails to the trash in Gmail.
  • /delete-email: To delete an email from Outlook.
  • /logout: To clear the session.

Backend Checks

The backend inspects the email body itself for links and for buttons used for confirmation or verification. The model was trained on a dataset containing a large number of invalid emails, including emails with added HTML and with incorrect or suspicious text. Emails that are too short are also treated as suspicious.
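
The content checks described above can be illustrated with the standard library alone. The sketch below is not the project's classifier: it only computes three flags (link present, button present, body too short), and the 50-character threshold, the function name content_flags, and the way buttons are recognized are assumptions. How the real backend combines such signals with the trained model is not stated here.

```python
import re
from html.parser import HTMLParser

MIN_BODY_LENGTH = 50  # assumed threshold for "too short"
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


class _ElementCounter(HTMLParser):
    """Counts anchors with an href and button-like elements."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.buttons = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links += 1
        elif tag == "button":
            self.buttons += 1
        elif tag == "input" and attrs.get("type") in ("button", "submit"):
            self.buttons += 1


def content_flags(body):
    """Return simple content signals for an email body (HTML or plain text)."""
    counter = _ElementCounter()
    counter.feed(body)
    counter.close()
    return {
        # Either an HTML anchor or a bare URL in the text counts as a link.
        "has_link": counter.links > 0 or bool(URL_PATTERN.search(body)),
        "has_button": counter.buttons > 0,
        "too_short": len(body.strip()) < MIN_BODY_LENGTH,
    }
```

Flags like these could be passed to the model as extra features or used as a pre-filter, depending on how the backend is organized.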