ASD: AI-Powered Spam Detector

ASD (AI Spam Detector) is a comprehensive, production-ready spam detection system built with Python, Flask, PyTorch, and a fine-tunable DistilBERT transformer model. It is intended as a practical tool for both developers and end users, and it can be integrated into a workflow in several ways: a REST API, a web dashboard, IMAP scanning, or a Gmail browser extension.

Features

- Accurate Spam Detection: Utilizes a DistilBERT model that can be fine-tuned on your own datasets for higher accuracy.
- REST API: A Flask-based API for batch email predictions, allowing integration with any application.
- Interactive Web Dashboard: A user-friendly interface to check individual emails and view real-time statistics.
- IMAP Integration: Automatically scan your email accounts (like Gmail or Outlook) for spam.
- Gmail Browser Extension: A client-side extension that integrates spam detection directly into the Gmail interface.
- Modular Architecture: Use individual components or run the entire system as an integrated application.
- Easy Setup: A simple setup script to get you started in minutes.

System Architecture

The system is modular. Three user-facing interfaces sit on top of a set of Flask applications, which delegate classification to a shared core ML engine running on PyTorch (GPU or CPU). Here's a high-level overview of the architecture:

┌─────────────────────────────────────────────────────┐
│              User Interfaces                        │
├──────────────────┬──────────────────┬───────────────┤
│  Gmail Browser   │  Web Dashboard   │  REST API     │
│  Extension       │  (Port 5001)     │  (Port 5000)  │
│  (JavaScript)    │                  │               │
└────────┬─────────┴────────┬─────────┴────────┬──────┘
         │                  │                  │
         └──────────────────┼──────────────────┘
                            │
              ┌─────────────▼─────────────┐
              │  Flask Applications       │
              │ ├─ email_processor.py     │
              │ ├─ run_api.py             │
              │ └─ email_integration.py   │
              └────────────┬──────────────┘
                           │
              ┌────────────▼─────────────┐
              │  Core ML Engine          │
              │  (spam_detec.py)         │
              │ ├─ DistilBERT Model      │
              │ ├─ Tokenizer             │
              │ └─ Trainer               │
              └────────────┬─────────────┘
                           │
                    ┌──────▼──────┐
                    │ PyTorch GPU │
                    │ or CPU      │
                    └─────────────┘

How It Works

- spam_detec.py: The core of the system. It contains the EmailSpamDetector class, which handles all the machine learning logic: loading the model, training, and prediction.
- run_training.py: Trains the spam detection model on your own dataset.
- run_api.py: Runs a Flask server that exposes a REST API for batch spam predictions (port 5000).
- email_processor.py: Runs a second Flask server that serves the web dashboard for checking individual emails (port 5001).
- email_integration.py: Connects to your email account via IMAP and scans for spam.
- email_extension/: Source code for the Gmail browser extension.
- master_startup.py: Orchestrator script that starts and manages all of the system's services.
- quick_setup.py: Sets up the environment and installs the required dependencies.

Setup and Installation

Clone the repository:

git clone <repository-url>
cd <repository-directory>

Run the quick setup script: This will create a requirements.txt file and install all the necessary packages.
```
python quick_setup.py
```

Usage

1. Train the Model (Optional)

You can fine-tune the model on your own dataset. The training data should be a CSV file with 'text' and 'label' columns (0 for legitimate, 1 for spam).
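
Before training, it can help to confirm that a CSV matches this layout. The following is a minimal sketch using only the standard library; the function name `validate_training_csv` is hypothetical and is not part of this repository.

```python
import csv
import io


def validate_training_csv(csv_file):
    """Return a list of problems found in a training CSV (empty list = OK)."""
    reader = csv.DictReader(csv_file)
    problems = []

    # The README requires a 'text' column and a 'label' column.
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        return [f"missing column(s): {sorted(missing)}"]

    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        if not (row["text"] or "").strip():
            problems.append(f"row {row_number}: empty text")
        if (row["label"] or "").strip() not in {"0", "1"}:
            problems.append(f"row {row_number}: label must be 0 or 1")
    return problems


# Illustrative data only: 1 = spam, 0 = legitimate.
sample = io.StringIO(
    "text,label\n"
    '"Win a free cruise, click now!",1\n'
    '"Meeting moved to 3pm tomorrow",0\n'
)
print(validate_training_csv(sample))  # an empty list means no problems were found
```

To check a real file, pass an open file object instead, for example `open("my_emails.csv", newline="", encoding="utf-8")` (the file name is a placeholder).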

python run_training.py

The trained model is saved in the ./trained_spam_model directory. If you skip this step, the system uses the pre-trained DistilBERT model; because that model has not been fine-tuned on your data, its predictions may be less accurate.

2. Start the System

You can start all the services using the master_startup.py script.

python master_startup.py

This will:

Check for dependencies.
Check the model status.
Start the API server on port 5000.
Start the email processor and dashboard on port 5001.
Display a status dashboard with links to the services.
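
As an illustration of the first step, a dependency check can be done without importing the heavy libraries themselves. This is only a sketch of one possible approach, not the actual code in master_startup.py; `find_missing_modules` is a hypothetical name.

```python
import importlib.util

# Import names for the libraries listed under Dependencies. Note that the
# scikit-learn distribution is imported as "sklearn".
REQUIRED_MODULES = [
    "transformers", "torch", "pandas", "numpy", "datasets",
    "sklearn", "flask", "tqdm", "accelerate", "requests",
]


def find_missing_modules(module_names):
    """Return the subset of module_names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies found.")
```

Using `importlib.util.find_spec` only locates each package, so the check stays fast even when large libraries such as torch are installed.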

3. Access the Services

- REST API: http://localhost:5000
  - GET /health: Health check.
  - POST /predict: Batch email predictions.
    - Request body: {"emails": ["email1_content", "email2_content"]}
- Web Dashboard: http://localhost:5001/dashboard
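
A client for the POST /predict endpoint might look like the sketch below. It uses only the standard library. The request body follows the format documented above; the response format is not documented here, so `predict` simply returns whatever JSON the server sends. The helper names `build_predict_request` and `predict` are hypothetical.

```python
import json
import urllib.request

API_BASE = "http://localhost:5000"  # port taken from this README


def build_predict_request(emails, base_url=API_BASE):
    """Build a POST /predict request carrying the documented JSON body."""
    body = json.dumps({"emails": list(emails)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(emails, base_url=API_BASE, timeout=30):
    """Send the request and return the decoded JSON response unchanged."""
    request = build_predict_request(emails, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request without sending it (no server needed for this part).
request = build_predict_request(["Win a free cruise!", "Meeting moved to 3pm"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Once the API server is running, `predict(["some email text"])` sends the batch and returns the parsed response.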

4. IMAP Email Integration

You can use email_integration.py to scan your email account. You'll need to provide your email credentials and IMAP server details in the script.
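
For orientation, the sketch below shows how unread messages can be fetched over IMAP with Python's standard imaplib and email modules. It is not the code in email_integration.py. The function names and the environment variables ASD_IMAP_HOST, ASD_IMAP_USER, and ASD_IMAP_PASSWORD are assumptions made for this example; reading credentials from the environment is one way to avoid hard-coding them in a script.

```python
import email
import imaplib
import os
from email import policy


def extract_plain_text(raw_message: bytes) -> str:
    """Return the subject plus the text/plain body of a raw RFC 822 message."""
    message = email.message_from_bytes(raw_message, policy=policy.default)
    body_part = message.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return f"{message.get('Subject', '')}\n{body}".strip()


def fetch_unseen_texts(host, user, password, mailbox="INBOX", limit=20):
    """Fetch up to `limit` unread messages and return their text content."""
    texts = []
    with imaplib.IMAP4_SSL(host) as connection:
        connection.login(user, password)
        connection.select(mailbox, readonly=True)  # read-only: mail stays unread
        _status, data = connection.search(None, "UNSEEN")
        for message_id in data[0].split()[:limit]:
            _status, parts = connection.fetch(message_id, "(RFC822)")
            texts.append(extract_plain_text(parts[0][1]))
    return texts


if __name__ == "__main__":
    # Hypothetical variable names; the scan runs only if all three are set.
    host = os.environ.get("ASD_IMAP_HOST")
    user = os.environ.get("ASD_IMAP_USER")
    password = os.environ.get("ASD_IMAP_PASSWORD")
    if host and user and password:
        for text in fetch_unseen_texts(host, user, password):
            print(text[:80])
    else:
        print("Set ASD_IMAP_HOST, ASD_IMAP_USER and ASD_IMAP_PASSWORD to scan.")
```

The extracted texts could then be sent to the POST /predict endpoint as the "emails" list.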

5. Gmail Browser Extension

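The email_extension directory can be loaded as an unpacked extension in Chrome (chrome://extensions with Developer mode enabled) or as a temporary add-on in Firefox (about:debugging). It automatically connects to the email processor service on port 5001, so that service must be running first.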

Dependencies

This project uses the following major libraries:

transformers
torch
pandas
numpy
datasets
scikit-learn
flask
tqdm
accelerate
requests

A full list of dependencies is available in requirements.txt.

License

This project is licensed under the MIT License. See the LICENSE file for details.
