
🧠 Serotonin Script


AI-driven medical content engine using RAG (LlamaIndex + Qdrant), FastAPI, and Taskiq for automated multi-platform publishing with physician style preservation.


🎯 Overview

Serotonin Script is an autonomous system for generating and distributing medically-accurate content across social platforms. It leverages RAG (Retrieval-Augmented Generation) to ensure medical precision while preserving the unique authorial voice of healthcare professionals.

The system covers the full content lifecycle: from a single /draft Slack command → RAG-powered generation → physician approval → multi-platform publishing → post-publish vectorization for continuous style improvement.

Key Capabilities

  • Style Preservation — Vector-based retrieval of the physician's writing patterns via hybrid search (dense + BM25)
  • Medical Accuracy — Fact-checking against the PubMed API and clinical guidelines (Chain-of-Verification)
  • Multi-Platform Publishing — Automated distribution to Telegram, X (Twitter), and Threads via n8n workflows
  • Async-First Architecture — High-performance task processing via Taskiq + Redis (chosen over Celery: ~50-80 MB memory footprint vs ~150-200 MB, 1-2 s startup vs 7-10 s)
  • Slack-Native UX — Draft approval workflow with interactive Block Kit UI
  • RAG Feedback Loop — Published posts automatically vectorized back into Qdrant for continuous style learning
  • Production Observability — Prometheus metrics, Grafana dashboards (backend, LLM costs, Taskiq queue), Loki log aggregation

🛠 Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | Async-native REST API |
| Task Queue | Taskiq 0.11+ + Redis | Background job processing, async-native |
| AI Engine | Claude 3.5 Sonnet / GPT-4o | Content generation with LLM router + fallback |
| Vector Store | Qdrant | Semantic search for style matching and knowledge retrieval |
| RAG Framework | LlamaIndex | Retrieval-augmented generation pipeline |
| Search | Hybrid (dense + BM25) | Qdrant hybrid mode for improved retrieval precision |
| External Data | PubMed API + BeautifulSoup | Medical fact verification |
| Orchestration | n8n (self-hosted) | Workflow automation, scheduling, social delivery |
| Database | PostgreSQL + Alembic | Relational data with async sessions (asyncpg) |
| Monitoring | Prometheus + Grafana + Loki + Promtail | Metrics, dashboards, log aggregation |
| Reverse Proxy | Nginx | HTTPS termination |
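Hybrid search fuses a dense-vector ranking with a BM25 keyword ranking. As an illustration of the general idea (not this project's actual fusion code, which runs inside Qdrant's hybrid mode), reciprocal rank fusion combines two ranked lists like this:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

dense = ["doc_a", "doc_b", "doc_c"]   # ranked by embedding similarity
bm25 = ["doc_b", "doc_d", "doc_a"]    # ranked by keyword match
print(rrf_fuse([dense, bm25]))        # doc_b wins: it is near the top of both lists
```

A document ranked high in both lists beats one that tops only a single list, which is why hybrid retrieval tends to improve precision on exact medical terminology.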

Why Taskiq over Celery?

| Aspect | Celery | Taskiq |
|---|---|---|
| Architecture | Sync-first | Async-native (shared event loop with FastAPI) |
| Dependency injection | Manual wiring | TaskiqDepends — identical to FastAPI |
| Memory per worker | ~150-200 MB | ~50-80 MB |
| Cold start | 7-10 seconds | 1-2 seconds |
| Type hints | Partial | Full (Pydantic-native) |
| Testing | Complex mocking | Direct async function calls |

See ADR: Taskiq over Celery for the full decision record.


📁 Project Structure

```
serotonin_script/
├── backend/
│   ├── api/
│   │   ├── middleware/          # auth (Slack sig), error_handler, logging, rate_limit (Redis sliding-window)
│   │   └── routes/              # drafts, feedback (Slack interactions), health
│   ├── config/                  # settings (Pydantic), system_prompts, lexicon (Slack UI text)
│   ├── integrations/
│   │   ├── external/            # pubmed_client (NCBI E-utils), web_scraper (BeautifulSoup)
│   │   └── llm/                 # anthropic_client, openai_client, router (fallback logic)
│   ├── models/                  # db_models (SQLAlchemy 2.0), schemas (Pydantic v2), enums
│   ├── rag/
│   │   ├── indexing/            # document_loader (MD/PDF/TXT), chunking (SentenceSplitter), embedder
│   │   ├── pipelines/           # hybrid_search (dense + BM25)
│   │   └── retrieval/           # style_retriever, knowledge_retriever, base protocol
│   ├── repositories/            # draft_repository, feedback_repository, post_repository
│   ├── services/                # content_generator, draft_service, fact_checker, style_matcher, publisher_service
│   ├── utils/                   # structured logging (Structlog)
│   ├── workers/
│   │   ├── middlewares/         # LoggingMiddleware, RetryMiddleware (exp. backoff), PrometheusMiddleware
│   │   ├── tasks/               # generate_draft, publish_post, ingest_guideline, scheduled_post, vectorize_post
│   │   ├── broker.py            # Taskiq Redis broker (ListQueueBroker + RedisAsyncResultBackend, TTL 1h)
│   │   ├── callbacks.py         # Slack Block Kit notifications on task complete/failure
│   │   └── dependencies.py      # TaskiqDepends: StyleMatcher, FactChecker, LLMRouter, ContentGenerator, PublisherService
│   └── tests/
│       ├── unit/                # 20 test modules — services, RAG, workers, API, middleware
│       └── integration/         # test_draft_service.py (full service stack)
├── knowledge_base/
│   ├── doctor_style/            # Physician's articles & posts (.md) + metadata.json
│   └── medical_guidelines/      # Clinical protocol PDFs
├── slack_app/
│   ├── blocks/                  # draft_card.json, approval_modal.json, status_message.json
│   ├── handlers/                # slash_commands.py (/draft), interactions.py, events.py
│   └── utils/block_builder.py   # Dynamic Block Kit UI constructor
├── orchestration/
│   ├── n8n/                     # Workflow definitions + credentials guide
│   └── monitoring/              # n8n health check (circuit breaker)
├── database/
│   ├── migrations/              # Alembic versions (initial schema + platform/scheduled_at)
│   └── seeds/initial_data.sql
├── infra/
│   ├── docker/                  # Dockerfile.backend, Dockerfile.worker, Dockerfile.base
│   ├── monitoring/              # Prometheus, Grafana dashboards (backend/llm_costs/taskiq), Loki, Promtail
│   └── nginx/nginx.conf
├── scripts/
│   ├── index_knowledge_base.py  # Bulk ingestion into Qdrant
│   ├── test_pipeline.py         # E2E pipeline test
│   └── deploy.sh / migrate.sh / setup.sh
├── docs/
│   ├── architecture.md
│   ├── api_spec.yaml            # OpenAPI 3.0
│   ├── deployment.md
│   ├── runbook.md
│   ├── taskiq_guide.md
│   └── adr/                     # 001-vector-store, 002-llm-selection, 003-taskiq-over-celery
└── docker-compose.yml
```
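The auth middleware in the tree above validates Slack request signatures. A stdlib-only sketch of Slack's documented v0 signing scheme (the middleware's actual interface is an assumption here):

```python
import hashlib
import hmac
import time

def verify_slack_signature(signing_secret: str, timestamp: str,
                           body: bytes, signature: str) -> bool:
    """Validate the X-Slack-Signature header against the signing secret."""
    # Reject stale requests to limit replay attacks (Slack recommends 5 minutes)
    if abs(time.time() - int(timestamp)) > 300:
        return False
    # Slack signs the string "v0:<timestamp>:<raw request body>"
    basestring = b"v0:" + timestamp.encode() + b":" + body
    expected = "v0=" + hmac.new(signing_secret.encode(), basestring,
                                hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(expected, signature)
```

The raw request body must be used before any JSON parsing, since re-serialization can change byte order and break the digest.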

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Python 3.13 (for local development)
  • Slack workspace with /draft slash command configured
  • API keys: Anthropic, OpenAI
  • n8n credentials: Telegram Bot Token, X (Twitter) OAuth2, Threads Access Token (configured inside n8n, not in .env)

Installation

```bash
# Clone repository
git clone https://github.com/PyDevDeep/serotonin-script.git
cd serotonin-script

# Configure environment
cp .env.example .env
# Edit .env with your API keys and credentials

# Start all services (API + worker + Redis + Qdrant + PostgreSQL + n8n + monitoring)
docker-compose up --build
```

Service URLs

| Service | URL |
|---|---|
| API | http://localhost:8000 |
| API Docs (Swagger) | http://localhost:8000/docs |
| n8n Workflows | http://localhost:5678 |
| Grafana | http://localhost:3000 |

📖 Usage

1. Index Knowledge Base

```bash
# Ingest physician's writing samples + medical guidelines into Qdrant
python scripts/index_knowledge_base.py
```

Loads documents from knowledge_base/doctor_style/ and knowledge_base/medical_guidelines/ — chunks, embeds, and stores vectors in two separate Qdrant collections.
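Conceptually, ingestion splits each document into overlapping chunks before embedding. The real pipeline uses LlamaIndex's SentenceSplitter; this naive fixed-window chunker only illustrates the overlap idea:

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size windows with overlap so context isn't cut off at chunk edges."""
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        start += max_chars - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

Each chunk is then embedded and upserted into the relevant Qdrant collection along with its source metadata.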

2. Generate Draft via Slack

```
/draft anxiety management tips
/draft depression coping strategies telegram
```

Full workflow:

```
Slack /draft
  └─► n8n Webhook
        └─► POST /api/v1/draft          ← returns task_id immediately (< 500ms)
              └─► Taskiq generate_draft task
                    ├── StyleMatcher   — retrieves top-5 physician posts (Qdrant)
                    ├── FactChecker    — PubMed API + web scraping + Chain-of-Verification
                    └── ContentGenerator (Claude 3.5 Sonnet → GPT-4o fallback)
                          └─► Slack callback → Block Kit draft card
```
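The sub-500 ms response is possible because the route only enqueues the work and returns a task_id; generation happens later on the worker. A framework-free asyncio sketch of that enqueue-and-return pattern (the names are illustrative, not the project's actual API):

```python
import asyncio
import uuid

TASKS: dict[str, str] = {}  # task_id -> status; stand-in for the Redis result backend

async def generate_draft(task_id: str, topic: str) -> None:
    # Placeholder for StyleMatcher -> FactChecker -> ContentGenerator
    await asyncio.sleep(0)
    TASKS[task_id] = f"draft ready: {topic}"

async def submit_draft(topic: str) -> str:
    """Enqueue the heavy work and return a task_id immediately."""
    task_id = uuid.uuid4().hex
    TASKS[task_id] = "queued"
    asyncio.get_running_loop().create_task(generate_draft(task_id, topic))
    return task_id

async def main() -> str:
    task_id = await submit_draft("anxiety management tips")
    assert TASKS[task_id] == "queued"  # caller has its id before any LLM work runs
    await asyncio.sleep(0.01)          # let the background task finish
    return TASKS[task_id]
```

In production the broker and result backend are Redis, so the worker process can pick the task up on a different machine than the API.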

3. Approve & Publish

From the Slack draft card:

  • Publish to Telegram / X / Threads — triggers the publish_post Taskiq task → publisher_service.py dispatches a webhook to n8n → n8n executes the platform-specific workflow (Telegram Bot API / Twitter API v2 / Threads API)
  • Edit — opens a Slack modal with a full text editor + platform/schedule selector
  • Regenerate — re-queues generate_draft with the same topic

Publishing architecture note: publisher_service.py is a thin dispatcher — it sends a structured webhook payload to n8n and tracks publication status in PostgreSQL. The actual social platform API calls (auth, formatting, retry logic) live entirely in n8n workflows under orchestration/n8n/workflows/. To modify platform-specific publishing behavior, edit the n8n workflow, not the Python service.

4. RAG Feedback Loop

After publishing, the vectorize_post task automatically embeds the final approved text back into Qdrant (doctor_style collection) — the system continuously learns the physician's evolving style.


🔧 Development

Run Tests

```bash
# Full test suite with coverage
make test

# Unit tests only
make test-unit

# Integration tests (requires running containers)
make test-integration
```

Local Backend

```bash
# Install dependencies
poetry install

# Run API server
poetry run uvicorn backend.api.main:app --reload

# Run Taskiq worker (2 processes, max 10 concurrent async tasks)
poetry run taskiq worker backend.workers.broker:broker --workers 2 --max-async-tasks 10
```

Database Migrations

```bash
alembic revision --autogenerate -m "description"
alembic upgrade head
```

✅ Test Coverage

Overall: 98% (4627 statements, 103 missed)

| Module | Coverage |
|---|---|
| services/content_generator.py | 100% |
| services/draft_service.py | 100% |
| services/fact_checker.py | 100% |
| services/style_matcher.py | 100% |
| api/middleware/auth.py | 100% |
| api/middleware/error_handler.py | 100% |
| integrations/external/pubmed_client.py | 100% |
| integrations/llm/router.py | 100% |
| rag/pipelines/hybrid_search.py | 100% |
| rag/retrieval/knowledge_retriever.py | 100% |
| rag/retrieval/style_retriever.py | 100% |
| workers/tasks/generate_draft.py | 100% |
| workers/tasks/publish_post.py | 100% |
| workers/callbacks.py | 100% |
| api/routes/feedback.py | 96% |
| api/middleware/rate_limit.py | 91% |
| services/publisher_service.py | 91% |
| api/routes/drafts.py | 40% |
| integrations/external/web_scraper.py | 38% |

api/routes/drafts.py (40%) and web_scraper.py (38%) are the remaining gaps — route integration tests and scraper HTTP mocking are the next testing targets.


📊 Monitoring

Three pre-built Grafana dashboards:

| Dashboard | URL | Tracks |
|---|---|---|
| Backend Metrics | http://localhost:3000/d/backend_metrics | Request rate, latency (p95), error rate |
| LLM Costs | http://localhost:3000/d/llm_costs | Token usage, API calls, cost per platform |
| Taskiq Metrics | http://localhost:3000/d/taskiq_metrics | Queue depth, task duration, failure rate |

Prometheus alert rules configured for:

  • Task failure rate > 5%/hour
  • Queue depth > 100 tasks
  • Task duration p95 > 60s
  • LLM error rate > 10% in 5 minutes
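As an example, the first threshold could be expressed as a rule like the one below; the metric names are assumptions and must be matched to whatever the Taskiq Prometheus middleware actually exports:

```yaml
groups:
  - name: taskiq_alerts
    rules:
      - alert: HighTaskFailureRate
        # Fires when >5% of tasks failed over the last hour (illustrative metric names)
        expr: >
          sum(rate(taskiq_task_failures_total[1h]))
            / sum(rate(taskiq_tasks_total[1h])) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Taskiq failure rate above 5% in the last hour"
```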

📚 Documentation

| Document | Description |
|---|---|
| Architecture | System design and component interactions |
| API Spec | OpenAPI 3.0 specification |
| Taskiq Guide | Async worker patterns and configuration |
| Deployment | Production deployment guide |
| Runbook | Operational procedures and troubleshooting |
| ADR: Vector Store | Qdrant selection rationale |
| ADR: LLM Selection | Claude + GPT-4o fallback design |
| ADR: Taskiq vs Celery | Task queue decision record |

⚙️ CI/CD

Four GitHub Actions workflows form a fully automated pipeline; a push to main triggers the full chain:

| Workflow | Trigger | What it does |
|---|---|---|
| lint.yml | push / PR → main | Ruff linter, Ruff formatter check, Pyright type checker |
| test.yml | push / PR → main | poetry install → cp .env.example .env → pytest |
| build.yml | push → main | Builds and pushes 3 Docker images to GHCR (backend, worker, scheduler) tagged latest + commit SHA |
| deploy.yml | on build.yml success | SSH into VPS → git pull origin main → bash scripts/deploy.sh |

Pipeline flow on every merge to main:

```
push → main
  ├─► lint.yml     (parallel)
  ├─► test.yml     (parallel)
  └─► build.yml    → pushes ghcr.io/<owner>/serotonin_script-{backend,worker,scheduler}
                         └─► deploy.yml  → SSH → git pull → deploy.sh
```

deploy.yml runs only if build.yml concluded with success (if: github.event.workflow_run.conclusion == 'success'). Required GitHub Secrets: SERVER_HOST, SERVER_USER, SERVER_SSH_KEY.
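The chained trigger typically looks like the snippet below. One GitHub Actions quirk worth noting: workflow_run's workflows list matches the upstream workflow's name: field, not its filename, so the exact string here is an assumption about what build.yml declares:

```yaml
# deploy.yml (sketch) — SSH key setup via SERVER_SSH_KEY omitted for brevity
on:
  workflow_run:
    workflows: ["build"]   # must equal build.yml's `name:` field, not the filename
    types: [completed]

jobs:
  deploy:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - name: Deploy over SSH
        run: |
          ssh "${{ secrets.SERVER_USER }}@${{ secrets.SERVER_HOST }}" \
            "cd serotonin-script && git pull origin main && bash scripts/deploy.sh"
```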


🏭 Production Deployment

See docs/deployment.md for the full guide. Quick reference:

Docker Compose (VPS)

The production stack uses two Compose files layered together: docker-compose.yml (infrastructure services) and infra/docker-compose.prod.yml (application services).

```bash
# One-command deployment
bash scripts/deploy.sh
```

deploy.sh executes in order:

  1. Tears down existing application containers (preserves named volumes)
  2. Builds new images from infra/docker/Dockerfile.base (multi-stage, non-root user seratonin)
  3. Starts postgres + redis and waits for health checks
  4. Runs Alembic migrations via scripts/migrate.sh
  5. Brings up all services

Services in Production

| Service | Image | Port | Notes |
|---|---|---|---|
| backend | Dockerfile.base | 8001 | 2 Uvicorn workers, metrics disabled |
| worker | Dockerfile.base | 9000 | Taskiq worker, Prometheus metrics on :9000 |
| scheduler | Dockerfile.base | 9001 | Taskiq scheduler for cron tasks |
| postgres | postgres:15-alpine | internal | External named volume docker_postgres_data |
| redis | redis:7.2-alpine | internal | AOF persistence, external volume docker_redis_data |
| qdrant | qdrant/qdrant:latest | internal | External volume docker_qdrant_data |
| n8n | n8nio/n8n:latest | 5678 | External volume docker_n8n_data |
| prometheus | prom/prometheus | 9090 | Scrapes backend :8001/metrics and worker :9000 |
| grafana | grafana/grafana | 3000 | Dashboards: backend, LLM costs, Taskiq |
| loki + promtail | Grafana stack | 3100 | Log aggregation from Docker socket |

Docker Image

Dockerfile.base uses a two-stage build:

```
Stage 1 (builder): python:3.13-slim
  └─ Poetry 2.0.1 exports requirements.txt (prod deps only)

Stage 2 (runtime): python:3.13-slim
  └─ Non-root user: seratonin:seratonin
  └─ Model cache dirs: /app/cache/huggingface, /app/cache/fastembed
  └─ Shared by: backend, worker, scheduler (different CMD per service)
```
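A condensed sketch of that two-stage layout; the real Dockerfile.base will differ in details such as cache-dir setup, workdir, and how the export plugin is installed:

```dockerfile
# Stage 1: export pinned production dependencies with Poetry
FROM python:3.13-slim AS builder
RUN pip install poetry==2.0.1 poetry-plugin-export
COPY pyproject.toml poetry.lock ./
RUN poetry export --only main -f requirements.txt -o requirements.txt

# Stage 2: slim runtime, non-root user, shared by backend/worker/scheduler
FROM python:3.13-slim
RUN useradd --create-home seratonin
COPY --from=builder requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
USER seratonin
# CMD is supplied per service in compose (uvicorn / taskiq worker / taskiq scheduler)
```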

Persistent Volumes

All data volumes are declared as external: true with fixed names — they survive docker-compose down and must be pre-created on the host:

```bash
docker volume create docker_postgres_data
docker volume create docker_redis_data
docker volume create docker_qdrant_data
docker volume create docker_n8n_data
```

🤝 Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open Pull Request

See CONTRIBUTING.md for detailed guidelines.


📄 License

This project is licensed under the MIT License — see LICENSE for details.


🙏 Acknowledgments

  • LlamaIndex for the RAG framework
  • Taskiq for modern async-native task processing
  • Qdrant for vector search with hybrid mode

Created by PyDevDeep
