AI-driven medical content engine using RAG (LlamaIndex + Qdrant), FastAPI, and Taskiq for automated multi-platform publishing with physician style preservation.
Serotonin Script is an autonomous system for generating and distributing medically accurate content across social platforms. It leverages Retrieval-Augmented Generation (RAG) to ensure medical precision while preserving the unique authorial voice of healthcare professionals.
The system covers the full content lifecycle: from a single /draft Slack command → RAG-powered generation → physician approval → multi-platform publishing → post-publish vectorization for continuous style improvement.
- Style Preservation: Vector-based retrieval of the physician's writing patterns via hybrid search (dense + BM25)
- Medical Accuracy: Fact-checking against the PubMed API and clinical guidelines (Chain-of-Verification)
- Multi-Platform Publishing: Automated distribution to Telegram, X (Twitter), and Threads via n8n workflows
- Async-First Architecture: High-performance task processing via Taskiq + Redis (chosen over Celery: ~50-80 MB memory footprint vs ~150-200 MB, 1-2 s startup vs 7-10 s)
- Slack-Native UX: Draft approval workflow with an interactive Block Kit UI
- RAG Feedback Loop: Published posts are automatically vectorized back into Qdrant for continuous style learning
- Production Observability: Prometheus metrics, Grafana dashboards (backend, LLM costs, Taskiq queue), Loki log aggregation
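As an illustration of the hybrid-search idea above, the sketch below merges a dense ranking and a BM25 ranking with reciprocal rank fusion. This is a stdlib-only stand-in, not the project's code: Qdrant's hybrid mode performs a similar fusion server-side, and the function name and constant `k` here are hypothetical.

```python
# Illustrative sketch of hybrid-search result fusion (hypothetical names):
# each retriever returns a ranked list of document IDs, and reciprocal
# rank fusion (RRF) merges them into one ranking.
from collections import defaultdict

def rrf_fuse(dense_ranked: list[str], bm25_ranked: list[str], k: int = 60) -> list[str]:
    """Merge two ranked ID lists with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranked in (dense_ranked, bm25_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # high rank in either list adds more
    return sorted(scores, key=scores.get, reverse=True)

# A document found by both retrievers outranks one found by only one:
print(rrf_fuse(["a", "b", "c"], ["b", "d"]))  # ['b', 'a', 'd', 'c']
```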
| Layer | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | Async-native REST API |
| Task Queue | Taskiq 0.11+ + Redis | Background job processing, async-native |
| AI Engine | Claude 3.5 Sonnet / GPT-4o | Content generation with LLM router + fallback |
| Vector Store | Qdrant | Semantic search for style matching and knowledge retrieval |
| RAG Framework | LlamaIndex | Retrieval-augmented generation pipeline |
| Search | Hybrid (dense + BM25) | Qdrant hybrid mode for improved retrieval precision |
| External Data | PubMed API + BeautifulSoup | Medical fact verification |
| Orchestration | n8n (self-hosted) | Workflow automation, scheduling, social delivery |
| Database | PostgreSQL + Alembic | Relational data with async sessions (asyncpg) |
| Monitoring | Prometheus + Grafana + Loki + Promtail | Metrics, dashboards, log aggregation |
| Reverse Proxy | Nginx | HTTPS termination |
| Aspect | Celery | Taskiq |
|---|---|---|
| Architecture | Sync-first | Async-native (shared event loop with FastAPI) |
| Dependency Injection | Manual wiring | TaskiqDepends (identical to FastAPI's Depends) |
| Memory per worker | ~150-200 MB | ~50-80 MB |
| Cold start | 7-10 seconds | 1-2 seconds |
| Type hints | Partial | Full (Pydantic-native) |
| Testing | Complex mocking | Direct async function calls |
See ADR: Taskiq over Celery for the full decision record.
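The "shared event loop" point in the table can be illustrated with a stdlib-only sketch: many I/O-bound jobs interleave as coroutines on one loop instead of occupying a process each. Everything here (queue, worker and job names) is a hypothetical stand-in for Taskiq's Redis-backed broker, not the project's actual worker code.

```python
# Stdlib-only illustration of the async-native worker model. The real
# project uses Taskiq's ListQueueBroker over Redis; names are hypothetical.
import asyncio

async def worker(queue: asyncio.Queue, results: list[str]) -> None:
    """Drain jobs from the queue until a None sentinel arrives."""
    while True:
        job = await queue.get()
        if job is None:              # sentinel: shut this worker down
            break
        await asyncio.sleep(0)       # stand-in for awaiting an LLM or DB call
        results.append(f"done:{job}")

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    results: list[str] = []
    # Two workers share one event loop, akin to Taskiq tasks sharing FastAPI's loop
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(2)]
    for job in ("draft-1", "draft-2", "draft-3"):
        queue.put_nowait(job)
    for _ in workers:
        queue.put_nowait(None)       # one sentinel per worker
    await asyncio.gather(*workers)
    return results

print(sorted(asyncio.run(main())))  # ['done:draft-1', 'done:draft-2', 'done:draft-3']
```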
```
serotonin_script/
├── backend/
│   ├── api/
│   │   ├── middleware/         # auth (Slack sig), error_handler, logging, rate_limit (Redis sliding-window)
│   │   └── routes/             # drafts, feedback (Slack interactions), health
│   ├── config/                 # settings (Pydantic), system_prompts, lexicon (Slack UI text)
│   ├── integrations/
│   │   ├── external/           # pubmed_client (NCBI E-utils), web_scraper (BeautifulSoup)
│   │   └── llm/                # anthropic_client, openai_client, router (fallback logic)
│   ├── models/                 # db_models (SQLAlchemy 2.0), schemas (Pydantic v2), enums
│   ├── rag/
│   │   ├── indexing/           # document_loader (MD/PDF/TXT), chunking (SentenceSplitter), embedder
│   │   ├── pipelines/          # hybrid_search (dense + BM25)
│   │   └── retrieval/          # style_retriever, knowledge_retriever, base protocol
│   ├── repositories/           # draft_repository, feedback_repository, post_repository
│   ├── services/               # content_generator, draft_service, fact_checker, style_matcher, publisher_service
│   ├── utils/                  # structured logging (Structlog)
│   ├── workers/
│   │   ├── middlewares/        # LoggingMiddleware, RetryMiddleware (exp. backoff), PrometheusMiddleware
│   │   ├── tasks/              # generate_draft, publish_post, ingest_guideline, scheduled_post, vectorize_post
│   │   ├── broker.py           # Taskiq Redis broker (ListQueueBroker + RedisAsyncResultBackend, TTL 1h)
│   │   ├── callbacks.py        # Slack Block Kit notifications on task complete/failure
│   │   └── dependencies.py     # TaskiqDepends: StyleMatcher, FactChecker, LLMRouter, ContentGenerator, PublisherService
│   └── tests/
│       ├── unit/               # 20 test modules: services, RAG, workers, API, middleware
│       └── integration/        # test_draft_service.py (full service stack)
├── knowledge_base/
│   ├── doctor_style/           # Physician's articles & posts (.md) + metadata.json
│   └── medical_guidelines/     # Clinical protocol PDFs
├── slack_app/
│   ├── blocks/                 # draft_card.json, approval_modal.json, status_message.json
│   ├── handlers/               # slash_commands.py (/draft), interactions.py, events.py
│   └── utils/block_builder.py  # Dynamic Block Kit UI constructor
├── orchestration/
│   ├── n8n/                    # Workflow definitions + credentials guide
│   └── monitoring/             # n8n health check (circuit breaker)
├── database/
│   ├── migrations/             # Alembic versions (initial schema + platform/scheduled_at)
│   └── seeds/initial_data.sql
├── infra/
│   ├── docker/                 # Dockerfile.backend, Dockerfile.worker, Dockerfile.base
│   ├── monitoring/             # Prometheus, Grafana dashboards (backend/llm_costs/taskiq), Loki, Promtail
│   └── nginx/nginx.conf
├── scripts/
│   ├── index_knowledge_base.py # Bulk ingestion into Qdrant
│   ├── test_pipeline.py        # E2E pipeline test
│   └── deploy.sh / migrate.sh / setup.sh
├── docs/
│   ├── architecture.md
│   ├── api_spec.yaml           # OpenAPI 3.0
│   ├── deployment.md
│   ├── runbook.md
│   ├── taskiq_guide.md
│   └── adr/                    # 001-vector-store, 002-llm-selection, 003-taskiq-over-celery
└── docker-compose.yml
```
- Docker & Docker Compose
- Python 3.13 (for local development)
- Slack workspace with the `/draft` slash command configured
- API keys: Anthropic, OpenAI
- n8n credentials: Telegram Bot Token, X (Twitter) OAuth2, Threads Access Token (configured inside n8n, not in `.env`)
```bash
# Clone repository
git clone https://github.com/PyDevDeep/serotonin-script.git
cd serotonin-script

# Configure environment
cp .env.example .env
# Edit .env with your API keys and credentials

# Start all services (API + worker + Redis + Qdrant + PostgreSQL + n8n + monitoring)
docker-compose up --build
```

| Service | URL |
|---|---|
| API | http://localhost:8000 |
| API Docs (Swagger) | http://localhost:8000/docs |
| n8n Workflows | http://localhost:5678 |
| Grafana | http://localhost:3000 |
```bash
# Ingest physician's writing samples + medical guidelines into Qdrant
python scripts/index_knowledge_base.py
```

Loads documents from `knowledge_base/doctor_style/` and `knowledge_base/medical_guidelines/`, then chunks, embeds, and stores vectors in two separate Qdrant collections.
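For intuition, the chunking step can be sketched in a few lines. The project actually uses LlamaIndex's `SentenceSplitter`; the word-based windows, function name, and sizes below are hypothetical simplifications.

```python
# Simplified stdlib sketch of chunk-with-overlap (hypothetical; the real
# pipeline uses LlamaIndex's SentenceSplitter over sentences, not words).
def chunk_words(text: str, chunk_size: int = 64, overlap: int = 16) -> list[str]:
    """Split text into windows of chunk_size words, overlapping by `overlap`."""
    words = text.split()
    step = chunk_size - overlap
    # Stop before emitting a trailing window shorter than the overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, max(len(words) - overlap, 1), step)]

print(chunk_words("a b c d e f g h", chunk_size=5, overlap=2))
# ['a b c d e', 'd e f g h']
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from both neighbors.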
```
/draft anxiety management tips
/draft depression coping strategies telegram
```
Full workflow:

```
Slack /draft
  └─► n8n Webhook
        └─► POST /api/v1/draft → returns task_id immediately (< 500 ms)
              ├─► Taskiq generate_draft task
              │     ├── StyleMatcher → retrieves top-5 physician posts (Qdrant)
              │     ├── FactChecker → PubMed API + web scraping + Chain-of-Verification
              │     └── ContentGenerator (Claude 3.5 Sonnet → GPT-4o fallback)
              └─► Slack callback → Block Kit draft card
```
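The Claude-to-GPT-4o fallback in the last step can be sketched as an ordered-provider loop. The stubs below stand in for the real Anthropic/OpenAI clients; none of these names come from the project's actual `router` module.

```python
# Illustrative sketch of primary→fallback LLM routing (hypothetical names;
# stub callables stand in for the real Anthropic/OpenAI clients).
from collections.abc import Callable

def route(prompt: str, providers: list[tuple[str, Callable[[str], str]]]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first success."""
    errors = []
    for name, generate in providers:
        try:
            return name, generate(prompt)
        except Exception as exc:  # production code would catch narrower types and log
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

def claude_stub(prompt: str) -> str:
    raise TimeoutError("simulated outage")

def gpt4o_stub(prompt: str) -> str:
    return f"draft for: {prompt}"

print(route("anxiety tips", [("claude-3.5-sonnet", claude_stub), ("gpt-4o", gpt4o_stub)]))
# ('gpt-4o', 'draft for: anxiety tips')
```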
From the Slack draft card:
- Publish to Telegram / X / Threads: triggers the `publish_post` Taskiq task → `publisher_service.py` dispatches a webhook to n8n → n8n executes the platform-specific workflow (Telegram Bot API / Twitter API v2 / Threads API)
- Edit: opens a Slack modal with a full text editor + platform/schedule selector
- Regenerate: re-queues `generate_draft` with the same topic

Publishing architecture note: `publisher_service.py` is a thin dispatcher; it sends a structured webhook payload to n8n and tracks publication status in PostgreSQL. The actual social-platform API calls (auth, formatting, retry logic) live entirely in n8n workflows under `orchestration/n8n/workflows/`. To modify platform-specific publishing behavior, edit the n8n workflow, not the Python service.
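A minimal sketch of what a thin-dispatcher payload might look like, assuming a JSON webhook body. The actual field names and n8n webhook contract are not documented here, so everything below is a hypothetical illustration of the "Python builds the payload, n8n publishes" split.

```python
# Hypothetical payload builder for the dispatcher pattern described above;
# field names and the n8n webhook contract are illustrative assumptions.
import json
from datetime import datetime, timezone

def build_publish_payload(post_id: int, platform: str, text: str) -> str:
    if platform not in {"telegram", "x", "threads"}:
        raise ValueError(f"unsupported platform: {platform}")
    payload = {
        "post_id": post_id,
        "platform": platform,  # selects which n8n workflow branch runs
        "text": text,
        "requested_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(payload)  # body POSTed to the n8n webhook URL

body = json.loads(build_publish_payload(42, "telegram", "New post"))
print(body["platform"])  # telegram
```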
After publishing, the `vectorize_post` task automatically embeds the final approved text back into Qdrant (the `doctor_style` collection), so the system continuously learns the physician's evolving style.
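The feedback loop can be sketched with a stub embedder and an in-memory dict standing in for the real embedding model and the Qdrant `doctor_style` collection; all names here are hypothetical.

```python
# Stdlib sketch of the post-publish feedback loop (hypothetical names;
# a dict stands in for Qdrant, a hash for the real embedding model).
import hashlib

def embed_stub(text: str, dim: int = 8) -> list[float]:
    """Deterministic toy embedding: hash bytes scaled to [0, 1]."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

doctor_style: dict[int, dict] = {}  # stand-in for the Qdrant collection

def vectorize_post(post_id: int, approved_text: str) -> None:
    doctor_style[post_id] = {
        "vector": embed_stub(approved_text),
        "payload": {"text": approved_text},  # kept so style retrieval can quote it later
    }

vectorize_post(7, "Sleep hygiene matters more than most patients think.")
print(len(doctor_style[7]["vector"]))  # 8
```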
```bash
# Full test suite with coverage
make test

# Unit tests only
make test-unit

# Integration tests (requires running containers)
make test-integration
```

```bash
# Install dependencies
poetry install

# Run API server
poetry run uvicorn backend.api.main:app --reload

# Run Taskiq worker (2 processes, max 10 concurrent async tasks)
poetry run taskiq worker backend.workers.broker:broker --workers 2 --max-async-tasks 10
```

```bash
alembic revision --autogenerate -m "description"
alembic upgrade head
```

Overall coverage: 98% (4627 statements, 103 missed)
| Module | Coverage |
|---|---|
| `services/content_generator.py` | 100% |
| `services/draft_service.py` | 100% |
| `services/fact_checker.py` | 100% |
| `services/style_matcher.py` | 100% |
| `api/middleware/auth.py` | 100% |
| `api/middleware/error_handler.py` | 100% |
| `integrations/external/pubmed_client.py` | 100% |
| `integrations/llm/router.py` | 100% |
| `rag/pipelines/hybrid_search.py` | 100% |
| `rag/retrieval/knowledge_retriever.py` | 100% |
| `rag/retrieval/style_retriever.py` | 100% |
| `workers/tasks/generate_draft.py` | 100% |
| `workers/tasks/publish_post.py` | 100% |
| `workers/callbacks.py` | 100% |
| `api/routes/feedback.py` | 96% |
| `api/middleware/rate_limit.py` | 91% |
| `services/publisher_service.py` | 91% |
| `api/routes/drafts.py` | 40% |
| `integrations/external/web_scraper.py` | 38% |

`api/routes/drafts.py` (40%) and `web_scraper.py` (38%) are the remaining gaps; route integration tests and scraper HTTP mocking are the next testing targets.
Three pre-built Grafana dashboards:
| Dashboard | URL | Tracks |
|---|---|---|
| Backend Metrics | http://localhost:3000/d/backend_metrics | Request rate, latency (p95), error rate |
| LLM Costs | http://localhost:3000/d/llm_costs | Token usage, API calls, cost per platform |
| Taskiq Metrics | http://localhost:3000/d/taskiq_metrics | Queue depth, task duration, failure rate |
Prometheus alert rules configured for:
- Task failure rate > 5%/hour
- Queue depth > 100 tasks
- Task duration p95 > 60s
- LLM error rate > 10% in 5 minutes
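The thresholds above can be restated as a simple predicate. This is illustration only: the production alerts live in Prometheus rule files, and the metric names below are hypothetical.

```python
# Illustration only: restates the alert thresholds listed above as a
# Python predicate (production alerting is done by Prometheus rules).
def should_alert(metrics: dict) -> list[str]:
    checks = {
        "task_failure_rate": metrics["task_failure_rate"] > 0.05,   # > 5%/hour
        "queue_depth": metrics["queue_depth"] > 100,                # > 100 tasks
        "task_duration_p95_s": metrics["task_duration_p95_s"] > 60, # p95 > 60 s
        "llm_error_rate_5m": metrics["llm_error_rate_5m"] > 0.10,   # > 10% in 5 min
    }
    return [name for name, firing in checks.items() if firing]

print(should_alert({"task_failure_rate": 0.02, "queue_depth": 180,
                    "task_duration_p95_s": 12.0, "llm_error_rate_5m": 0.01}))
# ['queue_depth']
```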
| Document | Description |
|---|---|
| Architecture | System design and component interactions |
| API Spec | OpenAPI 3.0 specification |
| Taskiq Guide | Async worker patterns and configuration |
| Deployment | Production deployment guide |
| Runbook | Operational procedures and troubleshooting |
| ADR: Vector Store | Qdrant selection rationale |
| ADR: LLM Selection | Claude + GPT-4o fallback design |
| ADR: Taskiq vs Celery | Task queue decision record |
Four GitHub Actions workflows form a fully automated pipeline triggered on push to main:
| Workflow | Trigger | What it does |
|---|---|---|
| `lint.yml` | push / PR → main | Ruff linter, Ruff formatter check, Pyright type checker |
| `test.yml` | push / PR → main | `poetry install` → `cp .env.example .env` → `pytest` |
| `build.yml` | push → main | Builds and pushes 3 Docker images to GHCR (backend, worker, scheduler) tagged `latest` + commit SHA |
| `deploy.yml` | on `build.yml` success | SSH into VPS → `git pull origin main` → `bash scripts/deploy.sh` |
Pipeline flow on every merge to main:

```
push → main
  ├─► lint.yml (parallel)
  ├─► test.yml (parallel)
  └─► build.yml → pushes ghcr.io/<owner>/serotonin_script-{backend,worker,scheduler}
        └─► deploy.yml → SSH → git pull → deploy.sh
```
`deploy.yml` runs only if `build.yml` concluded with success (`if: github.event.workflow_run.conclusion == 'success'`). Required GitHub Secrets: `SERVER_HOST`, `SERVER_USER`, `SERVER_SSH_KEY`.
See docs/deployment.md for the full guide. Quick reference:
The production stack uses two Compose files layered together: `docker-compose.yml` (infrastructure services) and `infra/docker-compose.prod.yml` (application services).
```bash
# One-command deployment
bash scripts/deploy.sh
```

`deploy.sh` executes, in order:
1. Tears down existing application containers (preserves named volumes)
2. Builds new images from `infra/docker/Dockerfile.base` (multi-stage, non-root user `seratonin`)
3. Starts `postgres` + `redis` and waits for health checks
4. Runs Alembic migrations via `scripts/migrate.sh`
5. Brings up all services
| Service | Image | Port | Notes |
|---|---|---|---|
| `backend` | `Dockerfile.base` | 8001 | 2 Uvicorn workers, metrics disabled |
| `worker` | `Dockerfile.base` | 9000 | Taskiq worker, Prometheus metrics on :9000 |
| `scheduler` | `Dockerfile.base` | 9001 | Taskiq scheduler for cron tasks |
| `postgres` | `postgres:15-alpine` | internal | External named volume `docker_postgres_data` |
| `redis` | `redis:7.2-alpine` | internal | AOF persistence, external volume `docker_redis_data` |
| `qdrant` | `qdrant/qdrant:latest` | internal | External volume `docker_qdrant_data` |
| `n8n` | `n8nio/n8n:latest` | 5678 | External volume `docker_n8n_data` |
| `prometheus` | `prom/prometheus` | 9090 | Scrapes backend `:8001/metrics` and worker `:9000` |
| `grafana` | `grafana/grafana` | 3000 | Dashboards: backend, LLM costs, Taskiq |
| `loki` + `promtail` | Grafana stack | 3100 | Log aggregation from Docker socket |
`Dockerfile.base` uses a two-stage build:

```
Stage 1 (builder): python:3.13-slim
  └─ Poetry 2.0.1 exports requirements.txt (prod deps only)
Stage 2 (runtime): python:3.13-slim
  ├─ Non-root user: seratonin:seratonin
  ├─ Model cache dirs: /app/cache/huggingface, /app/cache/fastembed
  └─ Shared by: backend, worker, scheduler (different CMD per service)
```
All data volumes are declared as `external: true` with fixed names; they survive `docker-compose down` and must be pre-created on the host:

```bash
docker volume create docker_postgres_data
docker volume create docker_redis_data
docker volume create docker_qdrant_data
docker volume create docker_n8n_data
```

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
This project is licensed under the MIT License β see LICENSE for details.
- LlamaIndex for the RAG framework
- Taskiq for modern async-native task processing
- Qdrant for vector search with hybrid mode
Created by PyDevDeep