A conversational AI system that answers student loan questions using verified federal documentation. Built with Claude (Anthropic), Pinecone vector search, Voyage AI embeddings, and a custom quality evaluation framework.
Live demo: huggingface.co/spaces/DronA23/student-loan-rag-chatbot
Student loan information is complex, scattered, and hard to find.
When you ask a general AI "What's the interest rate for my student loan?", it gives a generic answer. It does not know your loan type, your school, or the current 2024-2025 rates. It might even make something up.
This is called hallucination: an AI model confidently stating information that is wrong.
For student loans, wrong information is dangerous. Wrong repayment advice could cost someone thousands of dollars.
RAG stands for Retrieval-Augmented Generation.
Instead of asking AI to guess, we give it the right documents first, then ask it to answer.
Without RAG:
User: "What's the interest rate for subsidized loans?"
AI: "It's usually around 3-7%..." (guessing, possibly wrong)
With RAG:
User: "What's the interest rate for subsidized loans?"
System: [finds interest_rates.txt, 92% match]
AI: "Based on Document 1, the rate is 6.53% for 2024-2025."
(grounded in real data, cited source)
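The "give it the right documents first" step can be sketched as plain prompt assembly. This is a minimal illustration, not the project's actual prompt: the function name `build_prompt` and the instruction wording are assumptions.

```python
def build_prompt(question: str, documents: list[str]) -> str:
    """Assemble a prompt that places retrieved documents ahead of the question.

    Hypothetical helper for illustration; the real prompt template may differ.
    """
    # Number documents from 1 so the model can cite "Document 1", "Document 2", ...
    numbered = "\n\n".join(
        f"Document {i}:\n{text}" for i, text in enumerate(documents, start=1)
    )
    return (
        "Answer using only the documents below. "
        "Cite the document number you relied on. "
        "If the documents do not contain the answer, say so.\n\n"
        f"{numbered}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    docs = ["Direct Subsidized Loans (undergraduate): 6.53% for 2024-2025."]
    print(build_prompt("What's the interest rate for subsidized loans?", docs))
```

Telling the model to admit when the documents lack the answer is one common way to discourage guessing.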
- Affects 43+ million borrowers in the United States
- Interest rates reset every year, so static AI knowledge goes stale
- High stakes: wrong advice means real financial harm
- Rich public data from Federal Student Aid and StudentLoans.gov
- Mirrors real-world fintech and lending AI use cases
flowchart TD
A([User Question]) --> B[Generate Embedding\nConvert text to numbers]
B --> C[(Vector Database\nPinecone Cloud)]
C --> D[Retrieve Top Documents\nMost similar to question]
D --> E[Build Structured Prompt\nDocuments + Question + Instructions]
E --> F[Claude Sonnet 4.6\nAnthropic API]
F --> G([Answer + Sources])
G --> H[Evaluate Quality\nFaithfulness + Relevance + Score]
style A fill:#4CAF50,color:#fff
style F fill:#FF6B35,color:#fff
style G fill:#2196F3,color:#fff
style H fill:#9C27B0,color:#fff
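The "Retrieve Top Documents" step ranks stored vectors by similarity to the question vector. In this project Pinecone does that ranking server-side; the sketch below only shows the underlying idea in pure Python, with hypothetical names (`cosine_similarity`, `top_k`) and toy two-dimensional vectors.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 5):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy vectors; real embeddings have hundreds or thousands of dimensions.
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], docs, k=2))
```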
sequenceDiagram
participant U as User
participant R as RAG Agent
participant V as Pinecone
participant C as Claude API
participant E as Evaluator
U->>R: "What is the interest rate?"
R->>R: Convert question to embedding vector (Voyage AI)
R->>V: Search for similar documents
V-->>R: Return top 5 matched documents
R->>R: Build structured prompt
R->>C: Send prompt (docs + question)
C-->>R: Answer with citations
R->>E: Evaluate response quality
E-->>R: Faithfulness 88% + Overall 76%
R-->>U: Answer + Sources + Quality Score
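The request flow above can be read as one function that calls four collaborators in order. The sketch below is an assumption about shape only: `answer_question` and its parameters are hypothetical, and the real `rag_agent.py` may be organized differently. Passing the collaborators in as callables keeps the example runnable without any API keys.

```python
from typing import Callable


def answer_question(
    question: str,
    embed: Callable[[str], list[float]],
    search: Callable[[list[float], int], list[tuple[str, float]]],
    generate: Callable[[str, list[str]], str],
    evaluate: Callable[[str, list[str], str], dict],
    top_k: int = 5,
) -> dict:
    """Run embed -> search -> generate -> evaluate and bundle the results."""
    vector = embed(question)                       # question text -> embedding
    matches = search(vector, top_k)                # [(document text, similarity), ...]
    context = [text for text, _score in matches]   # keep only the text for the prompt
    answer = generate(question, context)           # LLM call with docs + question
    quality = evaluate(question, context, answer)  # faithfulness, relevance, ...
    return {"answer": answer, "sources": context, "quality": quality}


if __name__ == "__main__":
    # Stub collaborators stand in for Voyage AI, Pinecone, Claude, and the evaluator.
    result = answer_question(
        "What is the interest rate?",
        embed=lambda text: [float(len(text))],
        search=lambda vec, k: [("stub document", 0.9)][:k],
        generate=lambda q, ctx: f"stub answer from {len(ctx)} document(s)",
        evaluate=lambda q, ctx, a: {"overall": 0.0},
    )
    print(result)
```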
flowchart LR
A[RAG Response] --> B[Faithfulness\n40% weight]
A --> C[Relevance\n30% weight]
A --> D[Answer Relevance\n30% weight]
B --> E{Overall Score}
C --> E
D --> E
E --> F{Grade}
F -->|> 75%| G[Excellent]
F -->|60-75%| H[Good]
F -->|45-60%| I[Fair]
F -->|< 45%| J[Poor]
style G fill:#4CAF50,color:#fff
style H fill:#2196F3,color:#fff
style I fill:#FF9800,color:#fff
style J fill:#F44336,color:#fff
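The weighting and grading above amount to a weighted sum followed by a threshold lookup. The sketch below is illustrative, not the code in `evaluation.py`: the function names are hypothetical, and because the diagram leaves the exact boundaries ambiguous (is 60% Good or Fair?), it assumes each lower bound is inclusive and that Excellent requires strictly more than 75%.

```python
# Weights taken from the diagram; they sum to 1.0.
WEIGHTS = {"faithfulness": 0.40, "relevance": 0.30, "answer_relevance": 0.30}


def overall_score(metrics: dict[str, float]) -> float:
    """Weighted sum of the three metrics, each expressed in percent."""
    return sum(weight * metrics[name] for name, weight in WEIGHTS.items())


def grade(score: float) -> str:
    """Map a percent score to a grade (boundary handling is an assumption)."""
    if score > 75:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 45:
        return "Fair"
    return "Poor"


if __name__ == "__main__":
    # Metric values reported in this README's results table.
    reported = {"faithfulness": 88.4, "relevance": 61.2, "answer_relevance": 75.0}
    score = overall_score(reported)
    print(round(score, 1), grade(score))
```

With the reported metrics this gives 0.40 × 88.4 + 0.30 × 61.2 + 0.30 × 75.0 = 76.22, which rounds to the 76.2% overall score and falls in the Excellent band.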
Tested against 8 real student loan questions:
| Metric | Score | Meaning |
|---|---|---|
| Overall Quality | 76.2% | Combined quality score |
| Faithfulness | 88.4% | Low hallucination rate |
| Relevance | 61.2% | Uses source documents |
| Answer Relevance | 75.0% | Actually answers the question |
Verdict: EXCELLENT, Production Ready
| Question | Score |
|---|---|
| Types of federal student loans | 82% |
| Interest rate for subsidized loans | 82% |
| Income-driven repayment plans | 74% |
| Eligibility requirements | 85% |
| Subsidized vs unsubsidized | 64% |
| Public Service Loan Forgiveness | 70% |
| Getting a loan with bad credit | 71% |
| When to start repaying | 83% |
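As a quick sanity check, the per-question scores can be averaged and the weakest question identified. The numbers are copied from the table above; the variable names are illustrative.

```python
from statistics import mean

# Per-question scores (percent) copied from the table above.
QUESTION_SCORES = {
    "Types of federal student loans": 82,
    "Interest rate for subsidized loans": 82,
    "Income-driven repayment plans": 74,
    "Eligibility requirements": 85,
    "Subsidized vs unsubsidized": 64,
    "Public Service Loan Forgiveness": 70,
    "Getting a loan with bad credit": 71,
    "When to start repaying": 83,
}

average = mean(QUESTION_SCORES.values())
weakest = min(QUESTION_SCORES, key=QUESTION_SCORES.get)

if __name__ == "__main__":
    print(f"average: {average:.1f}%")
    print(f"weakest: {weakest} ({QUESTION_SCORES[weakest]}%)")
```

The mean of the rounded per-question scores is about 76.4%, close to the reported 76.2% overall. The small gap is presumably due to rounding in the table, though the README does not say so.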
llm-rag-chatbot/
├── src/
│ ├── llm.py # Claude API wrapper + Voyage AI embeddings
│ ├── vector_db.py # Pinecone cloud integration (Phase 3)
│ ├── vector_db_mock.py # In-memory vector store (Phase 1-2 baseline)
│ ├── rag_agent.py # RAG orchestrator
│ └── evaluation.py # Quality metrics (faithfulness, relevance)
├── data/student_loans/ # 8 verified federal documents
├── app.py # Gradio web UI (local + HF Spaces dual-mode)
├── lambda_handler.py # AWS Lambda entry point
├── Dockerfile # Lambda container image
├── cloudwatch_config.json # CloudWatch alarms and dashboard config
├── requirements_spaces.txt # Lightweight deps for Hugging Face Spaces
├── setup_pinecone.py # One-time document ingestion script
├── test_phase1_step1.py # Tests response generation
├── test_phase1_step2.py # Tests full RAG pipeline
├── test_phase2_step1.py # Tests evaluation metrics
├── test_e2e.py # End-to-end system test (8 questions)
└── .env.example # Required environment variables
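For readers curious what an in-memory baseline with hash-based embeddings can look like, here is a minimal sketch. It is not the contents of `vector_db_mock.py`: the names `hash_embed` and `InMemoryStore`, the 64-dimension size, and the token-hashing scheme are all assumptions made for illustration.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic bag-of-words vector: each token bumps one hashed slot."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # hashlib is stable across runs, unlike the built-in hash() for strings.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        slot = int.from_bytes(digest[:4], "big") % dim
        vec[slot] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class InMemoryStore:
    """Tiny vector store: keeps (id, vector) pairs in a list and scans them."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, text: str) -> None:
        self._items.append((doc_id, hash_embed(text)))

    def query(self, text: str, k: int = 5) -> list[tuple[str, float]]:
        q = hash_embed(text)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(doc_id, sum(a * b for a, b in zip(q, v))) for doc_id, v in self._items]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    store = InMemoryStore()
    store.add("rates", "interest rates for subsidized loans")
    store.add("pslf", "public service loan forgiveness")
    print(store.query("interest rates for subsidized loans", k=1))
```

Hash-based vectors only match on shared words, not meaning. That makes them a reasonable baseline for wiring up the pipeline, with semantic embeddings swapped in later.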
- Python 3.11+
- Anthropic API key
- Pinecone API key
- Voyage AI API key
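A small pre-flight check can catch a missing key before any API call is made. The sketch below assumes the three keys are exposed under the environment variable names used in the deployment commands in this README (`ANTHROPIC_API_KEY`, `PINECONE_API_KEY`, `VOYAGE_API_KEY`); the helper name `missing_keys` is hypothetical, and loading `.env` into the environment is left to whatever mechanism the project uses.

```python
import os
from typing import Mapping, Optional

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "PINECONE_API_KEY", "VOYAGE_API_KEY")


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```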
# 1. Clone the repo
git clone https://github.com/drona23/llm-rag-chatbot.git
cd llm-rag-chatbot
# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure environment variables
cp .env.example .env
# Edit .env with your API keys
# 5. Upload documents to Pinecone (one time)
python setup_pinecone.py
# 6. Run end-to-end test
python test_e2e.py
# 7. Launch the web UI
python app.py
Public demo: huggingface.co/spaces/DronA23/student-loan-rag-chatbot
The Gradio interface runs locally at http://localhost:7860.
Layout:
- Left panel: Chat interface with example questions
- Right panel: Live source documents with cosine similarity scores and confidence level per response
source venv/bin/activate
python app.py
The app supports two modes controlled by the RAG_API_URL environment variable:
| Mode | How to activate | What runs |
|---|---|---|
| Local | No RAG_API_URL set | Full RAG agent in-process |
| Cloud | RAG_API_URL=https://... | Forwards requests to AWS Lambda |
The Hugging Face Spaces deployment uses cloud mode. Only gradio and requests are installed there. The heavy ML dependencies (Pinecone, Voyage AI, Anthropic) run entirely on Lambda.
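The mode switch can be sketched as a function that inspects `RAG_API_URL` once and returns the backend to call. This is an illustration of the idea rather than the code in `app.py`: `pick_backend` and both backend signatures are hypothetical.

```python
import os
from typing import Callable, Mapping, Optional


def pick_backend(
    local_backend: Callable[[str], dict],
    cloud_backend: Callable[[str, str], dict],
    env: Optional[Mapping[str, str]] = None,
) -> Callable[[str], dict]:
    """Return a message -> response callable based on RAG_API_URL.

    local_backend(message) runs the RAG agent in-process.
    cloud_backend(api_url, message) forwards the message over HTTPS.
    """
    env = os.environ if env is None else env
    api_url = env.get("RAG_API_URL", "").strip()
    if api_url:
        # Cloud mode: bind the URL so callers only pass the message.
        return lambda message: cloud_backend(api_url, message)
    return local_backend


if __name__ == "__main__":
    respond = pick_backend(
        local_backend=lambda msg: {"mode": "local", "message": msg},
        cloud_backend=lambda url, msg: {"mode": "cloud", "url": url, "message": msg},
    )
    print(respond("What is the interest rate for subsidized loans?"))
```

Deciding the mode in one place means the heavy dependencies only need to be imported inside the local backend, which fits the lightweight Spaces install described above.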
flowchart TD
U([User / App]) --> AG[API Gateway\nPublic HTTPS endpoint]
AG --> LM[AWS Lambda\nlambda_handler.py]
LM --> EV[Environment Variables\nAPI Keys stored securely]
LM --> CW[CloudWatch\nLogs + Metrics + Alarms]
LM --> PC[(Pinecone\nVector Database)]
LM --> AN[Claude API\nAnthropic]
style U fill:#4CAF50,color:#fff
style AG fill:#FF9800,color:#fff
style LM fill:#2196F3,color:#fff
style PC fill:#9C27B0,color:#fff
style AN fill:#FF6B35,color:#fff
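The API Gateway to Lambda hop in the diagram can be sketched as a handler that parses the request body and wraps the answer in an HTTP-style response. This is a hedged illustration, not the contents of `lambda_handler.py`: it assumes API Gateway's Lambda proxy integration (where `event["body"]` is a JSON string), and `handle`, `_response`, and `answer_fn` are hypothetical names.

```python
import json
from typing import Callable


def _response(status: int, payload: dict) -> dict:
    """Shape a dict the way the Lambda proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handle(event: dict, answer_fn: Callable[[str], dict]) -> dict:
    """Validate the request and delegate to answer_fn (the RAG agent)."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "Request body must be valid JSON."})

    message = body.get("message", "") if isinstance(body, dict) else ""
    if not isinstance(message, str) or not message.strip():
        return _response(400, {"error": "Field 'message' is required."})

    return _response(200, answer_fn(message.strip()))


if __name__ == "__main__":
    event = {"body": json.dumps({"message": "What is the interest rate?"})}
    print(handle(event, lambda msg: {"response": f"stub answer to: {msg}"}))
```

A real entry point would be a thin `lambda_handler(event, context)` wrapper that calls `handle` with the actual agent.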
| Feature | Benefit |
|---|---|
| Serverless | No server to manage, auto-scales |
| Pay per request | Zero cost when idle |
| Container image | Supports large dependencies (10GB limit) |
| API Gateway | Public HTTPS URL instantly |
| CloudWatch | Built-in logging, alarms, and dashboards |
# 1. Build container image
docker build -t rag-chatbot .
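# 2. Authenticate Docker to Amazon ECR, then tag and push
aws ecr create-repository --repository-name rag-chatbot
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account>.dkr.ecr.us-east-1.amazonaws.com
docker tag rag-chatbot:latest <account>.dkr.ecr.us-east-1.amazonaws.com/rag-chatbot:latest
docker push <account>.dkr.ecr.us-east-1.amazonaws.com/rag-chatbot:latest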
# 3. Create Lambda function from container
aws lambda create-function \
--function-name rag-chatbot \
--package-type Image \
--code ImageUri=<account>.dkr.ecr.us-east-1.amazonaws.com/rag-chatbot:latest \
--role arn:aws:iam::<account>:role/lambda-execution-role
# 4. Set environment variables (API keys stored securely)
aws lambda update-function-configuration \
--function-name rag-chatbot \
--environment Variables="{ANTHROPIC_API_KEY=...,PINECONE_API_KEY=...,VOYAGE_API_KEY=...}"
# 5. Create API Gateway endpoint
aws apigateway create-rest-api --name "StudentLoanRAG"
POST https://shzjgfckxe.execute-api.us-east-1.amazonaws.com/prod/chat
Content-Type: application/json
{
"message": "What is the interest rate for subsidized loans?"
}
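The request above could be sent from Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the endpoint URL is the one listed in this README and may not stay live, and the endpoint is assumed to return a JSON object.

```python
import json
import urllib.request

# Endpoint copied from this README; availability is not guaranteed.
API_URL = "https://shzjgfckxe.execute-api.us-east-1.amazonaws.com/prod/chat"


def build_chat_request(message: str, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request with a JSON body."""
    payload = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(message: str, url: str = API_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_chat_request(message, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_chat_request("What is the interest rate for subsidized loans?")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Splitting request construction from sending makes the payload easy to inspect without touching the network.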
Response:
{
"response": "Based on Document 1, the rate is 6.53% for 2024-2025...",
"sources": ["Based on the 2024-2025 academic year..."],
"confidence": 0.92,
"retrieval_scores": [0.94, 0.91, 0.88, 0.85, 0.82]
}
timeline
title Project Roadmap
Phase 1 : Core RAG System
: Claude API integration
: Mock vector database
: RAG agent orchestrator
Phase 2 : Evaluation Framework
: 3 quality metrics
: 8 student loan documents
: E2E testing at 76.2% score
Phase 3 : Real Vector DB
: Pinecone cloud integration
: Voyage AI semantic embeddings
: Semantic search replacing mock
Phase 4 : Cloud Deployment
: AWS Lambda container image
: API Gateway public endpoint
: CloudWatch monitoring and alarms
: Gradio web UI for live demos
: Hugging Face Spaces public demo
Phase 5 : Live Data
: Firecrawl integration
: Auto-sync from StudentLoans.gov
: Weekly rate updates
- Phase 1: Core RAG (LLM + Vector DB + Agent)
- Phase 2: Evaluation framework + 8 documents
- Phase 3: Real Pinecone + Voyage AI embeddings
- Phase 4: AWS Lambda + API Gateway + CloudWatch + Gradio UI
- Phase 5: Live data via Firecrawl + weekly sync
| Component | Phase 1-2 | Phase 3+ |
|---|---|---|
| LLM | Claude Sonnet 4.6 | Claude Sonnet 4.6 |
| Vector DB | MockVectorStore | Pinecone (cloud) |
| Embeddings | Hash-based mock | Voyage AI (semantic) |
| Deployment | Local | AWS Lambda + Docker |
| UI | None | Gradio web interface |
| Data | Static .txt files | Static .txt (Firecrawl planned) |
| Language | Python 3.11+ | Python 3.11+ |
MIT