ML Lead · Team Data Voyagers · CMRIT Bengaluru, AI & Data Science · Class of 2027
I build AI systems that survive real-world complexity — not demos that collapse outside a notebook.
My stack is LangGraph pipelines, agentic RAG, production FastAPI backends, and LLMs wired to do actual work. Three national hackathon finals. Two open-source codebases debugged upstream. One operating principle:
Ship systems. Not slides.
# ══════════════════════════════════════════════════════
# SYSTEM DEFINITION — AnubhabPradhan.v2
# ══════════════════════════════════════════════════════
class AnubhabPradhan:
    role = "ML Engineer · AI Systems Builder"
    location = "Bengaluru, India"
    systems_built = [
        "Agentic Job Application Assistant → 7-node LangGraph StateGraph + Claude API",
        "DeepDive Video Intelligence → Multi-video RAG + Speaker Diarization",
        "F1 AI Race Engineer → FastF1 telemetry + XGBoost + LLM chat",
        "Multi-Cancer Detection → CNN + RF ensemble · 92% accuracy · SHAP",
        "Network Intrusion Detection → Random Forest on NSL-KDD · Streamlit",
    ]
    current_focus = ["F1 AI Race Engineer (Day 2: feature engineering)",
                     "LangChain upstream contributions",
                     "Open-source AI tooling"]
    domains = ["Agentic AI", "LLM Orchestration", "RAG",
               "Deep Learning", "Explainable AI"]
    mission = "Build AI that operates — not AI that impresses."
    mindset = "Systems thinking. Zero tolerance for ambiguity. Ship or iterate."
    open_to = "ML / AI Engineer Internships · FAANG · AI-native startups"

01 · Agentic Job Application Assistant
Problem — Job applications break at the tailoring step. Generic resumes lose to the ATS. Manual tailoring is slow and relies on guesswork.
What I built — A production full-stack system with a 7-node LangGraph StateGraph: JD Decomposer → Resume Analyzer → Resume Rewriter → Cover Letter Writer → Gap Analyzer → Interview Brief → Score Aggregator. FastAPI backend with Server-Sent Events streaming. Claude Sonnet via Anthropic API. PyMuPDF parsing. Dark editorial frontend with match-score dashboard and live before/after diff view.
Impact — Not a "generate cover letter" button. An orchestrated reasoning pipeline that understands skill gap, rewrites strategically, and delivers a complete application package — streamed live with full transparency.
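The node chain above can be sketched in plain Python. This is a hypothetical stand-in for the actual LangGraph StateGraph, not the project's code: node logic is stubbed, and the rewriter, cover-letter, and interview-brief nodes are omitted for brevity.

```python
from typing import Callable

# Each node is a function that reads and updates a shared state dict,
# mirroring how StateGraph nodes pass state along the edge order.
State = dict

def jd_decomposer(state: State) -> State:
    # Split the job description into discrete requirements (stubbed).
    state["requirements"] = [r.strip() for r in state["jd"].split(",")]
    return state

def resume_analyzer(state: State) -> State:
    # Match resume skills against the decomposed requirements.
    skills = set(state["resume_skills"])
    state["matched"] = [r for r in state["requirements"] if r in skills]
    return state

def gap_analyzer(state: State) -> State:
    state["gaps"] = [r for r in state["requirements"] if r not in state["matched"]]
    return state

def score_aggregator(state: State) -> State:
    state["match_score"] = round(
        100 * len(state["matched"]) / max(len(state["requirements"]), 1)
    )
    return state

# Linear edge order, following the README's node chain.
PIPELINE: list[Callable[[State], State]] = [
    jd_decomposer, resume_analyzer, gap_analyzer, score_aggregator,
]

def run(state: State) -> State:
    for node in PIPELINE:
        state = node(state)
    return state
```

The point of the shape: every node sees the whole state, so downstream nodes (gap analysis, scoring) can reason over upstream output without re-parsing.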
02 · DeepDive — AI Video Intelligence Platform
Problem — Video content is information-dense and unsearchable. You watch it linearly or lose it.
What I built — Full agentic pipeline: Groq Whisper for speaker-diarized transcription, visual frame intelligence for scene context, a custom NumPy + JSON vector store (no external DB; solved Windows ChromaDB compile issues), and a Hybrid Context RAG layer on top of LLaMA 3.3 70B — supporting multi-video cross-search in Hindi and English.
Impact — Fixed a core RAG failure where basic factual queries failed despite correct embeddings. The hybrid retrieval layer blends semantic search with metadata-constrained lookup, eliminating the blind spot.
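The store-plus-hybrid-lookup idea can be sketched as a minimal NumPy + JSON class. This is an illustrative sketch, not the project's actual API; names, shapes, and the filter semantics are assumptions.

```python
import json

import numpy as np

class MiniVectorStore:
    """Toy in-memory vector store: NumPy for vectors, JSON for persistence."""

    def __init__(self):
        self.vectors = []   # unit-normalized embedding vectors
        self.metadata = []  # parallel list of metadata dicts

    def add(self, vector, meta):
        v = np.asarray(vector, dtype=float)
        self.vectors.append(v / np.linalg.norm(v))  # normalize for cosine sim
        self.metadata.append(meta)

    def search(self, query, k=3, where=None):
        q = np.asarray(query, dtype=float)
        q = q / np.linalg.norm(q)
        scored = []
        for vec, meta in zip(self.vectors, self.metadata):
            # Hybrid step: hard metadata filter BEFORE semantic ranking, so a
            # factual query scoped to one video cannot be drowned out by
            # semantically similar chunks from other videos.
            if where and any(meta.get(key) != val for key, val in where.items()):
                continue
            scored.append((float(vec @ q), meta))
        scored.sort(key=lambda pair: -pair[0])
        return scored[:k]

    def save(self, path):
        # No external DB: plain JSON on disk.
        with open(path, "w") as f:
            json.dump({"vectors": [v.tolist() for v in self.vectors],
                       "metadata": self.metadata}, f)
```

The design choice worth noting: the metadata filter is a hard constraint, not a score boost, which is what closes the "correct embeddings, wrong answer" gap.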
03 · F1 AI Race Engineer · in progress
Problem — Formula 1 race strategy is opaque, decided in seconds using telemetry and institutional memory. No public tool reasons about it.
What I built — FastF1 telemetry ingestion → feature engineering pipeline → XGBoost tyre degradation and pit-stop prediction models → Claude API-powered conversational Race Engineer that explains strategy decisions in plain language, grounded in real session data — not hallucinations.
Impact — The LLM is not the brain. The ML model is. The LLM is the translator. That distinction is the architecture decision that makes this defensible.
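One feature from that pipeline can be sketched in isolation. This is a hypothetical stand-in using a plain least-squares fit; the actual project feeds engineered features like this into XGBoost, and the threshold below is an invented illustration, not a real strategy number.

```python
import numpy as np

def degradation_slope(lap_times_s):
    """Estimate tyre degradation as lap-time loss per lap (s/lap) over a stint.

    Fits a straight line to lap times vs. lap index; the slope is the
    degradation rate a downstream model (or race engineer) reasons about.
    """
    laps = np.arange(len(lap_times_s))
    slope, _intercept = np.polyfit(laps, np.asarray(lap_times_s, float), deg=1)
    return float(slope)

def pit_window_open(lap_times_s, threshold_s_per_lap=0.08):
    # Toy decision rule: flag a pit window once degradation exceeds the
    # threshold. Illustrative only; real strategy weighs track position,
    # tyre compound, fuel load, and traffic.
    return degradation_slope(lap_times_s) > threshold_s_per_lap
```

This is the grounding step: the LLM explains a number the model computed from telemetry, rather than inventing one.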
04 · Multi-Cancer Detection Platform
Problem — Cancer screening accuracy matters. Model accuracy alone is not enough — clinical adoption requires explainability.
What I built — CNN + Random Forest ensemble across 8 cancer types (92% accuracy). SHAP explainability layer exposing feature-level reasoning. Plotly Dash diagnostic dashboard. The ensemble fuses deep convolutional features with classical ML decision boundaries.
Impact — 92% accuracy is a number. SHAP explanations are trust. Clinicians adopt systems that show their reasoning — not black boxes.
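The fusion step of such an ensemble can be sketched as a weighted average of class probabilities. A minimal sketch under assumptions: the 0.6/0.4 weights and function names are illustrative, not the project's tuned values or API.

```python
import numpy as np

def fuse_predictions(cnn_probs, rf_probs, w_cnn=0.6):
    """Blend CNN and Random Forest class-probability vectors."""
    cnn = np.asarray(cnn_probs, dtype=float)
    rf = np.asarray(rf_probs, dtype=float)
    fused = w_cnn * cnn + (1.0 - w_cnn) * rf
    # Renormalize so the fused vector is still a probability distribution.
    return fused / fused.sum(axis=-1, keepdims=True)

def predict_class(cnn_probs, rf_probs, labels):
    fused = fuse_predictions(cnn_probs, rf_probs)
    return labels[int(np.argmax(fused))]
```

Because the fused vector stays a probability distribution, per-class SHAP attributions from each member can still be reported alongside the final call.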
AI · LLM · Agentic Systems
Systems · Backend · Infra
Data · Visualization
┌────────────────────────────────┬──────────────────────────────────────────────────────┐
│ REPO                           │ FINDING                                              │
├────────────────────────────────┼──────────────────────────────────────────────────────┤
│ langchain-ai / langchain       │ Isolated an independent else-branch logic fault in   │
│ langchain_core/tracers/        │ _evaluate_in_project() — evaluation callbacks were   │
│ evaluation.py · Issue #31802   │ silently skipped under specific condition sequences. │
├────────────────────────────────┼──────────────────────────────────────────────────────┤
│ AOSSIE / OpenVerifiableLLM     │ Analysed PRs #66 + #39. Found critical Merkle tree   │
│                                │ logic error producing false proof verification and   │
│                                │ a broken evaluation pipeline in the scoring layer.   │
└────────────────────────────────┴──────────────────────────────────────────────────────┘
I read the codebase. I traced the failure path. I documented the root cause. Not a typo fix — a logic audit.
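The failure class in the first finding can be shown with a stripped-down sketch. This is a hypothetical illustration of the bug pattern, NOT the actual LangChain source: when two conditions should fire independently, chaining them with elif silently drops one.

```python
def evaluate_buggy(run):
    """Buggy pattern: elif makes the conditions mutually exclusive."""
    fired = []
    if run.get("error"):
        fired.append("error_handler")
    elif run.get("reference"):
        # Skipped whenever an error is ALSO present: the evaluator callback
        # is silently dropped for that condition sequence.
        fired.append("evaluator")
    return fired

def evaluate_fixed(run):
    """Fixed pattern: each callback's condition is evaluated independently."""
    fired = []
    if run.get("error"):
        fired.append("error_handler")
    if run.get("reference"):
        fired.append("evaluator")
    return fired
```

Under the sequence where a run carries both an error and a reference, the buggy version fires only the error handler; the fixed version fires both.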
██████╗██╗ ██████╗ ███████╗███████╗██████╗
██╔════╝██║ ██╔═══██╗██╔════╝██╔════╝██╔══██╗
██║ ██║ ██║ ██║███████╗█████╗ ██║ ██║
██║ ██║ ██║ ██║╚════██║██╔══╝ ██║ ██║
╚██████╗███████╗╚██████╔╝███████║███████╗██████╔╝
╚═════╝╚══════╝ ╚═════╝ ╚══════╝╚══════╝╚═════╝
> signal_strength : MAXIMUM
> systems_shipped : CONFIRMED
> noise_generated : ZERO
> STATUS : This isn't a profile. It's a signal.