Agentic AI systems in production: RAG, LLM evaluation, and a GCP data platform, with real metrics.
- 70→90%validation
- −75%data cost
- 3.6Mrecords
- Multi-tenant conversational assistant with an agentic architecture (orchestrator and sub-agents) on Google ADK, with multi-provider routing via LiteLLM configurable without redeploy.
- Automated RAG quality evaluation subsystem (LLM-as-judge, multiple dimensions) with dataset generation at ingestion; fixed a defect that cut false negatives from 91% to nearly 0%.
- Document validation pipeline via multi-LLM consensus (GPT + Claude + Gemini) with OCR: validation pass rate from 70% to 90%.
- GCP data platform (BigQuery, Dataform, Dataflow; medallion architecture): 3.6M+ records processed with zero data loss, backfill 2× faster (34 → 17 min) and orphan-document cleanup; process cost reduced ~75% (from ~€1,400 to ~€300).