AI Agent Evaluation & QA — Test Before You Deploy

Evaluation frameworks for production AI systems: RAGAS metrics, LLM-as-judge patterns, answer relevance scoring, faithfulness measurement, hallucination detection, regression test suites, and human evaluation workflows.
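
As a rough illustration of how faithfulness measurement and a regression test suite can fit together, here is a minimal sketch. It assumes faithfulness is the fraction of claims in an answer that the retrieved context supports, which is the general idea behind RAGAS-style faithfulness. It is not the RAGAS library API. All names (EvalCase, split_claims, substring_judge, run_regression) are hypothetical, the sentence splitter is deliberately naive, and the substring check only stands in for an LLM-as-judge call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One regression case: a question, the retrieved context, and the model's answer."""
    question: str
    context: str
    answer: str


def split_claims(answer: str) -> List[str]:
    # Naive assumption: each sentence is one claim. A real pipeline would
    # typically ask an LLM to extract atomic claims instead.
    return [s.strip() for s in answer.split(".") if s.strip()]


def faithfulness(case: EvalCase, is_supported: Callable[[str, str], bool]) -> float:
    """Fraction of answer claims that the judge says are supported by the context."""
    claims = split_claims(case.answer)
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if is_supported(claim, case.context))
    return supported / len(claims)


def substring_judge(claim: str, context: str) -> bool:
    # Hypothetical stand-in for an LLM-as-judge call: a claim counts as
    # supported only if it appears verbatim (case-insensitive) in the context.
    return claim.lower() in context.lower()


def run_regression(
    cases: List[EvalCase],
    judge: Callable[[str, str], bool],
    threshold: float = 0.8,
) -> List[EvalCase]:
    """Return the cases whose faithfulness score falls below the threshold."""
    return [case for case in cases if faithfulness(case, judge) < threshold]


CONTEXT = "Paris is the capital of France. It lies on the Seine."

CASES = [
    EvalCase("Where is Paris?", CONTEXT, "It lies on the Seine."),
    EvalCase(
        "Tell me about Paris.",
        CONTEXT,
        "Paris is the capital of France. It has 10 million residents.",
    ),
]

failures = run_regression(CASES, substring_judge)
```

In this sketch the second case fails the gate because only one of its two claims is found in the context. Swapping substring_judge for a function that calls a judge model keeps the rest of the suite unchanged, which is the main reason the judge is passed in as a parameter.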