Eval-driven development
Anthropic - Demystifying evals for AI agents
(paper) Evaluation-Driven Development of LLM Agents: A Process Model and Reference Architecture