Search Agent
agent.py
Running Evaluations
Example Evaluation Output
Successful Validation
summary.json
Failed Validation
summary.json
Validation Configuration
evals.yaml
Key Features
- Tools Used Validation: Verify agents use the correct tools with proper parameters
- Tool Sequence Validation: Check that tools are used in the expected order
- Workflow Enforcement: Ensure agents follow expected business logic
- Semantic Tool Validation: Use an LLM to judge whether each tool call is appropriate for the task
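
The first two checks in the list above can be illustrated with a small, self-contained sketch. This is not the evaluation framework's actual API or the schema of evals.yaml, neither of which is shown here. The names `ToolCall`, `validate_tools_used`, and `validate_tool_sequence` are hypothetical, and the sketch assumes an agent run can be reduced to an ordered list of tool calls with their parameters.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCall:
    """Hypothetical record of one tool invocation made by an agent."""
    name: str
    params: dict[str, Any] = field(default_factory=dict)


def validate_tools_used(calls, expected):
    """Tools Used Validation (sketch).

    `expected` maps a tool name to the parameters it must have been
    called with. Returns a list of error strings; empty means valid.
    """
    errors = []
    for name, required in expected.items():
        matching = [c for c in calls if c.name == name]
        if not matching:
            errors.append(f"tool not used: {name}")
            continue
        # At least one call to this tool must carry every required parameter.
        ok = any(
            all(c.params.get(k) == v for k, v in required.items())
            for c in matching
        )
        if not ok:
            errors.append(f"tool used with unexpected parameters: {name}")
    return errors


def validate_tool_sequence(calls, expected_order):
    """Tool Sequence Validation (sketch).

    True if the names in `expected_order` appear in that relative order
    among the calls. Other calls may be interleaved between them.
    """
    remaining = iter(c.name for c in calls)
    for wanted in expected_order:
        # Consume the iterator until `wanted` is found; fail if exhausted.
        if not any(name == wanted for name in remaining):
            return False
    return True


# Example run of a hypothetical search agent.
calls = [
    ToolCall("search", {"query": "weather in Paris"}),
    ToolCall("summarize"),
]
used_errors = validate_tools_used(calls, {"search": {"query": "weather in Paris"}})
in_order = validate_tool_sequence(calls, ["search", "summarize"])
out_of_order = validate_tool_sequence(calls, ["summarize", "search"])
```

Here `used_errors` is an empty list, `in_order` is `True`, and `out_of_order` is `False`. The sequence check is deliberately a subsequence match, so extra tool calls between the expected ones do not fail it; a stricter exact-match comparison would be an equally reasonable design, depending on how rigid the workflow is meant to be.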