Creative Writing Agent
agent.py
Running Evaluations
Validation Configuration
evals.yaml
Example Evaluation Output
Successful Validation
summary.json
Failed Validation
summary.json
Key Features
- Content Validation: Verify agent responses include required keywords and exclude unwanted content
- Format Checking: Ensure responses follow expected structure and patterns
- Semantic Evaluation: Use an LLM to validate response quality and relevance
- Comprehensive Testing: Test both successful and failed validation scenarios
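
As a rough illustration of the content-validation and format-checking features above, here is a minimal Python sketch. It is not the evaluation framework's actual implementation: the names `ValidationResult` and `validate_response`, and the parameters `required`, `forbidden`, and `pattern`, are hypothetical and chosen only for this example. Semantic evaluation is left out because it would need a call to a real LLM.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class ValidationResult:
    """Hypothetical result type: overall verdict plus the reasons for any failure."""
    passed: bool
    failures: List[str] = field(default_factory=list)


def validate_response(
    text: str,
    required: Sequence[str] = (),
    forbidden: Sequence[str] = (),
    pattern: Optional[str] = None,
) -> ValidationResult:
    """Check keywords case-insensitively and, optionally, a regex format rule."""
    lowered = text.lower()
    failures = []

    # Content validation: every required keyword must appear.
    for keyword in required:
        if keyword.lower() not in lowered:
            failures.append(f"missing required keyword: {keyword}")

    # Content validation: no forbidden keyword may appear.
    for keyword in forbidden:
        if keyword.lower() in lowered:
            failures.append(f"contains forbidden keyword: {keyword}")

    # Format checking: the response must match the expected pattern, if given.
    if pattern is not None and re.search(pattern, text) is None:
        failures.append(f"does not match pattern: {pattern}")

    return ValidationResult(passed=not failures, failures=failures)


# Example input is made up for illustration.
story = "Once upon a time, a dragon guarded the hill."

ok = validate_response(
    story,
    required=["dragon"],
    forbidden=["robot"],
    pattern=r"^Once upon a time",
)
bad = validate_response(story, required=["castle"], forbidden=["dragon"])

print(ok.passed)      # True
print(bad.failures)   # two entries: one missing keyword, one forbidden keyword
```

Collecting every failure instead of stopping at the first one lets a single run show both a passing and a failing case with full reasons, which fits the goal of testing successful and failed validation scenarios.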