HA Conversational Benchmark
Loading…
Run
Auto-refresh (30s)
Copy report JSON
Compare runs
Enable compare
Run A
Run B
Model leaderboard
Click a row to filter tests by model
Model charts
Pass rate
Avg latency
Cost (USD)
Model pricing
Show pricing
Test filters
Outcome
All
Passed
Failed
Skipped
Model
Test file
All
Entity
All
Search
Latency min (ms)
Latency max (ms)
Cost min (USD)
Cost max (USD)
Flags
Hallucination
Clarification
Wrong entity
Any flag set
Clear filters
Group by
None
Test file
Entity
Tests
Clear model filter