ZenoXCare — marketing
Latest run · PASSED
The ZenoXCare evaluation harness gates every PR that touches the public query router or any AI-adjacent module. Datasets, budgets, and archived run results live in the repo as audit evidence. Lowering a floor requires a PR comment with rationale and a linked Sentry / Datadog dashboard.
| Feature | Cases in dataset | Budget pass / fail | Last run |
|---|---|---|---|
| public-query-router | 40 | 8 / 0 | Mon, 20 Apr 2026 |
Regression bar
Source: evals/public-query-router/budgets.json
Rationale: Calibrated to the current router's measured performance on the 40-row curated dataset (accuracy ~0.625, macroF1 ~0.611) with a small safety buffer below each measured value. The eval is now a regression bar: any PR that drops below these floors fails CI. Tightening these floors requires both (a) router improvements and (b) a PR comment with the rationale + linked Sentry/Datadog dashboard.
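The regression bar described in the rationale can be sketched as a simple floor check. This is a minimal illustration, not the harness's actual code: the floors and measured values come from this report, while the dict layout and the names `check_budgets` and `gate` are assumptions made for the example.

```python
# Floors and latest actuals as reported on this page. The dict layout is an
# assumption; the real schema of budgets.json is not shown in the report.
FLOORS = {"accuracy": 0.60, "macroF1": 0.55}
ACTUALS = {"accuracy": 0.625, "macroF1": 0.611}


def check_budgets(floors, actuals):
    """Return one row per metric, with headroom in percentage points."""
    rows = []
    for metric, floor in floors.items():
        actual = actuals[metric]
        rows.append({
            "metric": metric,
            "floor": floor,
            "actual": actual,
            "headroom_pts": round((actual - floor) * 100, 1),
            "passed": actual >= floor,
        })
    return rows


def gate(rows):
    """True when every metric meets its floor; CI would fail otherwise."""
    return all(row["passed"] for row in rows)


rows = check_budgets(FLOORS, ACTUALS)
for row in rows:
    status = "Pass" if row["passed"] else "Fail"
    print(f'{row["metric"]}: {row["headroom_pts"]:+.1f} pts ({status})')
print("gate passed:", gate(rows))
```

A metric exactly at its floor passes in this sketch; whether the real harness treats the floor as inclusive is not stated in the report.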
| Metric | Budget (floor) | Latest actual | Headroom | Status |
|---|---|---|---|---|
| accuracy | 60.0% | 62.5% | +2.5 pts | Pass |
| macroF1 | 55.0% | 61.1% | +6.1 pts | Pass |
| f1.compliance_standards_and_country_overlay | 70.0% | 85.7% | +15.7 pts | Pass |
| f1.facility_and_map_discovery | 50.0% | 57.1% | +7.1 pts | Pass |
| f1.ai_capabilities | 85.0% | 100.0% | +15.0 pts | Pass |
| f1.provider_verification_lookup | 40.0% | 50.0% | +10.0 pts | Pass |
| f1.independent_professional_workforce | 40.0% | 50.0% | +10.0 pts | Pass |
| f1.pricing_and_transparency | 40.0% | 50.0% | +10.0 pts | Pass |
Most recent archived run
Selected by recency from evals/_completed/. Top-level passed: true.
| Intent | F1 score | Budget | Status |
|---|---|---|---|
| compliance_standards_and_country_overlay | 85.7% | 70.0% | Pass |
| facility_and_map_discovery | 57.1% | 50.0% | Pass |
| ai_capabilities | 100.0% | 85.0% | Pass |
| provider_verification_lookup | 50.0% | 40.0% | Pass |
| independent_professional_workforce | 50.0% | 40.0% | Pass |
| pricing_and_transparency | 50.0% | 40.0% | Pass |
Any PR that drops below a floor in budgets.json fails CI. Lowering a floor requires governance approval in the PR. Runs are archived in evals/_completed/ so we can replay any past decision.
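Selecting the most recent archived run from evals/_completed/ could look roughly like the sketch below. The report does not say how recency is determined or how runs are stored, so several things here are assumptions: one JSON file per run, recency taken from the file modification time, and the helper name `latest_run`. Only the top-level `passed` field is taken from the report.

```python
import json
import os
import tempfile
from pathlib import Path


def latest_run(archive_dir):
    """Return (path, parsed JSON) for the newest run file, or None if empty.

    Assumption: recency is the file's modification time. The real harness
    may instead use a timestamp in the file name or inside the JSON.
    """
    files = sorted(Path(archive_dir).glob("*.json"),
                   key=lambda p: p.stat().st_mtime)
    if not files:
        return None
    newest = files[-1]
    return newest, json.loads(newest.read_text(encoding="utf-8"))


# Demo against a throwaway directory standing in for evals/_completed/.
# The file names and the "label" field are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    for i, name in enumerate(["run-a.json", "run-b.json"]):
        path = Path(tmp) / name
        path.write_text(json.dumps({"passed": True, "label": name}),
                        encoding="utf-8")
        os.utime(path, (1_000 + i, 1_000 + i))  # force distinct mtimes
    newest_path, newest = latest_run(tmp)
    print(newest_path.name, newest["passed"])
```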
How much do you charge for a telemedicine visit in Ghana?
Expected: pricing_and_transparency · Actual: platform_facts (conf 0.069) · Tags: pricing,ghana
Show me your fee schedule and refunds policy
Expected: pricing_and_transparency · Actual: governance_and_evidence (conf 0.036) · Tags: fees,refunds
What is your no-show rate per provider and how is it calculated?
Expected: care_completion_quality · Actual: platform_facts (conf 0.029) · Tags: quality,metrics
Do you comply with HIPAA? What about GDPR?
Expected: compliance_standards_and_country_overlay · Actual: platform_facts (conf 0.038) · Tags: hipaa,gdpr
List hospitals on your platform with their addresses
Expected: facility_and_map_discovery · Actual: platform_facts (conf 0.038) · Tags: facility
How does your discovery / search rank providers fairly?
Expected: care_reach_public · Actual: facility_and_map_discovery (conf 0.056) · Tags: discovery
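For readers who want to see how expected/actual pairs like those above feed the reported metrics, here is a minimal sketch of accuracy, per-intent F1, and macro F1. It is not the harness's implementation. The first mismatch in the sample mirrors a case listed above (pricing_and_transparency routed to platform_facts); the remaining pairs are hypothetical. Averaging macro F1 over every label that appears as either expected or actual is also an assumption.

```python
from collections import Counter


def accuracy(pairs):
    """Fraction of cases where the routed intent equals the expected one."""
    return sum(expected == actual for expected, actual in pairs) / len(pairs)


def per_intent_f1(pairs):
    """F1 per intent, using F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for expected, actual in pairs:
        if expected == actual:
            tp[expected] += 1
        else:
            fn[expected] += 1  # the expected intent was missed
            fp[actual] += 1    # the predicted intent was wrong
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-intent F1 scores."""
    return sum(scores.values()) / len(scores)


# (expected, actual) pairs; only the second row comes from the report.
pairs = [
    ("pricing_and_transparency", "pricing_and_transparency"),
    ("pricing_and_transparency", "platform_facts"),
    ("ai_capabilities", "ai_capabilities"),
    ("facility_and_map_discovery", "facility_and_map_discovery"),
]

scores = per_intent_f1(pairs)
print("accuracy:", accuracy(pairs))
print("macroF1:", round(macro_f1(scores), 3))
```

Because macro F1 weights every intent equally, a single intent with few cases and a low F1 can pull the aggregate down noticeably, which is one reason to track per-intent floors alongside the overall numbers.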