
Interactive Evaluation of an Adaptive-Questioning Symptom Checker Using Standardized Clinical Vignettes

medRxiv. DOI: 10.1101/2025.08.21.25333628

Objective

To evaluate the triage performance and history-taking quality of an adaptive-questioning symptom checker (CareRoute) using an interactive protocol on standardized clinical vignettes (Semigran et al., BMJ 2015; 45 cases).

Methods

Each session began with only the presenting complaint; CareRoute asked follow-up questions adaptively, and the evaluator answered concisely per the vignette. At the end of questioning, CareRoute issued a triage recommendation. We compared CareRoute’s issued triage with the reference triage and computed history-taking quality from normalized features derived from each vignette’s Condensed Format. History-taking quality comprised (i) elicitation coverage—the percentage of a vignette’s normalized features obtained through questioning, and (ii) elicitation fraction—the proportion of surfaced normalized features (elicited or volunteered) that were obtained through questioning. Primary outcomes were triage concordance and history-taking quality; the secondary outcome was user burden (time spent answering questions). We did not evaluate possible diagnoses, though CareRoute issues them.
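
The two history-taking metrics defined above can be illustrated with a minimal sketch. The function name, the feature strings, and the rule that a volunteered feature is never also counted as elicited are assumptions made for illustration; the abstract does not describe the actual scoring code.

```python
def elicitation_metrics(vignette_features, elicited, volunteered):
    """Return (coverage, fraction) for one session.

    coverage: share of the vignette's normalized features obtained by questioning.
    fraction: share of surfaced features (elicited or volunteered) obtained by questioning.
    """
    vignette_features = set(vignette_features)
    # Only features that belong to the vignette's normalized set are scored.
    volunteered = set(volunteered) & vignette_features
    # Assumption: a feature that was volunteered is not also counted as elicited.
    elicited = (set(elicited) & vignette_features) - volunteered
    surfaced = elicited | volunteered
    coverage = len(elicited) / len(vignette_features) if vignette_features else 0.0
    fraction = len(elicited) / len(surfaced) if surfaced else 0.0
    return coverage, fraction


# Hypothetical vignette with six normalized features (not taken from the study).
features = {"chest pain", "radiates to arm", "sweating",
            "smoker", "duration 2h", "family history"}
volunteered = {"chest pain"}  # stated in the presenting complaint
elicited = {"radiates to arm", "sweating", "smoker", "duration 2h"}

coverage, fraction = elicitation_metrics(features, elicited, volunteered)
print(f"coverage={coverage:.0%}, fraction={fraction:.0%}")
```

In this made-up session, four of six features are obtained through questioning (coverage of about 67%), and four of the five surfaced features come from questioning (fraction of 80%).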

Results

Exact 3-tier triage concordance was 88.9% (40/45; 95% CI 76.5–95.2%). Elicitation coverage had a median of 67% (IQR 60–71%), and elicitation fraction had a median of 70% (IQR 62–75%). CareRoute asked a median of 19 questions overall (IQR 16–20), with urgency-conditioned questioning: Emergency Care median 10 questions (IQR 4–14), Doctor Visit median 19 questions (IQR 18–20), Self Care median 19 questions (IQR 17–20).
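
As a check on the arithmetic, the concordance figure and its interval can be recomputed from the counts. The abstract does not name the interval method; the sketch below assumes a Wilson score interval, which reproduces the reported bounds. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


concordant, total = 40, 45  # counts reported in the Results
low, high = wilson_interval(concordant, total)
print(f"concordance={concordant / total:.1%}, 95% CI {low:.1%} to {high:.1%}")
# concordance=88.9%, 95% CI 76.5% to 95.2%
```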

Conclusions

In an interactive, vignette-constrained evaluation starting from only the presenting complaint, CareRoute achieved high 3-tier triage concordance (88.9%) with no under-triage on Emergency-reference vignettes. It elicited most normalized features (median elicitation coverage 67%; median elicitation fraction 70%), with acceptable user burden via urgency-conditioned questioning.
