This multicentre, cluster-randomised controlled trial evaluates two digital innovations for improving diagnostic accuracy and treatment quality in acute otitis media (AOM) management in Swedish primary care: a gamified training tool for physicians (AOM Diagnosis \[Dx\] Trainer) and an AI-based diagnostic support system. The study assesses diagnostic accuracy, adherence to evidence-based treatment guidelines, and comparative AI performance.
The trial is coordinated by the Västra Götaland Region (VGR) in collaboration with Umeå University and conducted across four Swedish regions: Västra Götaland, Västerbotten, Östergötland, and Skåne. Up to 20 primary care centres will participate, depending on centre size and recruitment rates. Centres are randomised to either the AOM Dx Trainer intervention or standard care (control).
Study design and participants:
Both physicians and patients are research participants. Physicians in the intervention arm complete the AOM Dx Trainer before patient inclusion begins, while control physicians receive no training. In both arms, each centre recruits consecutive patients with new-onset ear symptoms (duration ≤1 month) during an 8-week inclusion period. Consultations are conducted according to standard clinical practice. After each visit, research nurses confirm eligibility and consent and collect tympanic membrane images and tympanometry data solely for research purposes.
Interventions:
The AOM Dx Trainer is a gamified, case-based training tool designed to improve recognition of AOM and related conditions. Physicians classify anonymised tympanic images with symptom vignettes into diagnostic categories and receive immediate feedback. Training continues until a predefined performance threshold is reached.
The AI diagnostic tool, developed at Umeå University, uses convolutional neural networks (CNNs) to analyse tympanic images, with or without symptom and tympanometry data. AI analyses will be performed retrospectively in a laboratory setting and will not influence patient care.
Data collection:
After each visit, patients meet a research nurse who confirms eligibility and consent and records demographics, symptom severity (AOM-SOS v5), and potential complicating factors (e.g., severe pain despite analgesics, immunosuppression, previous ear surgery, cochlear implant). The physician's diagnosis and any antibiotic prescription (drug and duration) are documented for later comparison with guideline recommendations and expert panel consensus. Physician characteristics are recorded under pseudonymised IDs. Tympanic membrane images are captured with CE-marked video otoscopes (EarPenguin), and tympanometry data are recorded with tympanometry devices. All data are entered into case report forms (CRFs) and stored securely. Research data are not shared with treating physicians, which preserves real-world diagnostic conditions.
Data handling and ethics:
All data are pseudonymised. Code keys are securely stored within each region. The national coordinating centre in Gothenburg oversees data management and quality assurance. Data are stored on GDPR-compliant servers. Ethical approval has been granted by the Swedish Ethical Review Authority (Ref. 2025-03523-01).
Expert panel reference diagnoses:
Tympanic images and tympanometry results are reviewed retrospectively by an expert panel (two ENT specialists and one senior GP). Consensus diagnoses serve as the reference standard for assessing diagnostic accuracy among physicians and for benchmarking AI performance.
Primary outcome:
Diagnostic accuracy of tympanic membrane classification (normal, AOM, erythematous membrane without effusion, or otitis media with effusion) compared with expert consensus.
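As an illustration of how the primary outcome could be computed, the sketch below scores physician classifications against expert consensus labels. The category names, function names, and example data are hypothetical and are not taken from the trial's analysis plan.

```python
from collections import Counter

# Hypothetical labels for the four diagnostic categories named above.
CATEGORIES = ("normal", "AOM", "erythema_no_effusion", "OME")


def diagnostic_accuracy(physician, consensus):
    """Proportion of physician diagnoses that match the expert consensus."""
    if len(physician) != len(consensus):
        raise ValueError("physician and consensus must have the same length")
    if not physician:
        raise ValueError("at least one paired diagnosis is required")
    correct = sum(p == c for p, c in zip(physician, consensus))
    return correct / len(physician)


def confusion_counts(physician, consensus):
    """Count (consensus, physician) label pairs to show where errors occur."""
    return Counter(zip(consensus, physician))


# Made-up example data, for illustration only.
physician_dx = ["AOM", "normal", "OME", "AOM"]
consensus_dx = ["AOM", "normal", "AOM", "AOM"]

accuracy = diagnostic_accuracy(physician_dx, consensus_dx)
pairs = confusion_counts(physician_dx, consensus_dx)
```

In this made-up example, three of four diagnoses agree with the consensus. The pair counts show that the single error was an AOM case classified as otitis media with effusion.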
Secondary outcomes:
Adherence to national treatment guidelines; antibiotic prescribing rates and duration; and comparative diagnostic performance between physicians and AI configurations.
Sample size and power:
The sample size calculation targets the primary research question and uses a bivariate logistic regression as a proxy (outcome: correct vs. incorrect diagnosis; predictor: intervention status). Unpublished data from our group suggest approximately 50% diagnostic accuracy among physicians without AOM Dx Trainer training. We hypothesise an improvement to around 75% among trained physicians, which is considered a clinically relevant threshold. Assuming a two-sided alpha of 0.05 and 95% power, a minimum of 195 patients is required (G\*Power 3.1.9.7). Therefore, approximately 200 patients will be recruited, with roughly equal numbers in the intervention and control arms. This target aligns with the planned multicentre cluster-randomised design and is feasible within the established Swedish primary care research network.
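As a rough plausibility check on this figure, the sketch below applies the standard two-proportion z-test sample size formula to the same inputs (50% vs. 75%, two-sided alpha 0.05, 95% power). This is an approximation, not the logistic regression procedure run in G\*Power, and the function name is hypothetical. It should give a total close to, but not identical to, the 195 patients reported above.

```python
import math
from statistics import NormalDist


def n_per_arm(p_control, p_intervention, alpha=0.05, power=0.95):
    """Patients per arm for a two-sided two-proportion z-test.

    Uses the pooled variance under the null hypothesis and the
    unpooled variance under the alternative.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_intervention) / 2
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(
        p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    )
    return math.ceil((null_term + alt_term) ** 2 / (p_control - p_intervention) ** 2)


per_arm = n_per_arm(0.50, 0.75)
total = 2 * per_arm
```

With these inputs the approximation gives 95 patients per arm, or 190 in total, which is of the same order as the G\*Power result.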