Computer-Aided Diagnosis has shown stellar performance in providing accurate medical diagnoses across multiple testing modalities (medical images, electrophysiological signals, etc.). While this field has typically focused on fully harvesting the signal provided by a single (and generally extremely reliable) modality, fewer efforts have utilized imprecise data lacking reliable ground truth labels. In this unsupervised, noisy setting, making the diagnosis robust and quantifying its uncertainty become paramount, thus posing a new challenge: how can we combine multiple sources of information -- which often have vastly varying levels of precision and uncertainty themselves -- to provide a diagnosis estimate with confidence bounds? Motivated by a concrete application in antibody testing, we devise a Stochastic Expectation-Maximization algorithm that allows the principled integration of heterogeneous, and potentially unreliable, data types. Our Bayesian formalism is essential for (a) flexibly combining these heterogeneous data sources and their corresponding levels of uncertainty, (b) quantifying the degree of confidence associated with a given diagnosis, and (c) dealing with the missing values that typically plague medical data. We quantify the potential of this approach on simulated data, and showcase its practicality by deploying it on a real COVID-19 immunity study.
Recommended citation: Donnat, Claire, et al. (2020). “A Bayesian Hierarchical Network for Combining Heterogeneous Data Sources in Medical Diagnoses.” arXiv.