A Bayesian Hierarchical Network for Combining Heterogeneous Data Sources in Medical Diagnoses

Published in arXiv (Under submission), 2020

Recommended citation: Claire Donnat, Nina Miolane, Jack Kreindler and Frederick de St Pierre Bunbury (2020). "A Bayesian Hierarchical Network for Combining Heterogeneous Data Sources in Medical Diagnoses" arXiv.

Computer-Aided Diagnosis has shown stellar performance in providing accurate medical diagnoses across multiple testing modalities (medical images, electrophysiological signals, etc.). While this field has typically focused on fully harvesting the signal provided by a single (and generally extremely reliable) modality, fewer efforts have utilized imprecise data lacking reliable ground truth labels. In this unsupervised, noisy setting, the robustification and quantification of the diagnosis uncertainty become paramount, thus posing a new challenge: how can we combine multiple sources of information -- often themselves with vastly varying levels of precision and uncertainty -- to provide a diagnosis estimate with confidence bounds? Motivated by a concrete application in antibody testing, we devise a Stochastic Expectation-Maximization algorithm that allows the principled integration of heterogeneous, and potentially unreliable, data types. Our Bayesian formalism is essential in (a) flexibly combining these heterogeneous data sources and their corresponding levels of uncertainty, (b) quantifying the degree of confidence associated with a given diagnostic, and (c) dealing with the missing values that typically plague medical data. We quantify the potential of this approach on simulated data, and showcase its practicality by deploying it on a real COVID-19 immunity study.

Download paper here
