Getting AI Ready for Deployment: Tuning Algorithms to Specific Sites Using a Single Chest X-Ray Image (RSNA 2019, Sun Dec 01 – 06)

PURPOSE

Poor generalisation of deep neural networks, caused by equipment and geographic variability, is a known problem facing the radiology community. We propose a novel method to get algorithms ‘deployment ready’ using a single reference chest X-ray (CXR) image from a potential deployment site, with the aim of automatically reading all ‘normal’ CXRs.

METHOD AND MATERIALS

A deep learning model based on DenseNet-121 (M1) was trained on ~250,000 CXRs from the CheXpert dataset and ~50,000 CXRs from the NIH CXR14 dataset to predict a ‘normal’ or ‘abnormal’ label. The model was evaluated on 3 datasets: E1 (n=3587), E2 (n=200) and E3 (n=212). E1 and E3 were 2 separate datasets obtained from 3 outpatient imaging centres and 3 hospital imaging departments. E2 is the CheXpert validation dataset. M2, a Siamese variation of M1 that uses one reference image per site (to capture site- and scanner-specific variation), was also evaluated on E1, E2 and E3. The specificity of M1 and M2 was compared at a fixed sensitivity of 97% to determine how well each model correctly identifies normal CXRs.
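The fixed-sensitivity comparison described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. The function name `specificity_at_sensitivity`, the convention that ‘abnormal’ is the positive class with higher scores meaning more likely abnormal, and the toy scores are all assumptions made for the example.

```python
import numpy as np


def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.97):
    """Return (threshold, sensitivity, specificity) at the highest score
    threshold whose sensitivity is at least `target_sensitivity`.

    Assumed convention (not stated in the abstract):
    y_true is 1 for 'abnormal' and 0 for 'normal'; a CXR is called
    'abnormal' when its score is >= threshold.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if y_true.all() or not y_true.any():
        raise ValueError("need at least one abnormal and one normal case")

    n_abnormal = y_true.sum()
    n_normal = (~y_true).sum()

    # Walk the distinct scores from highest to lowest; sensitivity can only
    # grow as the threshold drops, so the first hit is the strictest threshold
    # that still reaches the target.
    for thr in np.sort(np.unique(scores))[::-1]:
        called_abnormal = scores >= thr
        sensitivity = (called_abnormal & y_true).sum() / n_abnormal
        if sensitivity >= target_sensitivity:
            specificity = (~called_abnormal & ~y_true).sum() / n_normal
            return float(thr), float(sensitivity), float(specificity)

    raise ValueError("target sensitivity must not exceed 1.0")


# Toy data (hypothetical, not from the study): 4 abnormal, 4 normal CXRs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.35, 0.6, 0.3, 0.2, 0.1]
print(specificity_at_sensitivity(labels, model_scores, 0.97))
# (0.35, 1.0, 0.75)
```

With only four abnormal cases in the toy data, reaching 97% sensitivity requires catching all four, so the threshold drops to 0.35 and one of the four normal CXRs is flagged. Applying the same function to the scores of two models on the same dataset gives the kind of specificity comparison reported for M1 and M2.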

RESULTS

The area under the receiver operating characteristic curve (AUROC) increased from 0.92, 0.87 and 0.84 for M1 to 0.95, 0.89 and 0.89 for M2 on E1, E2 and E3 respectively. At 97% sensitivity, M1 had a specificity of 0.41, 0.29 and 0.02 on E1, E2 and E3 respectively; after tuning with a single reference image (M2), specificity was 0.63, 0.29 and 0.45, an increase on E1 and E3 and no change on E2.
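For readers less familiar with AUROC, it equals the probability that a randomly chosen abnormal CXR receives a higher score than a randomly chosen normal one, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auroc` and the toy scores are assumptions for illustration, and the pairwise approach is chosen for clarity rather than speed.

```python
import numpy as np


def auroc(y_true, scores):
    """AUROC as the fraction of (abnormal, normal) pairs ranked correctly.

    Assumed convention: y_true is 1 for 'abnormal', 0 for 'normal',
    and higher scores mean more likely abnormal.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    abnormal = scores[y_true]
    normal = scores[~y_true]
    if abnormal.size == 0 or normal.size == 0:
        raise ValueError("need at least one abnormal and one normal case")

    # Compare every abnormal score with every normal score.
    diff = abnormal[:, None] - normal[None, :]
    correct = (diff > 0).sum()   # abnormal scored above normal
    tied = (diff == 0).sum()     # ties count as half a correct pair
    return float((correct + 0.5 * tied) / (abnormal.size * normal.size))


# Toy data (hypothetical, not from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.35, 0.6, 0.3, 0.2, 0.1]
print(auroc(labels, model_scores))
# 0.9375
```

In the toy data, 15 of the 16 abnormal/normal pairs are ordered correctly, giving 0.9375. AUROC summarises ranking quality over all thresholds, which is why it can change only slightly between two models whose specificity at one high-sensitivity operating point differs a lot.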

CONCLUSION

Our results indicate that deep learning models can be generalised across equipment, institutions and countries by using a single reference image to tune the model, which shows potential to improve deep learning algorithms more generally. In this case, we observe a substantial improvement in the results of a model that distinguishes normal from abnormal images with a high degree of confidence.

CLINICAL RELEVANCE/APPLICATION

More than 50% of all CXRs done across the world are reported as ‘normal’. We demonstrate a novel method by which a single algorithm can be deployed across sites to automate the reading of normal CXRs while maintaining high sensitivity, saving radiologists’ time and improving the speed of reporting.