Article, 2024
Differential privacy preserved federated learning for prognostic modeling in COVID‐19 patients using large multi‐institutional chest CT dataset
Medical Physics,
ISSN 2473-4209, 0094-2405,
Volume 51, Issue 7,
Pages 4736-4747,
DOI 10.1002/mp.16964
Contributors
Shiri, Isaac
0000-0002-5735-0736
[1]
Salimi, Yazdan
0000-0002-1233-9576
[1]
Sirjani, Nasim
[2]
Razeghi, Behrooz
0000-0001-9568-4166
[3]
Bagherieh, Sara
0000-0002-1827-9164
[4]
Pakbin, Masoumeh
0000-0001-7643-5877
[5]
Mansouri, Zahra
[1]
Hajianfar, Ghasem
0000-0001-5359-2407
[1]
Avval, Atlas Haddadi
0000-0002-3896-7810
[6]
Askari, Dariush
0000-0003-4031-2589
[7]
Ghasemian, Mohammadreza
[5]
Sandoughdaran, Saleh
0000-0002-2191-7139
[8]
Sohrabi, Ahmad
[9]
Sadati, Elham
[10]
Livani, Somayeh
0000-0002-5748-4208
[11]
Iranpour, Pooya
0000-0001-6652-2053
[12]
Kolahi, Shahriar
0000-0002-7490-1229
[13]
Khosravi, Bardia
0000-0002-8024-339X
[14]
Bijari, Salar
0000-0001-7656-0475
[10]
Sayfollahi, Sahar
[9]
Atashzar, Mohammad Reza
[15]
Hasanian, Mohammad
0000-0002-3349-8090
[16]
Shahhamzeh, Alireza
[5]
Teimouri, Arash
0000-0001-8018-5989
[12]
Goharpey, Neda
[7]
Shirzad-Aski, Hesamaddin
0000-0002-0773-1610
[11]
Karimi, Jalal
[15]
Radmard, Amir Reza
0000-0002-7462-118X
[14]
[17]
Rezaei-Kalantari, Kiara
0000-0003-1973-4760
[9]
Oghli, Mostafa Ghelich
0000-0001-7753-5618
[2]
Oveisi, Mehrdad
0000-0002-8100-5609
[18]
Sadr, Alireza Vafaei
0000-0002-5733-6678
[19]
Voloshynovskiy, Slava
0000-0003-0416-9674
[3]
Zaidi, Habib
0000-0001-7559-5297
(Corresponding author)
[1]
[20]
[21]
[22]
Affiliations
- [1]
University Hospital of Geneva
[NORA names:
Switzerland; Europe, Non-EU; OECD];
- [2]
SIMUT (Iran)
[NORA names:
Iran; Asia, Middle East];
- [3]
University of Geneva
[NORA names:
Switzerland; Europe, Non-EU; OECD];
- [4]
Isfahan University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [5]
Qom University of Medical Science and Health Services
[NORA names:
Iran; Asia, Middle East];
- [6]
Mashhad University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [7]
Shahid Beheshti University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [8]
Royal Surrey County Hospital
[NORA names:
United Kingdom; Europe, Non-EU; OECD];
- [9]
Iran University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [10]
Tarbiat Modares University
[NORA names:
Iran; Asia, Middle East];
- [11]
Golestan University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [12]
Shiraz University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [13]
Imam Khomeini Hospital
[NORA names:
Iran; Asia, Middle East];
- [14]
Tehran University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [15]
Fasa University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [16]
Arak University of Medical Sciences
[NORA names:
Iran; Asia, Middle East];
- [17]
Shariati Hospital
[NORA names:
Iran; Asia, Middle East];
- [18]
University of British Columbia
[NORA names:
Canada; America, North; OECD];
- [19]
Pennsylvania State University
[NORA names:
United States; America, North; OECD];
- [20]
University Medical Center Groningen
[NORA names:
Netherlands; Europe, EU; OECD];
- [21]
University of Southern Denmark
[NORA names:
SDU University of Southern Denmark;
University; Denmark; Europe, EU; Nordic; OECD];
- [22]
Óbuda University
[NORA names:
Hungary; Europe, EU; OECD]
Abstract
BACKGROUND: Notwithstanding the encouraging results of previous studies reporting on the efficiency of deep learning (DL) in COVID-19 prognostication, clinical adoption of the developed methodology still needs to be improved. To overcome this limitation, we set out to predict the prognosis of a large multi-institutional cohort of patients with COVID-19 using a DL-based model.
PURPOSE: This study aimed to evaluate the performance of deep privacy-preserving federated learning (DPFL) in predicting COVID-19 outcomes using chest CT images.
METHODS: After applying inclusion and exclusion criteria, 3055 patients from 19 centers, including 1599 alive and 1456 deceased, were enrolled in this study. Data from all centers were split (randomly, with stratification with respect to each center and class) into a training/validation set (70%/10%) and a hold-out test set (20%). For the DL model, feature extraction was performed on 2D slices, and averaging was performed at the final layer to construct a 3D model for each scan. The DenseNet model was used for feature extraction. The model was developed using centralized and FL approaches. For FL, we employed DPFL approaches. A membership inference attack was also evaluated in the FL strategy. For model evaluation, different metrics were reported on the hold-out test set. In addition, models trained in the two scenarios, centralized and FL, were compared using the DeLong test for statistical differences.
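The abstract does not specify which differential privacy mechanism was used, so the following is only a minimal sketch of one common pattern: each center's model update is clipped to a maximum L2 norm, the clipped updates are averaged, and Gaussian noise is added to the average. The function names and the `clip_norm` and `noise_multiplier` parameters are hypothetical and are not taken from the study.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update vector so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_federated_average(global_weights, client_updates, clip_norm=1.0,
                         noise_multiplier=0.5, rng=None):
    """One aggregation round: clip each update, average, add Gaussian noise.

    Assumption: noise with standard deviation noise_multiplier * clip_norm
    is applied to the SUM of clipped updates, so the noise on the MEAN has
    standard deviation noise_multiplier * clip_norm / n_clients.
    """
    rng = rng or random.Random(0)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    n_clients = len(clipped)
    sigma = noise_multiplier * clip_norm / n_clients
    new_weights = []
    for i, w in enumerate(global_weights):
        mean_delta = sum(u[i] for u in clipped) / n_clients
        new_weights.append(w + mean_delta + rng.gauss(0.0, sigma))
    return new_weights


if __name__ == "__main__":
    # Toy example with two hypothetical centers and a two-parameter model.
    weights = [0.0, 0.0]
    updates = [[3.0, 4.0], [0.1, -0.2]]
    print(dp_federated_average(weights, updates))
```

Clipping bounds how much any single center can move the shared model, which is what lets the added noise mask an individual contribution. A larger `noise_multiplier` gives stronger privacy at the cost of a noisier model.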
RESULTS: The centralized model achieved an accuracy of 0.76, while the DPFL model had an accuracy of 0.75. Both the centralized and DPFL models achieved a specificity of 0.77. The centralized model achieved a sensitivity of 0.74, while the DPFL model had a sensitivity of 0.73. The centralized and DPFL models achieved mean AUCs of 0.82 (95% CI: 0.79-0.85) and 0.81 (95% CI: 0.77-0.84), respectively. The DeLong test did not show a statistically significant difference between the two models (p-value = 0.98). The AUC values for the inference attacks ranged between 0.49 and 0.51, with an average of 0.50 ± 0.003 and a 95% CI for the mean AUC of 0.500 to 0.501.
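The metrics reported above can be illustrated with a short sketch. The helper names and the label convention (1 = positive class) are assumptions for illustration, since the abstract does not state which outcome was treated as positive. The AUC here is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, so an attack AUC near 0.50 means the attacker separates training members from non-members no better than chance.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Toy labels and scores, not data from the study.
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
    print(auc([0.9, 0.8], [0.1, 0.2]))
    print(auc([0.2, 0.8], [0.2, 0.8]))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for a cohort of thousands of patients.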
CONCLUSION: The performance of the proposed model was comparable to centralized models while operating on large and heterogeneous multi-institutional datasets. In addition, the model was resistant to inference attacks, ensuring the privacy of shared data during the training process.
Keywords
AUC,
AUC values,
CI,
COVID-19,
COVID-19 outcomes,
COVID-19 patients,
CT datasets,
CT images,
DL models,
DL-based models,
DeLong,
DeLong test,
DenseNet,
FL,
FL approach,
FL strategy,
Privacy-Preserving Federated Learning,
accuracy,
adoption,
approach,
attacks,
average,
center,
centralized model,
chest CT datasets,
chest CT images,
clinical adoption,
cohort of patients,
confidence,
confidence intervals,
criteria,
data,
dataset,
deep learning,
differences,
differential privacy,
efficiency,
efficiency of deep learning,
evaluation,
exclusion,
exclusion criteria,
extraction,
feature extraction,
features,
federated learning,
hold-out test set,
images,
inclusion,
inference,
inference attacks,
interval,
layer,
learning,
limitations,
membership,
membership inference attacks,
methodology,
metrics,
model,
model evaluation,
multi-institutional cohort,
multi-institutional cohort of patients,
multi-institutional dataset,
outcomes,
patients,
performance,
privacy,
process,
prognosis,
prognostic model,
prognostication,
results,
scanning,
scenarios,
sensitivity,
sets,
significant difference,
slices,
specificity,
statistical difference,
statistically,
statistically significant difference,
strategies,
study,
test,
test set,
training,
training process,
training/validation,
training/validation set,
values
Funders
Data Provider: Digital Science