Research progress on electronic health records multimodal data fusion based on deep learning
Journal of Biomedical Engineering

Authors：

FAN Yong ¹, ZHANG Zhengbo ¹, WANG Jing ²

1. Medical Innovation Research Department, Chinese PLA General Hospital, Beijing 100853, P. R. China;
2. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, P. R. China.

Keywords：

Multimodal fusion; Deep learning; Multimodal medical data; Electronic health records; Medical artificial intelligence

DOI：

10.7507/1001-5515.202310011

Deep learning-based multimodal learning is advancing rapidly and is widely used in artificial intelligence-generated content, for example in image-text conversion and image-text generation. Electronic health records are the digital information, such as numbers, charts, and text, that medical staff generate with information systems in the course of medical activities. Deep learning-based multimodal fusion of electronic health records can help medical staff comprehensively analyze the large volume of multimodal medical data generated during diagnosis and treatment, thereby supporting accurate diagnosis and timely intervention for patients. In this article, we first introduce the methods and development trends of deep learning-based multimodal data fusion. Second, we summarize and compare studies that fuse structured electronic medical records with other medical data such as images and text, focusing on the clinical application types, sample sizes, and fusion methods involved. Our analysis of the literature indicates that deep learning methods for fusing medical data of different modalities fall into two main groups: first, selecting an appropriate pre-trained model for each data modality to obtain feature representations, followed by late fusion; and second, fusion based on the attention mechanism. Finally, we discuss the difficulties encountered in multimodal medical data fusion and its development directions, including modeling methods and the evaluation and application of models. We hope this review provides a reference for building models that make comprehensive use of medical data from multiple modalities.
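The two fusion families named in the abstract can be illustrated with a minimal sketch. It is not taken from the reviewed studies: the function names `late_fusion` and `attention_fusion`, the toy embeddings, the query vector, and the per-modality probabilities are all hypothetical, and a real system would obtain embeddings from pre-trained encoders and learn the attention parameters during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(probabilities, weights=None):
    """Late (decision-level) fusion: weighted average of the
    probabilities predicted separately by each modality's model."""
    if weights is None:
        weights = [1.0 / len(probabilities)] * len(probabilities)
    return sum(w * p for w, p in zip(weights, probabilities))


def attention_fusion(embeddings, query):
    """Attention-based fusion: scaled dot-product attention of a single
    query vector over one embedding per modality."""
    dim = len(query)
    scores = [
        sum(q * e for q, e in zip(query, emb)) / math.sqrt(dim)
        for emb in embeddings
    ]
    weights = softmax(scores)
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return fused, weights


# Hypothetical toy embeddings standing in for encoder outputs.
structured_emb = [0.2, 0.8, 0.1, 0.4]  # structured record features
text_emb = [0.9, 0.1, 0.3, 0.5]        # clinical note features
image_emb = [0.4, 0.4, 0.7, 0.2]       # imaging features
query = [1.0, 0.0, 0.0, 1.0]           # assumed fixed here; normally learned

# Hypothetical per-modality predicted probabilities for one patient.
risk = late_fusion([0.70, 0.55, 0.62])
fused, weights = attention_fusion([structured_emb, text_emb, image_emb], query)
print(risk, fused, weights)
```

In this sketch, late fusion combines only the final predictions, so each modality's model can be trained independently, whereas attention-based fusion mixes the feature vectors themselves with data-dependent weights.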

Citation: FAN Yong, ZHANG Zhengbo, WANG Jing. Research progress on electronic health records multimodal data fusion based on deep learning. Journal of Biomedical Engineering, 2024, 41(5): 1062-1071. doi: 10.7507/1001-5515.202310011
