ChatGPT-powered deep learning: elevating brain tumor detection in MRI scans

Soha Rawas (Department of Mathematics and Computer Science, Beirut Arab University, Beirut, Lebanon)
Cerine Tafran (Department of Mathematics and Computer Science, Beirut Arab University, Beirut, Lebanon)
Duaa AlSaeed (College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia)

Applied Computing and Informatics

ISSN: 2634-1964

Article publication date: 3 July 2024


Abstract

Purpose

Accurate diagnosis of brain tumors is crucial for effective treatment and improved patient outcomes. Magnetic resonance imaging (MRI) is a common method for detecting brain malignancies, but interpreting MRI data can be challenging and time-consuming for healthcare professionals.

Design/methodology/approach

An innovative method is presented that combines deep learning (DL) models with natural language processing (NLP) from ChatGPT to enhance the accuracy of brain tumor detection in MRI scans. The method generates textual descriptions of brain tumor regions, providing clinicians with valuable insights into tumor characteristics for informed decision-making and personalized treatment planning.

Findings

The evaluation of this approach demonstrates promising outcomes, achieving a notable Dice coefficient score of 0.93 for tumor segmentation, outperforming current state-of-the-art methods. Human validation of the generated descriptions confirms their precision and conciseness.

Research limitations/implications

While the method showcased advancements in accuracy and understandability, ongoing research is essential for refining the model and addressing limitations in segmenting smaller or atypical tumors.

Originality/value

These results emphasized the potential of this innovative method in advancing neuroimaging practices and contributing to the effective detection and management of brain tumors.

Citation

Rawas, S., Tafran, C. and AlSaeed, D. (2024), "ChatGPT-powered deep learning: elevating brain tumor detection in MRI scans", Applied Computing and Informatics, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/ACI-12-2023-0167

Publisher

Emerald Publishing Limited

Copyright © 2024, Soha Rawas, Cerine Tafran and Duaa AlSaeed

License

Published in Applied Computing and Informatics. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


Nomenclature

DL: Deep Learning
CPU: Central Processing Unit
NLP: Natural Language Processing
PyTorch: An open-source machine learning library
MRI: Magnetic Resonance Imaging
3D-CNN: Three-Dimensional Convolutional Neural Network
ChatGPT: Chat Generative Pre-trained Transformer
U-Net: A convolutional network architecture designed for biomedical image segmentation
BraTS: Brain Tumor Segmentation
BiFPN: Bi-directional Feature Pyramid Network
Dice coefficient: A statistical measure of the similarity between two samples
EDCNN: Ensemble Deep Convolutional Neural Network
GPU: Graphics Processing Unit
fMRI: Functional Magnetic Resonance Imaging
RAM: Random Access Memory
DTI: Diffusion Tensor Imaging

1. Introduction

Medical imaging has transformed the healthcare industry by enabling doctors and other healthcare workers to see inside the human body without invasive procedures [1]. Magnetic resonance imaging (MRI) is a common imaging technique in neurology that provides non-invasive data about the brain. MRI is frequently used to detect brain tumors because of its ability to produce high-resolution images. Accurate interpretation of MRI scans is critical for the proper diagnosis and management of brain tumors, but it can be difficult even for experienced radiologists. Automated MRI analysis systems can help improve the accuracy and efficiency of brain tumor detection, resulting in earlier diagnosis and more successful treatment [2, 3].

Deep learning (DL) algorithms have become effective tools for processing MRI scans and other medical images [4]. DL models can learn complex characteristics and patterns in these images, which can then be applied to classification and detection tasks. These models have produced encouraging results for the identification and categorization of brain tumors in MRI scans. However, the use of DL models in medical imaging has been hampered by limited interpretability. Understanding the properties that a model has acquired is crucial both for efficient model improvement and for gaining insight into the biological mechanisms underlying medical imaging. This issue has long been acknowledged in the field, since better model accuracy depends on an understanding of these properties [5].

Natural language processing (NLP) models, like ChatGPT, have recently demonstrated considerable promise for enhancing the interpretability of DL models [6, 7]. These models can produce text descriptions of an image’s properties that are understandable by humans, making it possible to analyze the images more transparently and intuitively. Incorporating NLP models into DL frameworks, particularly in the context of medical imaging, may increase the accuracy and efficacy of image processing and interpretation [8].

Using ChatGPT-enhanced DL models, this paper proposes a novel strategy for improving the precision of brain tumor detection in MRI data. We create a more understandable and accurate system for brain tumor identification by combining DL and NLP. A publicly available MRI scan dataset was used to compare the proposed method to several cutting-edge DL algorithms. Our proposed method could significantly improve the accuracy and effectiveness of brain tumor detection in MRI scans, which would improve patient outcomes. The findings of this study have the potential to advance the field of medical image analysis and have significant implications for the detection and management of brain cancer.

This study offers several notable contributions. Firstly, it introduces a novel method aimed at enhancing the detection of brain tumors in MRI scans. Additionally, it proposes the fusion of image-based segmentation and NLP, providing interpretable and accurate solutions for brain tumor detection. Moreover, the proposed method utilizes ChatGPT-based NLP to gain insights into tumor characteristics and distribution, aiding clinicians in decision-making and personalized treatment planning. Finally, the proposed method provides clear and comprehensible natural language descriptions of tumor regions, along with improved accuracy in brain tumor detection and segmentation compared to current techniques.

This paper is structured to give a thorough summary of our research. In Section 2, we examine earlier research on the use of DL models for brain tumor segmentation and detection. We go over how we integrated ChatGPT into our DL models in Section 3. The materials and techniques used in this study are covered in Section 4. We report our experimental findings and make a comparison with earlier research in Section 5. In Section 6, we offer our final thoughts and suggestions for clinical practice based on our findings.

2. Related work

Research on DL models for medical imaging analysis, including MRI scans, is rapidly expanding. The application of DL models to the identification and categorization of brain tumors in MRI scans has been the subject of numerous studies.

One recent study introduced an improved U-Net architecture that was designed specifically for segmenting brain tumors with high efficiency [9]. The proposed approach involved three key combinations that led to excellent performance across various metrics. To generate more spatially relevant features prior to the fusion operation, a Bidirectional Feature Pyramid Network (BiFPN) was applied in the encoder section using three pretrained models (VGG-19, MobileNetV2, and ResNet50). In the decoder section, an attention mechanism was used, which has been shown to work well in medical image analysis, especially for segmentation tasks. This mechanism facilitated segmentation by concentrating more on distinct tumor types. The study demonstrated the effectiveness of the model using the BraTS 2020 dataset, showcasing its state-of-the-art performance compared to previous models. In parallel, a study investigated how the two main programming environments, MATLAB and Python, affected the precision of CNN-based DL models used for glioma detection [10].

In the domain of brain tumor classification, a novel approach was developed to merge the weights of two separately designed DL models, combining the strengths of both shallow and deep models [11]. This Ensemble Deep Convolutional Neural Network (EDCNN) architecture achieved a prediction accuracy of 97.77%. Furthermore, a highly accurate and fully automated system was proposed to enhance brain tumor classification through deep transfer learning [12]. This method extracted features from MR images of the brain using deep transfer learning and adjusted hyperparameters to generate a precise model. Additionally, the effectiveness of the proposed approach was evaluated using seven popular metaheuristic methods applied on the same dataset.

In recent research [13], a new DL model was proposed for automated detection of brain tumors using 3D MR images. The proposed method was created specifically to work with raw 3D MRI data. The Support Vector Machine (SVM) classifier achieved high classification accuracy by extracting deep features from the 3 Attention-Convolutional-LSTM (3ACL) model, which includes convolutional layers, an attention strategy, and an LSTM structure. The proposed approach's efficacy was assessed using experimental studies on all MRI slices from the BRATS 2015 and 2018 benchmark datasets. Another recent investigation [14] explored a DL approach using an Enhanced U-Net model to segment brain tumors in MRI data. Initially, pre-processing techniques, including filtering and contrast enhancement, were applied to the input MRI images, and patch extraction was then performed.

In recent investigations, there has been growing interest in utilizing natural language processing (NLP) models to enhance the interpretability of DL models in medical imaging. For instance, a rule-based algorithm was developed leveraging SAS tools to extract and classify tumor response into progression or no progression categories [15]. This study utilized 2,970 French reports obtained from various imaging modalities such as magnetic resonance imaging, computed tomography scans, and positron emission tomography, sourced from a comprehensive cancer center’s electronic health record. This dataset was divided into a training set of 2,637 documents and a validation set of 603 documents. Subsequently, the model’s performance was evaluated on 189 imaging reports sourced from 46 radiology centers.

Efforts have been made to optimize the integration of Transformers and Convolutional Neural Networks (CNNs) for medical image segmentation [16]. A model known as the CNN and Transformer Complementary Network (CTCNet) was introduced in this study. CTCNet consists of two encoders: one based on Swin Transformers and the other on Residual CNNs. This architecture aims to produce complementary features in both the Transformer and CNN domains. To seamlessly blend these features, a Cross-domain Fusion Block (CFB) was utilized. Moreover, the study explored the correlation between CNN and Transformer features and incorporated channel attention mechanisms to capture dual attention information from the self-attention features by Transformers. Additionally, a Feature Complementary Module (FCM) was introduced to improve feature representation by amalgamating cross-domain fusion, feature correlation, and dual attention. Lastly, the integration of a Swin Transformer decoder further enhanced the model’s capacity to capture long-range dependencies.

Recent research has explored the integration of deep learning and natural language processing (NLP) for various healthcare applications. One study developed an NLP pipeline [17] that leveraged the ClinicalBERT model to extract stroke features from head CT imaging notes. This approach facilitated the creation of structured datasets for a large patient cohort, enabling accurate analysis of stroke severity and its impact on patient survival. In another investigation, an integrated approach combining deep learning and NLP for continuous remote monitoring in digital health was proposed [18]. This methodology aims to improve patient outcomes, reduce healthcare costs, and enhance the quality of care by analyzing data from wearable devices and patient feedback.

As summarized in Table 1, various studies have contributed to advancements in medical imaging and digital health research, including improved deep learning models for medical imaging analysis, applications of natural language processing in medical imaging, and integration of deep learning and NLP in digital health.

While recent findings underscore the potential of deep learning to refine the precision of brain tumor detection and segmentation, they also unveil promising avenues for further exploration. An intriguing direction involves leveraging ChatGPT to generate text summaries elucidating the features learned by the deep learning model—a novel approach that, to our knowledge, remains unexplored in the related work discussed.

In our proposed methodology, we integrate ChatGPT into a deep learning model tailored for brain tumor detection in MRI data. Recognizing the untapped potential of this avenue, we believe that such integration is worth investigating to enhance the interpretability, accuracy, and overall efficacy of brain tumor detection in MRI scans.

We highlight the limitations of existing methods in providing interpretable results and comprehensible insights into the detected tumor regions. By incorporating ChatGPT-based natural language descriptions, we seek to address this gap and provide clinicians with valuable insights into the underlying features identified by the deep learning model. This approach not only enhances the interpretability of the results but also facilitates better decision-making in diagnosis and therapy planning.

Moreover, by leveraging the language understanding capabilities of ChatGPT, we aim to bridge the gap between the technical complexity of deep learning models and the practical needs of clinical practitioners. This integration holds the potential to revolutionize the field of neuroimaging by providing clinicians with intuitive and actionable insights derived from advanced machine learning techniques.

3. Methodology

In this section, we outline our methodology, beginning with an exposition of the model architecture, followed by a detailed discussion of the model's formulation and a description of its inherent limitations.

Architecture: The proposed model architecture combines two potent DL techniques, convolutional neural networks (CNNs) and transformers (as illustrated in Figure 1). The design comprises two main components: a CNN-based feature extraction and classification network and a ChatGPT network for additional refinement.

The CNN-based feature extraction and classification network processes the input MRI scans and extracts significant features. This portion of the model typically consists of multiple convolutional and pooling layers that gradually learn and extract increasingly intricate information from the input images, producing a collection of high-level feature representations that can be used for classification.

The second component of the model is the transformer-based ChatGPT network. Its primary goal is to refine the feature representations created by the CNN-based network. The transformer architecture captures long-range dependencies and relationships between different features, and the ChatGPT model in particular is pre-trained on a large corpus of text data, enabling it to understand language structure and generate high-quality language representations.

In the proposed model, the features extracted by the CNN-based network are fed into the transformer-based encoder-decoder, and the output of the final decoder layer is passed to the ChatGPT network. This additional stage further refines the feature representations by incorporating information from the language model.

By combining the benefits of CNNs and transformers, the proposed model architecture is designed to achieve high accuracy in brain tumor diagnosis. While the CNN-based network extracts local features from the input images, the transformer-based ChatGPT network can capture long-range dependencies and include contextual information from the language model.
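As an illustrative sketch, the PyTorch code below shows one way these two components could be composed; the class name, the layer counts in the feature extractor, and the 768-dimensional language-model embedding size are assumptions made for the example, not details of the authors' implementation.

# Illustrative composition of the two components described above; all names,
# layer counts and dimensions are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class TumorDetector(nn.Module):
    def __init__(self, in_channels=4, feat_dim=512, num_classes=2, lm_dim=768):
        super().__init__()
        # CNN-based feature extraction network (stand-in for the 3D CNN of Figure 2)
        self.feature_cnn = nn.Sequential(
            nn.Conv3d(in_channels, 64, kernel_size=3, padding=1),
            nn.BatchNorm3d(64), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(64, feat_dim, kernel_size=3, padding=1),
            nn.BatchNorm3d(feat_dim), nn.ReLU(), nn.AdaptiveAvgPool3d((8, 8, 4)),
        )
        # Transformer encoder-decoder that refines the CNN features
        self.transformer = nn.Transformer(d_model=feat_dim, nhead=8,
                                          num_encoder_layers=6, num_decoder_layers=6,
                                          batch_first=True)
        self.seg_head = nn.Linear(feat_dim, num_classes)   # per-token segmentation logits
        self.lm_proj = nn.Linear(feat_dim, lm_dim)          # projection feeding a ChatGPT-style LM

    def forward(self, x):
        f = self.feature_cnn(x)                   # (B, feat_dim, 8, 8, 4)
        tokens = f.flatten(2).transpose(1, 2)     # (B, 256, feat_dim) sequence of voxel features
        h = self.transformer(tokens, tokens)      # refined feature representations
        return self.seg_head(h), self.lm_proj(h)  # segmentation logits, LM input features

The second output is what the pre-trained language model would consume to produce the textual description, as formalized in the following formulation.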

Formulation: The proposed methodology employs DL models and NLP methods to improve the precision of brain tumor detection in MRI data. The model's core mathematical formulation can be summarized as follows:

Let X be the input brain MRI scan, Y be the corresponding output label indicating whether a brain tumor is present, and L be the natural language description of the MRI scan produced by a pre-trained ChatGPT model.

A convolutional neural network (CNN) is a type of DL model that takes an input image X and produces a feature vector f(X).

Mathematically, the CNN can be represented as:

f(X)=C(X)
where C(X) represents the feature vector generated by the CNN for input image X.

The segmentation mask S(C(X)) is obtained by applying a transformer-based encoder-decoder to the feature vector f(X), which predicts the tumor regions in the input image.

The transformer-based encoder-decoder framework consists of a stack of encoder and decoder layers, each with multiple attention heads and a hidden size of 512. The encoder layers encode the input feature map C(X) and the decoder layers decode the encoded feature map to generate the segmentation mask S(C(X)). The weights of the encoder-decoder are denoted as We, and the biases are denoted as be.

To generate natural language descriptions of the segmented tumor regions, we connect the output of the decoder layers to a pre-trained ChatGPT model. Specifically, we extract the features from the decoder layers and feed them into the ChatGPT model to generate a natural language description of the segmented tumor regions. Let H be the output features of the decoder layers, which are obtained by applying the transformer-based encoder-decoder to the feature vector f(X), and let Ht be the input features for the ChatGPT model, which are obtained by applying a linear transformation to H:

Ht = Wt * H + bt
where Wt is the weight matrix and bt is the bias vector of the linear layer.

The output text description generated by the ChatGPT model is denoted as D(Ht).
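To make this step concrete, the hypothetical sketch below projects decoder features H into the embedding space of a publicly available GPT-2 model, used here only as a stand-in for ChatGPT (whose weights are not publicly available), and computes a language-modeling loss against an example description; the model name, dimensions, and caption are assumptions.

# Hypothetical sketch: feed refined decoder features H into a GPT-style language
# model through the linear map Ht = Wt * H + bt. GPT-2 is a stand-in for ChatGPT.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")

decoder_dim, lm_dim = 512, lm.config.n_embd           # 512-d decoder features -> 768-d GPT embeddings
proj = nn.Linear(decoder_dim, lm_dim)                  # implements Ht = Wt * H + bt

H = torch.randn(1, 256, decoder_dim)                   # refined decoder features (B, tokens, 512)
Ht = proj(H)                                           # projected prefix for the language model

caption = "Irregular enhancing lesion in the left frontal lobe."  # example ground truth description L
ids = tokenizer(caption, return_tensors="pt").input_ids
text_emb = lm.transformer.wte(ids)                     # embed the description tokens

inputs_embeds = torch.cat([Ht, text_emb], dim=1)       # image-derived prefix followed by text tokens
labels = torch.cat([torch.full((1, Ht.size(1)), -100), ids], dim=1)  # ignore the prefix in the loss
out = lm(inputs_embeds=inputs_embeds, labels=labels)
print(out.loss)                                        # cross-entropy term used as Llm(D(Ht), L)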

The overall loss function of the proposed model can be defined as follows:

Ltotal = W1 * Lseg(S(C(X)), Y) + W2 * Llm(D(Ht), L)
where Llm is the cross-entropy loss for the language modeling task, which measures the difference between the predicted text description D(Ht) and the ground truth text description L, and Lseg is the Dice loss for the segmentation task, which measures the similarity between the predicted segmentation mask S(C(X)) and the ground truth segmentation mask Y. The hyperparameters W1 and W2 govern the relative weights of the two loss terms, balancing the trade-off between the quality of the natural language description and the segmentation accuracy. The total loss is written as Ltotal to distinguish it from the ground truth description L.
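A minimal sketch of this combined objective is given below, assuming a soft Dice loss for the segmentation term and token-level cross-entropy for the language-modeling term; the weight values w1 and w2 are placeholders for the hyperparameters W1 and W2.

# Sketch of the combined objective Ltotal = W1 * Lseg + W2 * Llm; weights are placeholders.
import torch
import torch.nn.functional as F

def dice_loss(seg_logits, seg_target, eps=1e-6):
    # Soft Dice loss between predicted mask probabilities and a binary ground truth mask
    probs = torch.sigmoid(seg_logits)
    inter = (probs * seg_target).sum()
    return 1.0 - (2.0 * inter + eps) / (probs.sum() + seg_target.sum() + eps)

def total_loss(seg_logits, seg_target, lm_logits, lm_target, w1=1.0, w2=0.5):
    l_seg = dice_loss(seg_logits, seg_target)
    l_lm = F.cross_entropy(lm_logits.view(-1, lm_logits.size(-1)), lm_target.view(-1))
    return w1 * l_seg + w2 * l_lm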

We train the proposed model with the Adam optimizer, using a learning rate of 1e−4, a batch size of 2, and early stopping based on the validation loss. Model performance on the BraTS dataset is assessed using the Dice coefficient for segmentation accuracy and language model evaluation metrics for natural language description quality.

The proposed model seeks to increase the accuracy of brain tumor detection in MRI scans by improving both the segmentation accuracy and the natural language description quality, ultimately minimizing the overall loss function Ltotal.

Limitations: While our DL model exhibits strengths in detecting a wide range of tumor sizes, including both the large and small lesions typically observed in MRI scans, it also faces inherent limitations. The model's architecture, which combines convolutional neural networks (CNNs) for feature extraction and transformers for refinement, enables it to capture the diverse tumor sizes and characteristics present in the data.

However, specifying an exact size range for tumors may be challenging due to the variability in tumor sizes encountered in clinical datasets. Additionally, DL models inherently face limitations in accurately predicting certain tumor sizes, especially smaller or atypical lesions that may not exhibit typical characteristics captured during training.

4. Materials and methods

Dataset: We trained and assessed our DL models using the publicly available Brain Tumor Segmentation (BraTS) dataset, which comprises 285 MRI scans acquired with T1-weighted, T1-weighted contrast-enhanced, T2-weighted, and Fluid Attenuated Inversion Recovery (FLAIR) sequences. Each scan is accompanied by a ground truth segmentation mask indicating the location and extent of the tumor(s). As previously described in studies [19, 20], the dataset was randomly split into three sets: training (60%), validation (20%), and testing (20%).

Preprocessing: We preprocessed the MRI scans with intensity normalization and skull stripping to eliminate non-brain tissue. The preprocessed scans were then resized to a fixed resolution of 256 × 256 × 128 voxels, to meet the input size requirements of the algorithms used in our methodology, and their intensities were converted to Hounsfield units.
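The sketch below illustrates two of these steps, z-score intensity normalization over brain voxels and trilinear resizing to 256 × 256 × 128 voxels; skull stripping and the intensity-unit conversion are assumed to be handled by external tools and are not shown.

# Illustrative preprocessing sketch (normalization and resizing only).
import torch
import torch.nn.functional as F

def preprocess(volume: torch.Tensor) -> torch.Tensor:
    # volume: (D, H, W) float tensor of an already skull-stripped MRI scan
    brain = volume[volume > 0]                               # normalize over non-background voxels
    volume = (volume - brain.mean()) / (brain.std() + 1e-8)
    v = volume.unsqueeze(0).unsqueeze(0)                     # (1, 1, D, H, W) for interpolation
    v = F.interpolate(v, size=(128, 256, 256), mode="trilinear", align_corners=False)
    return v.squeeze(0).squeeze(0)                           # 128 x 256 x 256 voxel volume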

Deep Learning Model: Our proposal was to use a DL model to identify brain tumors from MRI images. A transformer-based encoder-decoder framework is placed after a 3D convolutional neural network (CNN) in the model architecture. The input MRI scans are processed by the 3D CNN to extract features, which are then sent to the transformer-based encoder-decoder for brain tumor segmentation.

The six convolutional layers of the 3D CNN architecture are represented in Figure 2 and have 32, 64, 128, 256, 512, and 1024 filters, respectively. Except for the final layer, each convolutional layer is followed by a batch normalization layer, a ReLU activation function, and a 3D max pooling layer. A feature map with a spatial resolution of 8 × 8 × 4 voxels is the result of the 3D CNN.
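A minimal PyTorch sketch of such a six-layer 3D CNN is shown below; the kernel sizes, the four-channel input (one channel per MRI sequence), and the exact treatment of the final layer are assumptions chosen so that a 256 × 256 × 128 input yields the 8 × 8 × 4 feature map quoted above.

# Sketch of the six-layer 3D CNN: 32 to 1024 filters, each layer except the last
# followed by batch normalization, ReLU and 3D max pooling. Details are assumptions.
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool3d(kernel_size=2),
    )

feature_extractor = nn.Sequential(
    conv_block(4, 32),        # assumed 4-channel input (one channel per MRI sequence)
    conv_block(32, 64),
    conv_block(64, 128),
    conv_block(128, 256),
    conv_block(256, 512),
    nn.Conv3d(512, 1024, kernel_size=3, padding=1),  # final layer: no BN/ReLU/pooling, per the text
)
# After five 2x poolings, a 256 x 256 x 128 input is reduced to an 8 x 8 x 4
# feature map (256 / 2^5 = 8, 128 / 2^5 = 4) with 1024 channels.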

The transformer-based encoder-decoder framework, as shown in Figure 3, consists of a stack of 6 encoder and 6 decoder layers, each with 8 attention heads and a hidden size of 512. The encoder layers are responsible for encoding the input feature map, while the decoder layers decode the encoded feature map to generate the final segmentation mask. The decoder layers are also connected to a ChatGPT model for generating text descriptions of the segmented tumor regions.

First, the transformer-based encoder-decoder framework receives the output feature map from the final convolutional layer of the CNN, a tensor of shape (batch_size, feature_map_height, feature_map_width, num_channels). A positional encoding tensor of the same shape as the feature map is also provided as input, supplying positional details for every feature map element. Finally, a padding mask tensor of shape (batch_size, 1, 1, feature_map_width) hides the padded areas of the input feature map.

These three inputs are then fed into the multi-head self-attention mechanism of the first encoder layer. The attention mechanism captures long-range dependencies between elements of the input feature map, while the positional encoding conveys each element's spatial location. This procedure is repeated in the subsequent encoder layers, enabling increasingly intricate correlations between features to be captured.

After receiving the output of the encoder layers, the decoder layers use an attention mechanism to produce the final segmentation mask. A ChatGPT model, which generates text descriptions of the segmented tumor regions, is also connected to the decoder layers. The resulting model can both segment brain tumors in MRI scans accurately and provide textual explanations of the segmented areas.
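The sketch below illustrates this stage with PyTorch's built-in transformer: the CNN feature map is flattened into a token sequence, combined with a learned positional encoding, and passed through a 6-layer, 8-head, 512-dimensional encoder-decoder with a key-padding mask. Note that the mask follows PyTorch's (batch, sequence) convention rather than the (batch_size, 1, 1, feature_map_width) shape quoted above, and the projection and segmentation head are illustrative assumptions.

# Illustrative sketch of the transformer encoder-decoder stage; shapes and layer
# counts follow the description in the text, other details are assumptions.
import torch
import torch.nn as nn

batch, channels, d, h, w = 2, 1024, 4, 8, 8                 # CNN output: 8 x 8 x 4 map, 1024 channels
feature_map = torch.randn(batch, channels, d, h, w)

tokens = feature_map.flatten(2).transpose(1, 2)             # (batch, 256, 1024) token sequence
proj = nn.Linear(channels, 512)                             # project channels to the 512-d model size
src = proj(tokens)

pos_encoding = nn.Parameter(torch.randn(1, src.size(1), 512) * 0.02)  # learned positional encoding
src = src + pos_encoding

padding_mask = torch.zeros(batch, src.size(1), dtype=torch.bool)      # True marks padded positions

transformer = nn.Transformer(d_model=512, nhead=8, num_encoder_layers=6,
                             num_decoder_layers=6, batch_first=True)
decoded = transformer(src, src, src_key_padding_mask=padding_mask)    # (batch, 256, 512)

seg_head = nn.Linear(512, 1)                                # per-token tumor/background logit
mask_logits = seg_head(decoded).view(batch, d, h, w)        # reshape back to the 8 x 8 x 4 grid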

Training: The DL model was trained for up to 200 epochs using the Adam optimizer with a learning rate of 1e−4, a batch size of 2, and a Dice coefficient loss function, with early stopping based on the validation loss.
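A minimal sketch of such a training loop is given below; the model, Dice loss, data loaders, and the early-stopping patience value are assumptions that the text does not specify.

# Training-loop sketch with the quoted settings (Adam, lr 1e-4, up to 200 epochs,
# early stopping on validation loss). model, dice_loss and loaders are assumed.
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
best_val, patience, bad_epochs = float("inf"), 10, 0         # patience value is an assumption

for epoch in range(200):
    model.train()
    for scans, masks in train_loader:                        # batches of size 2
        optimizer.zero_grad()
        loss = dice_loss(model(scans), masks)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(dice_loss(model(s), m).item() for s, m in val_loader) / len(val_loader)

    if val_loss < best_val:                                  # keep the best checkpoint
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best_model.pt")
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                           # stop when validation loss stops improving
            break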

Evaluation: The DL model's performance was evaluated on the testing set of the BraTS dataset using the Dice coefficient, a widely used metric for measuring the overlap between predicted and ground truth segmentation masks. Additionally, the ChatGPT model was used to generate natural language descriptions of the segmented tumor regions, and human experts assessed the model's interpretability.
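For reference, the sketch below computes the Dice coefficient over binary predicted and ground-truth masks in its standard form; it is not the authors' exact evaluation code.

# Dice coefficient between binary predicted and ground-truth masks of equal shape.
import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    # Dice = 2 * |P intersect G| / (|P| + |G|)
    pred, target = pred.bool(), target.bool()
    intersection = (pred & target).sum().item()
    return (2.0 * intersection + eps) / (pred.sum().item() + target.sum().item() + eps)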

Implementation: The proposed DL model was executed on a system featuring an NVIDIA GeForce RTX 2080 Ti GPU, 32 GB of RAM, and an Intel Core i7-9700K CPU. The implementation used the PyTorch framework for seamless integration and efficient execution.

5. Results

The performance of the proposed DL model was evaluated on the testing set of the BraTS dataset. The model demonstrated superior performance compared to state-of-the-art techniques, achieving a notable Dice coefficient score of 0.93 for tumor segmentation. A qualitative analysis of the segmentation results indicated that the model accurately segmented the majority of tumor regions. However, in specific instances the model was unable to segment tiny or asymmetrical tumors, highlighting potential areas for improvement in its architecture.

To evaluate the interpretability of the model, natural language descriptions of the segmented tumor regions were generated using the ChatGPT model. Human specialists validated these descriptions and found them to be precise and concise. The information extracted from these natural language descriptions further clarifies the characteristics of the tumor regions, offering valuable information for enhancing the detection and management of brain tumors.

Furthermore, to assess the impact of NLP on both interpretability and accuracy, the model's performance was evaluated with and without ChatGPT. Results revealed that integrating ChatGPT led to a notable improvement, with the model attaining a Dice coefficient score of 0.93 compared with 0.91 without it. This emphasizes the critical role of incorporating NLP into DL models to enhance both accuracy and interpretability.

Overall, the proposed DL model with ChatGPT-based NLP demonstrates cutting-edge performance in the segmentation of brain tumors and offers insightful data on the properties of the tumor regions. This potential improvement holds significant benefits for both patients and healthcare providers, offering advanced capabilities in brain tumor diagnosis and treatment. An overview of the experimental results is provided in Tables 2 and 3.

5.1 Discussion

The integration of ChatGPT into our proposed DL model significantly improved the effectiveness and interpretability of brain tumor detection in MRI scans. Achieving a Dice coefficient score of 0.93, our model demonstrated state-of-the-art performance in tumor segmentation, outperforming the latest methods reported in the literature. The incorporation of NLP through ChatGPT allowed precise and succinct descriptions of the segmented tumor regions to be generated, providing additional insights into the characteristics of the tumor regions that can enhance diagnosis and therapy planning.

Furthermore, our findings highlighted the superiority of our proposed approach, which incorporates ChatGPT, over the model without ChatGPT. This comparison emphasizes the importance of integrating NLP into DL models, demonstrating the potential to improve the accuracy and efficacy of brain tumor diagnosis and treatment. This advancement holds significant benefits for both patients and medical professionals.

However, the qualitative analysis of the segmentation results unveiled some limitations in our proposed methodology. The model faced challenges in accurately segmenting small or atypically shaped tumors, indicating the necessity for further architectural improvements.

Despite these limitations, our proposed methodology offers a promising approach for brain tumor detection from MRI brain scans and highlights the need for continued research in this evolving field.

6. Conclusion and future directions

The research presented here carries significant practical implications for neuroimaging and clinical practice, offering a novel approach to brain tumor segmentation and diagnosis. The integration of ChatGPT-based natural language processing with deep learning models enables precise and understandable segmentation of brain tumors in MRI scans, empowering clinicians with valuable insights into tumor features for accurate diagnosis and treatment planning. Moreover, the proposed methodology lays the foundation for the development of user-friendly AI tools that seamlessly integrate into clinical workflows, streamlining decision-making processes and ultimately improving patient care outcomes. The clear and concise natural language descriptions generated by the model enhance collaboration between healthcare professionals and foster patient engagement and education regarding their condition and treatment options.

Building upon the current findings, several avenues for future research are outlined. Exploring multimodal integration, such as incorporating additional imaging modalities like functional MRI (fMRI) and diffusion tensor imaging (DTI), could capture complementary information about brain tumor morphology and function, enhancing the accuracy and comprehensiveness of tumor segmentation and characterization. Conducting longitudinal studies to track dynamic changes in tumor morphology and treatment response over time will provide valuable insights into tumor progression and recurrence patterns, guiding personalized treatment strategies and prognostic assessments. Collaboration with clinical partners to translate the research findings into real-world clinical settings through prospective clinical trials is essential to validate the performance and utility of the AI-driven approach across diverse patient populations and healthcare environments. Addressing ethical and regulatory considerations surrounding the deployment of AI-based tools in clinical practice is imperative to ensure patient confidentiality and autonomy. Incorporating patient-reported outcomes and quality-of-life measures into research studies can assess the impact of AI-driven interventions on patient well-being and satisfaction, informing the development of patient-centric care pathways and supporting shared decision-making processes.

In summary, the proposed DL model for brain tumor segmentation, integrated with ChatGPT-based NLP, demonstrates promising results and has the potential to significantly advance neuroimaging science, contributing to improved detection and management of brain tumors in clinical practice. Through continued research and collaboration, the method is anticipated to further enhance patient care and pave the way for innovative solutions in neuroimaging and beyond.

Figures

Figure 1: Model architecture

Figure 2: 3D convolutional neural network

Figure 3: Transformer-based encoder-decoder

Table 1: Reviewed studies grouped by contribution type

Contribution                                         Study
Improved DL models for medical imaging analysis      [9, 11, 12, 13, 14, 16]
Application of NLP in medical imaging                [15, 17]
Integration of DL and NLP in digital health          [18]

Source(s): Table created by authors

Table 2: Performance comparison of the proposed model with state-of-the-art methods

Model                                     Dice coefficient score
Proposed model                            0.93
State-of-the-art method 1 (U-Net)         0.87
State-of-the-art method 2 (3D-CNN)        0.89

Source(s): Table created by authors

Table 3: Performance comparison of the proposed model with and without ChatGPT

Model                       Dice coefficient score
Proposed model              0.93
Model without ChatGPT       0.91

Source(s): Table created by authors

References

1. Asif S, Zhao M, Tang F, Zhu Y. An enhanced deep learning method for multi-class brain tumor classification using deep transfer learning. Multimedia Tools Appl. 2023; 82(20): 1-28. doi: 10.1007/s11042-023-14828-w.

2. Alyami J, Rehman A, Almutairi F, Fayyaz AM, Roy S, Saba T, Alkhurim A. Tumor localization and classification from MRI of brain using deep convolution neural network and salp swarm algorithm. Cognit Comput. 2023; 1: 1-11. doi: 10.1007/s12559-022-10096-2.

3. Rawas S. Transforming healthcare delivery: next-generation medication management in smart hospitals through IoMT and ML. Discov Artif Intell. 2024; 4(1): 31. doi: 10.1007/s44163-024-00128-1.

4. Hammernik K, Kustner T, Yaman B, Huang Z, Rueckert D, Knoll F, Akcakaya M. Physics-driven deep learning for computational magnetic resonance imaging: combining physics and machine learning for improved medical imaging. IEEE Signal Process Mag. 2023; 40(1): 98-114. doi: 10.1109/msp.2022.3215288.

5. Linte CA, Mihaela P. Applied sciences—special issue on emerging techniques in imaging, modelling and visualization for cardiovascular diagnosis and therapy. Appl Sci. 2023; 13(2): 984. doi: 10.3390/app13020984.

6. Lundervold AS, Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Z für Medizinische Physik. 2019; 29(2): 102-27. doi: 10.1016/j.zemedi.2018.11.002.

7. Rawas S, Dwinggo Samala A. Revolutionizing brain tumor analysis: a fusion of ChatGPT and multi-modal CNN for unprecedented precision. Int J Online Biomed Eng. 2024; 20(08): 8-48. doi: 10.3991/ijoe.v20i08.47347.

8. Samala AD, Rawas S. Generative AI as virtual healthcare assistant for enhancing patient care quality. Int J Online Biomed Eng. 2024; 20(05): 5-187. doi: 10.3991/ijoe.v20i05.45937.

9. Aboussaleh I, Riffi J, Fazazy KE, Mahraz MA, Tairi H. Efficient U-net architecture with multiple encoders and attention mechanism decoders for brain tumor segmentation. Diagnostics. 2023; 13(5): 872. doi: 10.3390/diagnostics13050872.

10. Yilmaz VS, Akdag M, Dalveren Y, Doruk RO, Kara A, Soylu A. Investigating the impact of two major programming environments on the accuracy of deep learning-based glioma detection from MRI images. Diagnostics. 2023; 13(4): 651. doi: 10.3390/diagnostics13040651.

11. Patil S, Kirange D. Ensemble of deep learning models for brain tumor detection. Proced Computer Sci. 2023; 218: 2468-79. doi: 10.1016/j.procs.2023.01.222.

12. Tripathy S, Singh R, Ray M. Automation of brain tumor identification using EfficientNet on magnetic resonance images. Proced Computer Sci. 2023; 218: 1551-60. doi: 10.1016/j.procs.2023.01.133.

13. Demir F, Akbulut Y, Taşcı B, Demir K. Improving brain tumor classification performance with an effective approach based on new deep learning model named 3ACL from 3D MRI data. Biomed Signal Process Control. 2023; 81: 104424. doi: 10.1016/j.bspc.2022.104424.

14. Kumar GM, Parthasarathy E. Development of an enhanced U-Net model for brain tumor segmentation with optimized architecture. Biomed Signal Process Control. 2023; 81: 104427. doi: 10.1016/j.bspc.2022.104427.

15. Laurent G, Craynest F, Thobois M, Hajjaji N. Automatic classification of tumor response from radiology reports with rule-based natural language processing integrated into the clinical oncology workflow. JCO Clin Cancer Inform. 2023; 7: e2200139. doi: 10.1200/cci.22.00139.

16. Yuan F, Zhang Z, Fang Z. An effective CNN and Transformer complementary network for medical image segmentation. Pattern Recognition. 2023; 136: 109228. doi: 10.1016/j.patcog.2022.109228.

17. Hsu E, Bako AT, Potter T, Pan AP, Britz GW, Tannous J, Vahidy FS. Extraction of radiological characteristics from free-text imaging reports using natural language processing among patients with ischemic and hemorrhagic stroke: algorithm development and validation. JMIR AI. 2023; 2: e42884. doi: 10.2196/42884.

18. Shastry KA, Shastry A. An integrated deep learning and natural language processing approach for continuous remote monitoring in digital health. Decis Analytics J. 2023; 8: 100301. doi: 10.1016/j.dajour.2023.100301.

19. Menze BH, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, Burren Y, Porz N, Slotboom J, Wiest R, Lanczi L, Gerstner E, Weber MA, Arbel T, Avants BB, Ayache N, Buendia P, Collins DL, Cordier N, Corso JJ, Criminisi A, Das T, Delingette H, Demiralp C, Durst CR, Dojat M, Doyle S, Festa J, Forbes F, Geremia E, Glocker B, Golland P, Guo X, Hamamci A, Iftekharuddin KM, Jena R, John NM, Konukoglu E, Lashkari D, Mariz JA, Meier R, Pereira S, Precup D, Price SJ, Raviv TR, Reza SMS, Ryan M, Sarikaya D, Schwartz L, Shin HC, Shotton J, Silva CA, Sousa N, Subbanna NK, Szekely G, Taylor TJ, Thomas OM, Tustison NJ, Unal G, Vasseur F, Wintermark M, Ye DH, Zhao L, Zhao B, Zikic D, Prastawa M, Reyes M, Van Leemput K. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging. 2014; 34(10): 1993-2024. doi: 10.1109/tmi.2014.2377694.

20. Adewole M, Rudie JD, Gbdamosi A, Toyobo O, Raymond C, Zhang D, Omidiji O, Akinola R, Suwaid MA, Emegoakor A, Ojo N, Aguh K, Kalaiwo C, Babatunde G, Ogunleye A, Gbadamosi Y, Iorpagher K, Calabrese E, Aboian M, Linguraru M, Albrecht J, Wiestler B, Kofler F, Janas A, LaBella D, Kzerooni AF, Li HB, Iglesias JE, Farahani K, Eddy J, Bergquist T, Chung V, Shinohara RT, Wiggins W, Reitman Z, Wang C, Liu X, Jiang Z, Familiar A, Van Leemput K, Bukas C, Piraud M, Conte GM, Johansson E, Meier Z, Menze BH, Baid U, Bakas S, Dako F, Fatade A, Anazodo UC. The brain tumor segmentation (BraTS) challenge 2023: glioma segmentation in Sub-Saharan Africa patient population (BraTS-Africa). ArXiv. 2023.

Corresponding author

Duaa AlSaeed can be contacted at: dalsaeed@ksu.edu.sa
