A Comparative Study of Alzheimer's Disease Classification using Multiple Transfer Learning Models

  • Received : 2019.12.11
  • Accepted : 2019.12.23
  • Published : 2019.12.31

Abstract

Over the past decade, machine learning techniques, particularly predictive algorithms and automatic pattern recognition in medical imaging, have enabled researchers to solve complex medical problems and to gain a deeper understanding of them. In this study, transfer learning is used to classify Magnetic Resonance (MR) images with a pre-trained Convolutional Neural Network (CNN). Rather than training an entire model from scratch, the transfer learning approach fine-tunes a pre-trained CNN model to classify MR images into Alzheimer's disease (AD), mild cognitive impairment (MCI) and normal control (NC). The performance of this method was evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset while varying the learning rate of the model. To demonstrate the transfer learning approach, we use several pre-trained deep learning models, namely GoogLeNet, VGG-16, AlexNet and ResNet-18, and compare their effectiveness in classifying AD. GoogLeNet achieved overall training and testing accuracies of 99.84% and 98.25%, respectively, which were higher than the training and testing accuracies of the other models.

I. INTRODUCTION

Alzheimer’s disease (AD) is a neurological brain disorder that causes permanent damage to the brain cells associated with thinking and memory. The cognitive decline caused by this disorder ultimately leads to dementia. Reports indicate that the brain changes associated with Alzheimer’s may begin 20 or more years before symptoms appear [1], and there is currently no treatment for AD [2]. According to the 2018 Alzheimer’s disease facts and figures, AD is the sixth leading cause of death in the United States, and about 5.7 million Americans are living with AD [3].

While no cure exists for the disease yet, there is consensus on the need for, and the benefit of, early diagnosis of AD. Many neurologists and medical researchers are devoting considerable time to methods for early detection of AD, and promising results continue to be achieved [4]. Many studies diagnose AD from Magnetic Resonance Imaging (MRI) data, which plays a significant role in classifying AD. Various computer-assisted techniques have been proposed that classify features extracted from the input images. These features are usually extracted from regions of interest (ROI) or volumes of interest (VOI) [5], or different extracted features are combined [6]. While most existing work has focused on binary classification, which only separates AD from normal control (NC), proper treatment requires distinguishing AD, mild cognitive impairment (MCI) and NC. MCI is a stage prior to AD in which patients show mild symptoms of AD and bear a risk of progressing to dementia [7].

Recently, machine learning techniques, particularly deep learning, have shown great potential in aiding the diagnosis of AD from MRI scans. Deep learning methods such as CNNs have been shown to outperform existing machine learning methods [8-9]. Deep learning has made massive progress in the field of image processing, mainly owing to the availability of large labeled datasets such as ImageNet, which allow models to be learned more accurately. ImageNet offers around 1.2 million natural images in 1000 distinct classes. CNNs trained on such images achieve high accuracy and can also improve medical image categorization. However, training a CNN from scratch has certain limitations, one of which is the need for a large dataset. Transfer learning is an alternative approach that overcomes this problem: it requires far less data and less training time [10,11]. Transfer learning is a machine learning method in which a pre-trained network is reused as the starting point for a model on a second task.

In this paper, we employ four pre-trained base models to illustrate how transfer learning can effectively classify AD. The main objective of this paper is to show that, instead of training a completely new model from scratch, we can apply transfer learning without any preprocessing of the MR images and still achieve high accuracy. Moreover, we compare the performance of different deep learning models, namely GoogLeNet, VGG-16, AlexNet and ResNet-18, under the transfer learning method.

 

II. Materials Used

The data used in the study were taken from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies and non-profit organizations, as a $60 million, 5-year public-private partnership. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). ADNI is the result of the efforts of many investigators, and its subjects have been recruited from over 50 sites across the U.S. and Canada.

 

All subjects were required to be 60 years of age or older. The entire image set was classified subjectively by a neurologist, a radiologist, and a psychiatrist into the categories AD, MCI, and NC. The details of the MR images used in this study are shown in Table 1.

 

Table 1. Characteristics of selected subjects from ADNI dataset.

 

Ⅲ. Classification using Transfer Learning

The proposed method exploits the transfer learning technique for 3-way classification of AD. The architecture of utilizing transfer learning is shown in Fig. 1.

 

Fig. 1. Architectural representation of transfer learning approach.

 

The transfer learning approach is helpful when only a small training dataset is available for parameter learning [12]. We take a trained network, e.g., GoogLeNet, as the starting point for learning a new task. GoogLeNet pre-trained on ImageNet is taken as the base model and trained on brain MR images from the ADNI dataset. To apply transfer learning, the fully-connected layers are removed, since their outputs cover 1000 categories, and are replaced by a new fully-connected layer followed by a softmax layer and an output layer for classifying 3 classes. Then, we train the network on the training-set MR images with the chosen training options. Next, we test the model to obtain its testing accuracy. Finally, we report the results using a confusion matrix.
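
The head replacement described above can be sketched schematically in Python. This is only an illustration under stated assumptions, not the MATLAB workflow used in the study: the layer names, the 1024-feature width, and the example scores are hypothetical, and the "network" is just a list of layer descriptors rather than a trainable model.

```python
import math

# Schematic layer list for an ImageNet-pretrained network. The layer names
# and the 1024-feature width are illustrative assumptions, not the exact
# layer names of any toolbox.
pretrained = [
    ("conv_and_inception_blocks", {"pretrained": True}),
    ("global_average_pool", {"features": 1024}),
    ("fc_1000", {"outputs": 1000}),
    ("softmax", {}),
    ("classification_output", {"classes": 1000}),
]


def replace_head(layers, num_classes):
    """Drop the last three layers (FC, softmax, output) and attach new ones."""
    body = layers[:-3]
    new_head = [
        (f"fc_{num_classes}", {"outputs": num_classes}),
        ("softmax", {}),
        ("classification_output", {"classes": num_classes}),
    ]
    return body + new_head


def softmax(logits):
    """Numerically stable softmax that turns raw scores into probabilities."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["AD", "MCI", "NC"]
finetune_net = replace_head(pretrained, num_classes=len(CLASSES))

# Made-up scores from the new 3-way layer for a single image.
probs = softmax([2.0, 0.5, -1.0])
predicted = CLASSES[probs.index(max(probs))]
```

The pre-trained body is kept unchanged and only the last three entries are swapped, which is the step that adapts a 1000-class network to the 3-class problem before fine-tuning.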

 

Ⅳ. Pre-trained CNN Architecture

4.1. Overview

Deep learning is a subfield of machine learning: a collection of algorithms that are inspired by the structure of the human brain and try to imitate its functions. A CNN is one such deep learning algorithm, in which the transformations are performed using the convolution operation. A typical CNN comprises three basic layer types: a convolutional layer, a pooling layer and a fully-connected layer. Activation, normalization and dropout layers also play a significant role in the deep architecture of a CNN model.

The convolutional layer is the core building block of a CNN and is responsible for most of the computation. It extracts the features from the input image that is to be classified [13]. Its parameters consist of a set of kernels, or learnable filters. It performs the convolution operation, or filtering, over the input and forwards the response to the next layer as a feature map [14]. The pooling layer reduces the spatial size of the representation and the amount of computation [15]. It applies the pooling operation to each slice of the input, reducing the computational cost for the next convolutional layer. Applying convolutional and pooling layers therefore extracts features from the input images and reduces their size. The objective of a fully-connected layer is to take the output feature maps of the final convolutional or pooling layer and use them to classify the image into a label.
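
As a small worked illustration of the two operations just described, the following Python sketch applies one 2×2 filter to a 5×5 input and then max-pools the resulting feature map. The input values and the filter are arbitrary examples. As in most CNN libraries, the "convolution" here is computed without flipping the kernel (strictly a cross-correlation), with no padding and a stride of 1.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; leftover rows/columns are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d_valid(image, kernel)  # 5x5 input -> 4x4 feature map
pooled = max_pool2d(feature_map)           # 4x4 feature map -> 2x2 output
```

The 5×5 input shrinks to a 4×4 feature map and then to a 2×2 pooled output, which shows how pooling cuts the amount of data passed to the next layer.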

 

4.1.1. GoogLeNet

GoogLeNet has been trained on over a million images and can classify images into 1000 object categories. It was introduced by a team at Google and won ILSVRC-2014. The network is designed with computational efficiency and practicality in mind and is 22 layers deep, as shown in Fig. 2(a). All the convolutions, including those inside the inception modules, use the rectified linear unit (ReLU) as the activation function. The receptive field of the network is 224×224 in the RGB color space with zero mean. Hence, all the images were cropped and converted to 224×224×3, which is a valid input size for the model. The distinctive feature of GoogLeNet is its nine inception modules [16]. Fig. 2(b) shows the detailed structure of an inception module.

 

4.1.2. AlexNet

The original AlexNet architecture was trained on the ImageNet dataset [17], which comprises images belonging to 1000 object classes. It was designed by Alex Krizhevsky and won ILSVRC-2012 [18]. The architecture of AlexNet is depicted in Fig. 3. It contains 8 layers: 5 convolutional layers followed by 3 fully-connected layers. AlexNet accepts input images of size 227×227 in the RGB color space. Therefore, all the images were resized and converted to meet the network's input requirements for transfer learning.

 

4.1.3. VGG-16

VGG-16 is a CNN model proposed by K. Simonyan and A. Zisserman in 2014 [19]. As the name indicates, the VGG-16 model contains 16 layers in total. Because VGG-16 was trained on RGB (3-channel) images, it accepts an input only if it has exactly 3 channels. Thus, the input to the first convolutional layer is a fixed-size 224×224 RGB image, obtained by cropping and converting the original image. Fig. 4 shows the overall architecture of the VGG-16 model.
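
The text does not say how the MR images were cropped and converted, so the following Python sketch shows only one common way to meet a 224×224×3 input requirement: center-cropping a single-channel slice and copying its intensity into three identical channels. The 256×256 input size and its gradient contents are hypothetical.

```python
def center_crop(image, size):
    """Crop a 2-D list-of-rows image to size x size around its centre."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the requested crop")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def gray_to_rgb(image):
    """Copy the single intensity channel into three identical channels."""
    return [[[pixel, pixel, pixel] for pixel in row] for row in image]


# Hypothetical 256x256 grayscale slice filled with a simple gradient.
slice_2d = [[(r + c) % 256 for c in range(256)] for r in range(256)]

net_input = gray_to_rgb(center_crop(slice_2d, 224))
shape = (len(net_input), len(net_input[0]), len(net_input[0][0]))
```

For AlexNet the same two steps would apply with a target size of 227 instead of 224.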

 

Fig. 2. Block diagram of CNN model (a) GoogLeNet architecture representing 22 layers, (b) Layers in all nine inception modules.

 

Fig. 3. Eight layers representing architecture of AlexNet model

 

4.1.4. ResNet-18

Deep residual networks (ResNet) are built on the core idea of shortcut connections, also called skip connections. These connections provide an alternate pathway for data and gradients to flow, which makes it possible to train deep networks. The simplest model is ResNet-18, which has 18 layers. The input image size of this model is 224×224×3. ResNet was developed by Kaiming He et al. [20] and won ILSVRC-2015. The detailed architecture of ResNet-18 and how its skip connections run in parallel are shown in Fig. 5.
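
The skip connection can be illustrated with a minimal Python sketch that computes relu(F(x) + x) for a small vector. This is a simplification under stated assumptions: the residual branch F here is two plain matrix multiplications with a ReLU between them, whereas the real ResNet-18 blocks use convolutions and batch normalization, and the weights below are chosen only to make the behaviour easy to see.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def linear(vector, weights):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear maps with a ReLU between."""
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all weights at zero the residual branch contributes nothing, so the
# block passes its (non-negative) input straight through the shortcut.
out = residual_block(x, zeros, zeros)
```

Because the input reaches the output even when the residual branch outputs zero, the signal and its gradient always have a direct path through the block.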

 

Fig. 4. Architectural representation of 16 layered VGG model.

 

Fig. 5. Architecture of ResNet-18 with skip connections running parallel.

 

4.2. Creating Training and Testing Datasets

The ADNI dataset of 1020 images in total was shuffled and split into training and testing sets in a 70:30 ratio. The same training and testing sets were used for 3-way classification (AD vs. MCI vs. NC) with all four networks, and they are shown in Table 2.
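
A shuffle-and-split step of this kind can be sketched in Python as follows. The identifiers are placeholders for the image files, the seed is arbitrary, and the sketch does not balance the classes; the per-class counts actually used are the ones given in Table 2.

```python
import random


def shuffle_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and cut it into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for the 1020 MR images.
image_ids = [f"img_{i:04d}" for i in range(1020)]
train_ids, test_ids = shuffle_split(image_ids)
```

Fixing the seed keeps the split reproducible, so every network is trained and tested on the same images.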

 

Table 2. Training and testing dataset used for all models.

We evaluated the performance while varying the learning rate, which controls how much the model weights change in response to the estimated error each time they are updated. Choosing the learning rate is challenging: a value that is too small may result in a long training process, whereas a value that is too large may cause the model to settle too quickly on a suboptimal set of weights or make training unstable. The learning rate also affects the accuracy of the model.
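
The effect of the learning rate can be seen on a toy problem. The Python sketch below minimizes f(w) = w² with plain gradient descent for three learning rates. The function, the step count, and the three values are illustrative assumptions; their scale is specific to this toy problem and is not comparable to the learning rates used for the networks in this study.

```python
def gradient_descent(learning_rate, steps=50, start=1.0):
    """Minimise f(w) = w**2 with plain gradient descent; return the final w."""
    w = start
    for _ in range(steps):
        grad = 2.0 * w              # derivative of w**2
        w -= learning_rate * grad
    return w


too_small = gradient_descent(1e-3)  # barely moves away from the start
moderate = gradient_descent(1e-1)   # approaches the minimum at w = 0
too_large = gradient_descent(1.1)   # overshoots further on every step
```

After the same 50 steps, the smallest rate is still close to its starting point, the moderate rate is near the minimum, and the largest rate has diverged.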

 

Ⅴ. Results and Discussion

The classification models were built in MATLAB 2018, which provides built-in support for transfer learning. The 621 training images were used with the following training options: a maximum of 100 epochs, a validation patience of 5 as the early-stopping parameter, a learning rate of 0.0001, stochastic gradient descent with momentum (SGDM) as the optimizer, which minimizes the loss function by adjusting the weights and biases, and a mini-batch size of 12. One epoch is complete when the entire training set has been passed forward and backward through the neural network once. In our case, one epoch took 51 iterations. Validation patience stops training once the validation loss has failed to improve for the given number of checks, which helps prevent the model from overfitting.
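
Two of these settings can be made concrete with a short Python sketch. The first function assumes that a partial final mini-batch is dropped, which is consistent with the 51 iterations per epoch reported for 621 images and a mini-batch size of 12. The second implements a common patience rule (stop after a given number of consecutive checks without improvement); the toolbox's exact rule may differ, and the validation losses are made up for illustration.

```python
def iterations_per_epoch(num_images, batch_size):
    """Count full mini-batches in one pass; a partial last batch is dropped."""
    return num_images // batch_size


def checks_until_stop(val_losses, patience):
    """Return how many validation checks run before early stopping triggers.

    Training stops once the validation loss has failed to improve on its
    best value `patience` times in a row.
    """
    best = float("inf")
    misses = 0
    for check, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            misses = 0
        else:
            misses += 1
            if misses >= patience:
                return check
    return len(val_losses)


steps = iterations_per_epoch(621, 12)

# Made-up validation losses: improvement stalls after the fourth check.
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.40, 0.42, 0.43, 0.41, 0.39]
stopped_at = checks_until_stop(losses, patience=5)
```

In this example training stops at the ninth check, so the later improvement to 0.39 is never reached; a larger patience trades extra training time for a lower risk of stopping too early.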

Since accuracy is the primary evaluation metric, we analyzed the training and testing accuracy while varying the learning rate from 1e-2 to 1e-5 to see how the learning rate affects the accuracy and loss of the model. The models produced the best results at a learning rate of 1e-4. Therefore, we evaluated all models with the learning rate fixed at 1e-4. The accuracy was obtained from a confusion matrix, which summarizes the performance of a classification model. The confusion matrices of GoogLeNet, AlexNet, VGG-16 and ResNet-18 are shown in Tables 3 to 6. The training and validation progress of all the models used in this paper is shown in Figs. 6-9.
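
How the overall accuracy and the per-class percentages follow from a confusion matrix can be sketched in Python. The counts below are made up purely for illustration and are not the values from Tables 3 to 6; rows are assumed to be true classes and columns predicted classes.

```python
def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


CLASSES = ["AD", "MCI", "NC"]

# Made-up counts for illustration only; rows are true classes, columns are
# predicted classes. These are NOT the values from Tables 3 to 6.
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]

accuracy = overall_accuracy(example)
recall = dict(zip(CLASSES, per_class_recall(example)))
```

The per-class values correspond to the "percentage of AD, MCI and NC correctly classified" style of reporting, while the overall accuracy pools all three classes.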

 

Fig. 6. Training and validation progress of GoogLeNet for 1e-4 learning rate.

 

Fig. 7. Training and validation progress of AlexNet for 1e-4 learning rate.

 

Fig. 8. Training and validation progress of VGG-16 for 1e-4 learning rate.

 

Fig. 9. Training and validation progress of ResNet-18 for 1e-4 learning rate.

 

Figs. 6-9 show how each model's accuracy and loss change during training. Although every model was set to train for 100 epochs, training ended after fewer epochs because of early stopping, which prevents the model from overfitting. In Fig. 6, we can clearly see that the training loss of GoogLeNet declined steadily and almost reached zero while the accuracy of the model rose. A possible reason for this is the inception modules used in GoogLeNet, which increase the depth of the model and let it learn more features from the image. The other models were also able to reduce the loss while increasing accuracy. ResNet-18 (Fig. 9) also performed well, with its loss falling to almost zero. It was more effective than AlexNet and VGG-16 owing to the skip connections used in the network.

 

Table 3. Confusion matrix obtained by testing GoogLeNet.

 

Table 4. Confusion matrix obtained by testing AlexNet

 

Table 5. Confusion matrix obtained by testing VGG-16.

 

Table 6. Confusion matrix obtained by testing ResNet-18.

 

Table 3 shows that GoogLeNet correctly classified 97.05% of AD, 100% of MCI and 97.70% of NC images, giving an overall testing accuracy of 98.25%. The other models, AlexNet, VGG-16 and ResNet-18, also achieved good accuracies, as shown in Tables 4, 5 and 6, respectively. Of these three, ResNet-18 outperformed the other two, with 92.90% of AD, 100% of MCI and 97.70% of NC correctly classified. However, GoogLeNet surpassed all the other models with the highest testing accuracy and therefore emerged as the best model for classifying AD.

 

Table 7. Comparison of different transfer learning model results.

From Table 7, we can conclude that GoogLeNet is a powerful deep learning model for classifying medical images, specifically MR images. GoogLeNet produced the highest training and testing accuracies of all the models. ResNet-18 obtained the second-highest accuracies, outperforming AlexNet and VGG-16.

 

Ⅵ. CONCLUSION

The detection of Alzheimer’s disease remains a difficult problem, yet it is important for early diagnosis and proper treatment. There is currently no treatment for AD, so early detection and classification are very important for patient care. While many classification algorithms are in use today, classification using deep learning has attracted researchers because of its flexibility and its capacity to produce good results. In the medical field, however, acquiring enough data (images) is quite difficult, and training a model from scratch is time-consuming. Transfer learning is used to overcome these problems because it requires only a small amount of data and takes a few hours to train a model that classifies MR images.

In this paper, we applied transfer learning with different deep learning models, namely GoogLeNet, AlexNet, VGG-16 and ResNet-18, as the base model for classifying MR images into three classes: AD, MCI and NC. We analyzed these models while varying the learning rate, and the performance obtained at 1e-4 was better than that obtained at the other learning rates. The models were trained well on our dataset, and all of them classified the images effectively. Among all the models, GoogLeNet produced the highest training and testing accuracies, 99.84% and 98.25%, respectively. Therefore, transfer learning using GoogLeNet is a successful approach for classifying MR images into AD, MCI and NC.

 

Acknowledgement

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2018R1D1A1A09083734).

References

  1. V. L. Villemagne, S. Burnham, P. Bourgeat and B. Brown, "Amyloid β deposition, neurodegeneration, and cognitive decline in sporadic Alzheimer's disease: a prospective cohort study," The Lancet Neurology, vol. 12, no. 4, pp. 357-367, Apr. 2013. https://doi.org/10.1016/S1474-4422(13)70044-9
  2. S. Belleville, C. Fouquet, S. Duchesne and D. L. Collins, "Detecting early preclinical Alzheimer's disease via cognition, neuropsychiatry, and neuroimaging: qualitative review and recommendations for testing," Journal of Alzheimer's Disease, vol. 42, no. 4, pp. 375-382, Jan. 2014. https://doi.org/10.3233/JAD-141470
  3. J. Gaugler, B. James, T. Johnson, A. Marin and J. Weuve, "2019 Alzheimer's disease facts and figures," Alzheimer's & Dementia, vol. 15, no. 3, pp. 321-387, Mar. 2019. https://doi.org/10.1016/j.jalz.2019.01.010
  4. M. Tabaton, P. Odetti, S. Cammarata and R. Borghi, "Artificial neural networks identify the predictive values of risk factors on the conversion of amnestic mild cognitive impairment," Journal of Alzheimer's Disease, vol. 19, no. 3, pp. 1035-1040, Jan. 2010. https://doi.org/10.3233/JAD-2010-1300
  5. J. B. Toledo, M. Bjerke, K. W. Chen and M. Rozycki, "Memory, executive, and multidomain subtle cognitive impairment: Clinical and biomarker findings," Neurology, vol. 85, no. 2, pp. 144-153, Jul. 2015. https://doi.org/10.1212/WNL.0000000000001738
  6. N. Madusanka, H. K. Choi, J. H. So and B. K. Choi, "Alzheimer's Disease Classification Based on Multi-feature Fusion," Current Medical Imaging, vol. 15, no. 2, pp. 161-169, Feb. 2019. https://doi.org/10.2174/1573405614666181012102626
  7. S. Gauthier, B. Reisberg, M. Zaudig, R. C. Petersen, K. Ritchie, K. Broich, et al., "Mild cognitive impairment," The Lancet, vol. 367, no. 9518, pp. 1262-1270, Apr. 2006. https://doi.org/10.1016/S0140-6736(06)68542-5
  8. Y. LeCun, Y. Bengio and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, May 2015. https://doi.org/10.1038/nature14539
  9. A. Ortiz, J. Munilla, J. M. Gorriz and J. Ramirez, "Ensembles of deep learning architectures for the early diagnosis of the Alzheimer's disease," International Journal of Neural Systems, vol. 26, no. 07, pp. 1650025, Nov. 2016. https://doi.org/10.1142/S0129065716500258
  10. H. C. Shin, H. R. Roth and M. Gao, "Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285-1298, Feb. 2016. https://doi.org/10.1109/TMI.2016.2528162
  11. K. H. Jin, M. T. McCann and E. Froustey, "Deep Convolutional Neural Network for Inverse Problems in Imaging," IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509-4522, Jun. 2017. https://doi.org/10.1109/TIP.2017.2713099
  12. S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345-1359, Oct. 2009. https://doi.org/10.1109/TKDE.2009.191
  13. M. Xin and Y. Wang, "Research on image classification model based on deep convolution neural network," EURASIP Journal on Image and Video Processing, no. 1, pp. 40, Dec. 2019. https://doi.org/10.1186/s13640-020-00526-2
  14. J. Fang, Y. Zhou, Y. Yu and S. Du, "Fine-grained vehicle model recognition using a coarse-to-fine convolutional neural network architecture," IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 7, pp. 1782-1792, Nov. 2016. https://doi.org/10.1109/TITS.2016.2620495
  15. J. Bouvrie, "Notes on convolutional neural networks," In Practice, 2006.
  16. C. Szegedy, L. Wei, J. Yangqing and S. Pierre, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.
  17. J. Deng, W. Dong, R. Socher, L. J. Li, K. Li and F. F. Li, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255, Jun. 2009.
  18. A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
  19. K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, Sep. 2014.
  20. K. He, X. Zhang, S. Ren and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.