:: Volume 5, Issue 1 (Spring 2018) ::
2018, 5(1): 25-34
Diagnosis of Leukemia Type by Machine Learning: Dimension Reduction and Balancing
Mohamadreza Pajoohan, Zeinab Gharaati
Ph.D. in Computer Engineering, Assistant Professor, Department of Computer Engineering, Yazd University, Yazd, Iran.
Abstract:
Introduction: The combination of artificial intelligence and data mining has led to considerable progress in the prevention and diagnosis of diseases. Complex models have been proposed for diagnosing acute leukemia from genetic information, but they have not achieved significant results. This study aimed to predict the type of blood cancer by examining a wide range of parametric and non-parametric methods, and to improve the generalization of learning by extracting a smaller set of essential features.
Methods: This descriptive and analytical study used the Leukemia1 dataset from Vanderbilt University, USA. The dataset contains bone marrow and blood samples from patients with leukemia and is used for classification into three subgroups of leukemia: ALL B-cell, ALL T-cell, and AML. Parametric classification (linear algorithms, Naïve Bayes, Euclidean distance, nearest mean, and template matching) and non-parametric classification (basic estimator algorithms, kernel, k-nearest neighbors, and kernel-based k-nearest neighbors) were applied.
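To illustrate the nearest mean rule named among the parametric methods, the following is a minimal Python sketch, not the authors' implementation. The function names `fit_class_means` and `predict_nearest_mean` are hypothetical, the two-feature data are invented for illustration (they are not drawn from the Leukemia1 dataset), and Euclidean distance is assumed as the distance measure.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available in Python 3.8+


def fit_class_means(samples, labels):
    """Return {label: mean feature vector} computed from the training samples."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict_nearest_mean(class_means, features):
    """Assign the label whose class mean is closest to the sample."""
    return min(class_means, key=lambda label: dist(class_means[label], features))


# Toy data, invented for illustration only (assumption, not from the study).
train_x = [[1.0, 1.2], [0.8, 1.0], [4.0, 4.2], [4.2, 3.8], [8.0, 1.0], [7.6, 1.4]]
train_y = ["ALL B-cell", "ALL B-cell", "ALL T-cell", "ALL T-cell", "AML", "AML"]

means = fit_class_means(train_x, train_y)
prediction = predict_nearest_mean(means, [7.9, 1.1])
print(prediction)
```

The classifier stores only one mean vector per class, which is one reason such a simple model can remain stable when samples are few relative to the number of features.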
Results: Using all features, the best method was the nearest mean classifier, with an accuracy of 92.86%. With PCA feature reduction, the nearest mean algorithm again gave the best result, reaching an accuracy of 96% with an average of 6.8 features. Finally, combining data-balancing methods with the quadratic algorithm yielded an accuracy of 98.59% with an average of 5.41 features.
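The abstract does not state which data-balancing methods were used. As one common possibility, the sketch below shows random oversampling in Python, where minority classes are resampled with replacement until every class matches the largest one. The function name `random_oversample`, the seed, and the toy data are assumptions for illustration and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    target = max(len(rows) for rows in grouped.values())

    balanced_samples, balanced_labels = [], []
    for label, rows in grouped.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            balanced_samples.append(features)
            balanced_labels.append(label)
    return balanced_samples, balanced_labels


# Imbalanced toy data, invented for illustration only (assumption).
x = [[0.1], [0.2], [0.3], [0.4], [1.1], [1.2], [2.5]]
y = ["ALL B-cell"] * 4 + ["ALL T-cell"] * 2 + ["AML"]

balanced_x, balanced_y = random_oversample(x, y)
counts = Counter(balanced_y)
print(counts)
```

When a step like this is combined with cross-validation, it is usually applied to the training folds only, so that duplicated samples do not leak into the test data.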
Conclusion: The results show that extracting essential features effectively improves the accuracy of Bayes-based models and makes them preferable to the existing complex models.
 
Keywords: Genetic data, Diagnosis of blood cancer type, Data mining, Data balancing, Dimension reduction.
Type of Study: Original Article | Subject: Data Mining
Received: 2017/11/16 | Accepted: 2018/05/07
Pajoohan M, Gharaati Z. Diagnosis of Leukemia Type by Machine Learning: Dimension Reduction and Balancing. Journal of Health and Biomedical Informatics. 2018; 5(1): 25-34
URL: http://jhbmi.ir/article-1-251-en.html


Journal of Health and Biomedical Informatics