A Machine Learning Methodology for Diagnosing Chronic Kidney Disease
Keywords: (UCI)
Abstract
Chronic kidney disease (CKD) is a global health problem with high morbidity and mortality
rates, and it induces other diseases. Because CKD has no obvious symptoms in its early
stages, patients often fail to notice the disease. Early detection of CKD enables patients to
receive timely treatment that slows the progression of the disease. Machine learning models
can help clinicians achieve this goal because of their fast and accurate recognition
performance. In this study, we propose a machine learning methodology for diagnosing CKD.
The CKD data set was obtained from the University of California Irvine (UCI) machine
learning repository and contains a large number of missing values. Random forest imputation
was used to fill in the missing values: for each incomplete sample, it selects several complete
samples with the most similar measurements and uses them to fill in the missing data. Missing
values are common in real-life medical settings because patients may miss some measurements
for various reasons. After the incomplete data set was filled in, seven machine learning
algorithms (logistic regression, random forest, support vector machine, decision tree, naïve
Bayes, k-nearest neighbors (KNN), and gradient boosting classifier) were used to establish
models. The stage of the disease is also predicted according to the age of the person.
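
The imputation step described above, filling each incomplete sample from the complete samples whose measurements are most similar to it, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes all features are numeric and on comparable scales, and it measures similarity with plain Euclidean distance over the observed features instead of the random-forest-based similarity the abstract names. The function name `impute_from_similar`, the parameter `k`, and the toy values are hypothetical and are not taken from the UCI data set.

```python
import math


def impute_from_similar(samples, k=3):
    """Fill None entries using the k most similar complete samples.

    Assumption of this sketch: similarity is Euclidean distance computed
    only over the features that the incomplete sample actually has.
    """
    complete = [s for s in samples if None not in s]
    if not complete:
        raise ValueError("need at least one complete sample")

    filled = []
    for s in samples:
        if None not in s:
            filled.append(list(s))
            continue

        # Indices of the measurements this sample does have.
        observed = [i for i, v in enumerate(s) if v is not None]

        def distance(c, s=s, observed=observed):
            return math.sqrt(sum((s[i] - c[i]) ** 2 for i in observed))

        # The k complete samples closest to the incomplete one.
        neighbours = sorted(complete, key=distance)[:k]

        # Keep observed values; replace each gap with the neighbours' mean.
        row = [
            v if v is not None
            else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(s)
        ]
        filled.append(row)
    return filled


# Toy data (made up for illustration): the last row lacks its second value.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
    [1.0, None, 3.1],
]
result = impute_from_similar(data, k=2)
```

With `k=2`, the gap in the last row is filled from the first two rows, which are closest to it on the observed features, and the distant third row is ignored. In practice, categorical attributes would need a different similarity measure and a mode rather than a mean, which this sketch does not cover.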