Abstract:
Presumptive treatment and self-medication for malaria are
common in resource-limited countries. However, these
approaches are considered unreliable because they lead to unnecessary use of antimalarial medication. This study demonstrates the use of supervised machine learning models to diagnose
malaria from patient symptoms and demographic features.
A malaria diagnosis dataset was extracted from two regions of Tanzania:
Morogoro and Kilimanjaro. Important features were selected to
improve model performance and reduce processing time.
Machine learning classifiers were trained and validated using
k-fold cross-validation. The dataset, comprising 2556 patient
records with 36 features, was used to develop a machine
learning model for malaria diagnosis. The ranking of features
differed between the two regions and in the combined dataset.
The significant features selected were residence area, fever,
age, general body malaise, visit date, and headache.
Random Forest was the best classifier with
an accuracy of 95% in Kilimanjaro, 87% in Morogoro and 82% in
the combined dataset. Based on clinical symptoms and demographic features, a region-specific malaria prediction model
was developed to demonstrate relevant machine learning classifiers. The selected features are useful for predicting
the disease.
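
The workflow summarized above (feature ranking, a Random Forest classifier, and k-fold cross-validation) can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic stand-ins generated with scikit-learn, and the number of folds (5), the stratified splitting, the number of trees, and the use of impurity-based importances for ranking are all assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for patient records (NOT the study's data):
# 36 features, of which 6 are informative, with a binary outcome.
X, y = make_classification(
    n_samples=500, n_features=36, n_informative=6, random_state=0
)

# Assumed settings: 100 trees and 5 stratified folds.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy value per fold.
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print("Mean cross-validated accuracy:", scores.mean())

# Rank features by impurity-based importance (one possible ranking
# method; the abstract does not state which was used).
clf.fit(X, y)
importances = clf.feature_importances_
ranking = sorted(range(X.shape[1]), key=lambda i: importances[i], reverse=True)
top_features = ranking[:6]
print("Indices of the six highest-ranked features:", top_features)
```

Running the same procedure separately on each region's records and on the combined records would allow the feature rankings and accuracies to be compared across regions, as the abstract describes.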