Analysis of Medical and Health-Related Data about Adult Obesity using Supervised and Unsupervised Learning
G. Ispirova, T. Eftimov, B. Koroušić Seljak
15th International Conference on Informatics and Information Technologies CIIT 2018
Mavrovo, Macedonia, 20-22 April, 2018
Obesity is a growing problem in most developed countries and is responsible for a significant degree of morbidity and mortality. Overweight and obesity are linked to more deaths worldwide than underweight, and globally more people are obese than underweight. This paper works with medical and health-related data on adult obesity, applying several machine learning algorithms to extract information from the collected data. To get the most out of the data, two machine learning techniques are used: supervised and unsupervised learning. Adequate labeling of the data is introduced, and classification is selected as the supervised learning technique. Various classification algorithms are applied using 10-fold cross-validation, and a comparison of the results is presented. Clustering is used as the unsupervised technique, with the goal of identifying groups of participants that exhibit similar behavior in terms of the results from the programs. For this purpose, we applied a k-medoids clustering algorithm known as PAM (Partitioning Around Medoids) and a visualization technique to represent the detected clusters. The paper covers data extraction, data preprocessing, classification, clustering, and a summary of the results.
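As an illustration of the k-medoids idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It uses a simplified swap search in the spirit of PAM, with a naive initialisation (the first k points) in place of PAM's BUILD step. The function names and the toy two-dimensional points are hypothetical and unrelated to the paper's data.

```python
import math

def total_cost(points, medoids, dist):
    # Sum of distances from every point to its nearest medoid.
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def k_medoids(points, k, dist=math.dist):
    """Simplified swap-based k-medoids; returns (medoid indices, labels)."""
    medoids = list(range(k))  # assumption: naive start, not PAM's BUILD step
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing the i-th medoid with a non-medoid point.
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    # Assign each point to the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels

# Made-up toy points forming two well-separated groups.
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
medoids, labels = k_medoids(toy_points, k=2)
print(medoids, labels)
```

Unlike k-means, each cluster centre here is an actual observation, which can make the clusters easier to interpret when the observations are study participants.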