CLASSIFICATION OF OBESITY LEVELS USING MACHINE LEARNING MODELS: A COMPARISON OF RANDOM FOREST, SVM, AND LOGISTIC REGRESSION FROM A CLINICAL ARTIFICIAL INTELLIGENCE PERSPECTIVE
DOI:
https://doi.org/10.56238/arev7n12-107
Keywords:
Machine Learning, Artificial Intelligence, Obesity, Random Forest, Digital Health
Abstract
The global rise in obesity has intensified the need for analytical tools that can improve diagnosis and risk stratification. This study evaluates three Machine Learning models (Random Forest, Support Vector Machine, and Multinomial Logistic Regression) for classifying obesity levels in adults. The pipeline comprises preprocessing, imputation, categorical encoding, normalization, cross-validation, and multi-criteria evaluation. Interpretability is addressed with Permutation Importance, which quantifies the impact of each variable on the macro-averaged F1 score from a clinical Artificial Intelligence perspective. A clinical baseline that relies solely on the Body Mass Index was also implemented. The results show that Random Forest achieves the best performance, outperforming both the baseline and the other two models. These findings reinforce the potential of Machine Learning as a decision-support tool in digital health.
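As an illustration of the BMI-only clinical baseline and the macro-averaged F1 metric mentioned above, the following is a minimal sketch, not the study's implementation. It assumes the standard WHO adult BMI cut-offs; the class names, the function names (`bmi_baseline`, `macro_f1`), and the sample values are hypothetical and may differ from the labels and data used in the study.

```python
# Hypothetical sketch of a BMI-only baseline classifier and macro-F1 scoring.
# Cut-offs follow the WHO adult BMI categories (kg/m^2); the label names are
# illustrative assumptions, not the dataset's actual class names.
WHO_BMI_UPPER_BOUNDS = [
    (18.5, "underweight"),
    (25.0, "normal"),
    (30.0, "overweight"),
    (35.0, "obesity_I"),
    (40.0, "obesity_II"),
]


def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def bmi_baseline(weight_kg, height_m):
    """Assign an obesity level using only the BMI thresholds above."""
    value = bmi(weight_kg, height_m)
    for upper_bound, label in WHO_BMI_UPPER_BOUNDS:
        if value < upper_bound:
            return label
    return "obesity_III"  # BMI of 40 or more


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up (weight_kg, height_m) pairs, for demonstration only.
    people = [(50, 1.80), (70, 1.75), (95, 1.75), (130, 1.70)]
    predictions = [bmi_baseline(w, h) for w, h in people]
    print(predictions)
```

Because macro-F1 gives every class the same weight regardless of its frequency, it is a common choice for comparing a threshold rule such as this one against trained classifiers on a multi-class problem.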