CLASSIFICATION OF OBESITY LEVELS BY MACHINE LEARNING MODELS: A COMPARISON OF RANDOM FOREST, SVM, AND LOGISTIC REGRESSION FROM A CLINICAL ARTIFICIAL INTELLIGENCE PERSPECTIVE
DOI:
https://doi.org/10.56238/arev7n12-107
Keywords:
Machine Learning, Artificial Intelligence, Obesity, Random Forest, Digital Health
Abstract
The rising global prevalence of obesity has driven the development of analytical tools that can improve diagnosis and risk stratification. This study investigates three Machine Learning models (Random Forest, Support Vector Machine, and Multinomial Logistic Regression) for classifying obesity levels in adults. The proposed pipeline comprises preprocessing, imputation, categorical encoding, normalization, cross-validation, and multi-criteria evaluation. For interpretability, the study applies Permutation Importance to quantify each variable's impact on the macro-averaged F1 score (F1-macro), from the perspective of Artificial Intelligence applied to health. A clinical baseline based exclusively on BMI (Body Mass Index) was also implemented, so that traditional statistical methods can be compared with supervised approaches. The results show that Random Forest performed best, significantly outperforming both the clinical baseline and the other models. These findings highlight the potential of Machine Learning as an auxiliary tool in digital health, offering more robust predictions than simplified rules.
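The abstract does not specify how the BMI-only baseline or the F1-macro metric were implemented. The following is a minimal sketch, not the authors' code, of what such a rule-based baseline and its scoring could look like. It assumes the standard WHO adult BMI cut-offs (18.5, 25, 30, 35, 40 kg/m²); the class names, the helper functions `bmi_baseline` and `f1_macro`, and the sample records are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a BMI-rule baseline scored with macro-averaged F1.
# The cut-offs are the WHO adult BMI categories; the label names and the
# sample records are hypothetical, not taken from the study.

# (upper bound in kg/m^2, label); a BMI below the bound gets that label.
BMI_CLASSES = [
    (18.5, "underweight"),
    (25.0, "normal"),
    (30.0, "overweight"),
    (35.0, "obesity_I"),
    (40.0, "obesity_II"),
]


def bmi_baseline(weight_kg, height_m):
    """Classify one person using BMI = weight / height^2 and fixed cut-offs."""
    value = weight_kg / height_m ** 2
    for upper, label in BMI_CLASSES:
        if value < upper:
            return label
    return "obesity_III"  # BMI of 40 or more


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical records: (weight in kg, height in m, reference label).
records = [
    (50.0, 1.70, "underweight"),
    (65.0, 1.70, "normal"),
    (80.0, 1.70, "overweight"),
    (95.0, 1.70, "obesity_I"),
    (86.0, 1.70, "obesity_I"),   # BMI is about 29.8, so the rule says "overweight"
    (120.0, 1.70, "obesity_III"),
]

y_true = [label for _, _, label in records]
y_pred = [bmi_baseline(w, h) for w, h, _ in records]
score = f1_macro(y_true, y_pred)
print(round(score, 4))  # 0.8667
```

In this toy data the single disagreement lowers F1 for two classes at once ("overweight" gains a false positive, "obesity_I" a false negative), which is why macro averaging is sensitive to errors near class boundaries. A supervised model would be scored with the same `f1_macro` function to make the comparison with the rule-based baseline direct.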