THE MARKER: AN ARTIFICIAL-INTELLIGENCE-ASSISTED IMAGE ANNOTATION TOOL
DOI: https://doi.org/10.56238/levv16n54-174
Keywords: Machine Learning, Computer Vision, Image Labeling
Abstract
The main objective of this final-year undergraduate project is to develop The Marker, an artificial-intelligence-assisted image annotation tool for creating custom datasets for computer vision applications. The study draws on concepts from artificial intelligence, machine learning, deep neural networks, and ergonomics. It highlights the importance of image annotation in building effective computational models, as well as the physical impacts associated with repetitive tasks, such as repetitive strain injuries (LER), work-related musculoskeletal disorders (DORT), and computer vision syndrome. The methodology involved developing a modular application composed of a graphical interface in React, processing in Rust, execution of the Segment Anything Model through Python scripts, and secure storage with AES-GCM encryption. Experimental tests were carried out to evaluate accuracy, inference time, the number of manual interactions required, and system performance at different image resolutions. The results indicate that the tool significantly reduces manual effort by automatically suggesting segmentation points, works offline and on machines with lower processing power, offers an improved ergonomic experience, and shows potential to accelerate the collaborative creation of visual databases.
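As an illustration of how two of the evaluated quantities (accuracy and inference time) could be measured, the following is a minimal sketch in Python. The abstract does not state which accuracy metric was used, so intersection over union (IoU) is an assumption here, chosen because it is a common measure for segmentation masks. The function names `mask_iou` and `timed`, and the small example masks, are hypothetical and do not come from The Marker itself.

```python
import time
from typing import Callable, List, Tuple

Mask = List[List[int]]  # binary mask: 1 = object pixel, 0 = background


def mask_iou(predicted: Mask, reference: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    if len(predicted) != len(reference) or any(
        len(p) != len(r) for p, r in zip(predicted, reference)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            intersection += 1 if (p and r) else 0
            union += 1 if (p or r) else 0
    # Two empty masks agree perfectly, so their IoU is defined as 1.0 here.
    return intersection / union if union else 1.0


def timed(segment: Callable[[], Mask]) -> Tuple[Mask, float]:
    """Run a segmentation callable and return its mask and elapsed seconds."""
    start = time.perf_counter()
    mask = segment()
    return mask, time.perf_counter() - start


# Hypothetical 3x4 masks standing in for a model suggestion and a manual label.
suggested = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
manual = [[0, 1, 1, 0], [0, 1, 1, 1], [0, 0, 0, 0]]

# The lambda is a placeholder for a real call into the segmentation model.
mask, seconds = timed(lambda: suggested)
print(f"IoU = {mask_iou(mask, manual):.2f}, inference time = {seconds:.6f} s")
```

In a real evaluation, the placeholder lambda would be replaced by the call that runs the segmentation model, and the measurement would be repeated across images at each resolution under test.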