THE MARKER: ARTIFICIAL INTELLIGENCE-ASSISTED IMAGE MARKING TOOL
DOI:
https://doi.org/10.56238/levv16n54-174
Keywords:
Machine Learning, Computer Vision, Labeling
Abstract
This undergraduate thesis presents the development of The Marker, an AI-assisted image-annotation tool designed to create customized datasets for computer-vision applications. The study is grounded in concepts of artificial intelligence, machine learning, deep neural networks, and ergonomics. It emphasizes the importance of image annotation in building effective computational models, as well as the physical conditions associated with repetitive annotation work: repetitive strain injury (RSI), work-related musculoskeletal disorders (WMSDs), and Computer Vision Syndrome. The methodology involved developing a modular application composed of a React graphical interface, processing modules written in Rust, execution of the Segment Anything Model (SAM) via Python scripts, and secure storage with AES-GCM encryption. Experimental tests evaluated accuracy, inference time, the number of manual interactions required, and system performance across different image resolutions. The results indicate that the tool significantly reduces manual effort by automatically suggesting segmentation points, operates offline and on lower-powered machines, provides an improved ergonomic experience, and shows strong potential to accelerate the collaborative creation of visual datasets.
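The abstract does not state which accuracy metric the experimental tests used. As an illustration only, the sketch below assumes Intersection over Union (IoU) between a predicted mask and a reference mask, and measures inference time with a wall-clock timer. The function `threshold_segmenter` is a hypothetical stand-in for the SAM call, and the names `iou` and `timed` are likewise invented for this example; none of them come from the thesis.

```python
import time
from typing import Callable, List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask stored as rows of 0/1 values


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over Union of two equally sized binary masks."""
    if len(pred) != len(truth) or any(
        len(p) != len(t) for p, t in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # By convention, two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def timed(fn: Callable, *args) -> Tuple[object, float]:
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def threshold_segmenter(image: Mask, cutoff: int = 128) -> List[List[int]]:
    """Hypothetical stand-in for the segmentation model (not SAM itself):
    marks every pixel whose intensity is at least `cutoff`."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]


# Tiny 3x3 grayscale image and a hand-made reference mask (assumed data).
image = [[0, 200, 210], [0, 190, 50], [0, 0, 0]]
truth = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]

mask, seconds = timed(threshold_segmenter, image)
score = iou(mask, truth)
print(f"IoU={score:.2f}, inference time={seconds * 1000:.3f} ms")
```

Here the stand-in segmenter misses one pixel of the reference region, so three of the four pixels in the union overlap and the IoU is 0.75. Repeating the same measurement across several image resolutions would give the kind of accuracy-versus-time comparison the abstract describes.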