AUTOMATIC DETECTION OF HOMOPHOBIC SPEECH USING MACHINE LEARNING
DOI: https://doi.org/10.56238/arev7n5-029
Keywords: Machine Learning. Hate Speech. Social Networks. Data Mining. Natural Language Processing.
Abstract
This research explores machine learning models for detecting homophobic hate speech on social networks, a relevant problem in the digital age because of its negative impact on the LGBTQIA+ community. The overall objective is to train predictive models capable of identifying homophobic speech efficiently, contributing to the fight against hate speech and promoting a safer virtual environment. The CRISP-DM methodology was used, applying five of its phases: business understanding, data understanding, data preparation, modeling, and evaluation. Six models were trained: Decision Tree, Random Forest, Extra Trees, Passive Aggressive, eXtreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM). The models were evaluated using accuracy, precision, recall, and F1-score, together with analysis of the confusion matrix and the Receiver Operating Characteristic (ROC) curve. The SVM model had the best overall performance, with an accuracy of 87.10%, a precision of 79.15%, and an area under the ROC curve of 0.9227, highlighting its effectiveness in minimizing false positives. The results highlight the potential of machine learning models in identifying hate speech and contribute to the construction of safer and more inclusive digital environments.
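To make the evaluation metrics named above concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be derived from the four cells of a binary confusion matrix. It is not the code used in this study: the names `ConfusionCounts` and `classification_metrics` are hypothetical, and the counts in the example are invented for illustration and are not the reported results. The positive class is assumed to be "homophobic speech".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = homophobic speech)."""
    tp: int  # homophobic posts correctly flagged
    fp: int  # non-homophobic posts wrongly flagged
    fn: int  # homophobic posts that were missed
    tn: int  # non-homophobic posts correctly left unflagged


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall and F1-score from confusion counts."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = safe_div(c.tp, c.tp + c.fp)  # falls as false positives grow
    recall = safe_div(c.tp, c.tp + c.fn)     # falls as false negatives grow
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {
        "accuracy": safe_div(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (assumption); not data from this study.
example = ConfusionCounts(tp=80, fp=20, fn=10, tn=90)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because precision is TP / (TP + FP), a model with high precision raises few false alarms, which is the sense in which precision relates to minimizing false positives.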