Sentiment Analysis of Social Media Data using Machine Learning
John Milton V
Paper Contents
Abstract
With the exponential growth of social media, vast amounts of unstructured text are generated daily, reflecting public opinion on diverse issues. Sentiment analysis provides a powerful way to mine this data for insights. This study focuses on sentiment classification of Twitter and Reddit datasets using supervised machine learning techniques. Preprocessing involved tokenization, stop-word removal, lemmatization, and TF-IDF vectorization. The performance of Support Vector Machines (SVM), Random Forest (RF), and Long Short-Term Memory (LSTM) networks was evaluated. In the experiments, SVM achieved the highest accuracy (87.4%), followed by RF (83.9%) and LSTM (82.6%). These findings indicate that traditional machine learning methods remain competitive with deep learning models for sentiment analysis tasks when datasets are balanced and feature engineering is optimized. The study demonstrates the importance of model selection and preprocessing in designing robust sentiment analysis systems.
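The preprocessing steps named in the abstract can be illustrated with a minimal sketch using only the Python standard library. It is not the study's implementation. The stop-word list, the sample posts, and the function names `tokenize` and `tfidf` are assumptions made for illustration, lemmatization is omitted because it needs an external lexicon, and the TF-IDF weighting shown is one common variant that may differ from the one used in the study.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the study does not list one).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "of", "to", "i"}


def tokenize(text):
    """Lowercase the text, extract alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf = term count / document length, idf = ln(N / document frequency).
    This is one common variant, not necessarily the study's.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical sample posts, not taken from the study's datasets.
posts = [
    "I love this phone",
    "This phone is terrible",
    "Love the battery life",
]
vectors = tfidf(posts)
```

In this sketch, a term that appears in only one post (such as "terrible") receives a higher weight than a term shared across posts (such as "phone"), which is the property that makes TF-IDF features useful as input to classifiers like SVM or RF.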
Copyright
Copyright © 2025 John Milton V. This is an open access article distributed under the Creative Commons Attribution License.