Improving Online Safety with Machine Learning-Based Phishing Detection
Bandi Veera Venkata Satyanarayana
Abstract
Phishing websites pose a critical threat to online security by mimicking legitimate websites to deceive users. This project aims to design and develop a machine learning-based framework for detecting phishing websites. The proposed framework will collect phishing URL data from PhishTank and legitimate URL data from the University of New Brunswick. A dataset of 10,000 URLs will be created, and three groups of features will be extracted for analysis: address bar-based, domain-based, and HTML and JavaScript-based characteristics. The project will begin by preprocessing the dataset, splitting it into training and testing sets, and selecting appropriate supervised machine learning models, including Random Forest, Support Vector Machines, and XGBoost. These models will be trained on the dataset and combined in a stacked ensemble, and their performance will be evaluated using metrics such as accuracy and precision to identify the most suitable classifier for phishing detection. The action plan includes developing a robust classification system that uses a high-performing machine learning algorithm to accurately classify URLs as phishing or legitimate. The framework will also consider future extensions, such as a browser extension or a user-friendly GUI built with Flask, that use the trained model for real-time phishing detection. This project will provide a structured and scalable solution to mitigate the risks associated with phishing attacks and enhance cybersecurity.
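As an illustration of the address bar-based feature group mentioned in the abstract, the following minimal Python sketch derives a few URL-level features using only the standard library. The abstract does not list the individual features, so the feature names, the `SHORTENERS` set, and the helper functions `host_is_ip` and `address_bar_features` are all hypothetical choices made for this example, not the project's actual feature set.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical list of URL-shortening hosts; an assumption for illustration,
# not a list taken from the paper.
SHORTENERS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co"}


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def address_bar_features(url: str) -> dict:
    """Extract simple address-bar features from a URL string.

    Each feature is an integer so the resulting dict can be turned
    directly into a numeric row for a supervised classifier.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Non-empty path segments, e.g. "/a/b" -> ["a", "b"].
    segments = [part for part in parsed.path.split("/") if part]
    return {
        "has_ip_host": int(host_is_ip(host)),
        "has_at_symbol": int("@" in url),
        "url_length": len(url),
        "path_depth": len(segments),
        "has_double_slash_in_path": int("//" in parsed.path),
        "uses_https": int(parsed.scheme == "https"),
        "uses_shortener": int(host in SHORTENERS),
        "has_hyphen_in_host": int("-" in host),
        "dot_count_in_host": host.count("."),
    }


if __name__ == "__main__":
    for sample in ("https://www.example.com/a/b", "http://bit.ly/xyz"):
        print(sample, address_bar_features(sample))
```

Rows produced this way could be stacked into a feature matrix and passed, together with the phishing or legitimate labels, to the Random Forest, Support Vector Machine, and XGBoost models named in the abstract. The domain-based and HTML and JavaScript-based feature groups would need additional data sources (for example, domain registration records and page content) and are not covered by this sketch.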
Copyright
Copyright © 2025 Bandi Veera Venkata Satyanarayana. This is an open access article distributed under the Creative Commons Attribution License.