AURA: A Graph Neural Network-Enhanced Semantic Plagiarism Detection System with Sentence-Level Analysis
Ch Mohan
Paper Contents
Abstract
Upholding academic integrity is becoming increasingly difficult as digital content proliferates and paraphrasing technology advances. Conventional plagiarism detection software, which relies on exact text matches, frequently misses content that has been textually transformed but semantically preserved through paraphrasing or summarization. This work presents AURA (Academic research assistant Using Relationship Analysis), a system that identifies both direct copying and sophisticated paraphrasing by combining Graph Neural Networks (GNNs) with transformer-based semantic embeddings. Our solution applies a GraphSAGE architecture to a heterogeneous graph of papers, authors, and academic terms to capture the relations among them. A second main contribution is a strict scoring function that compensates for legitimate overlap in scholarly language, together with a sentence-level analysis feature similar to that of commercial platforms such as Turnitin. When tested on a corpus of 2,541 arXiv papers, AURA exhibited a considerably lower false positive rate (18% for unrelated documents) than baseline systems (more than 60%). The system detects more than 95% of identical matches and more than 85% of deeply paraphrased text, and it offers detailed, actionable feedback at the sentence level. Our hybrid approach, which weights text embeddings at 70% and graph-aware embeddings at 30%, balances semantic stability against context sensitivity. AURA is implemented as an open-source web application that aims to democratize access to advanced plagiarism detection tools, which have so far been reserved for costly commercial platforms.
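To make the 70/30 hybrid weighting concrete, the following is a minimal sketch rather than AURA's actual implementation. The abstract does not say whether the weights are applied to the embeddings themselves or to their similarity scores; this sketch assumes a weighted sum of two cosine similarities. The function names `cosine` and `hybrid_similarity` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hybrid_similarity(text_a, text_b, graph_a, graph_b,
                      text_weight=0.7, graph_weight=0.3):
    """Assumed combination rule: weighted sum of text and graph cosine similarities.

    The 0.7 / 0.3 defaults mirror the weighting stated in the abstract.
    """
    return (text_weight * cosine(text_a, text_b)
            + graph_weight * cosine(graph_a, graph_b))


# Toy example: identical text embeddings, orthogonal graph embeddings.
score = hybrid_similarity([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
print(round(score, 3))
```

Under this assumption, two documents with identical text embeddings but unrelated graph context score 0.7 rather than 1.0, which shows how the graph signal can temper the text signal without overriding it.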
Copyright
Copyright © 2025 Ch Mohan. This is an open access article distributed under the Creative Commons Attribution License.