Efficient Detection of Duplicate Question Pairs Using Machine Learning and NLP Techniques

Ms. Vaishali Bajpai

Paper Contents

Abstract

This paper presents the development and implementation of a system for detecting duplicate question pairs using machine learning and natural language processing (NLP) techniques. Given the proliferation of forums and Q&A sites, detecting duplicate questions efficiently is crucial to the quality and usability of such platforms. The goal is to devise a model that correctly identifies whether two input questions are semantically equivalent. Text preprocessing applies several NLP techniques, including tokenization, stemming, and lemmatization, followed by vectorization using methods such as TF-IDF. Beyond basic preprocessing, advanced features are extracted, including n-grams, cosine similarity, and keywords. The feature set is further enriched with similarity ratios for question pairs computed using the FuzzyWuzzy library. Models are then built using Logistic Regression, Support Vector Machines, Random Forest, and Gradient Boosting. The paper compares these models in detail to identify the best-performing one. The evaluation metrics are accuracy, precision, recall, and F1-score. Hyperparameter tuning and cross-validation are used to optimize model performance.
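
The pairwise similarity features mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It uses only the standard library, with a hand-rolled smoothed TF-IDF and difflib.SequenceMatcher standing in for the FuzzyWuzzy similarity ratios. All function names (tokenize, build_idf, tfidf_vector, pair_features) and the sample questions are hypothetical, chosen for illustration.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(questions):
    """Smoothed inverse document frequency computed over a list of questions."""
    n_docs = len(questions)
    doc_freq = Counter()
    for question in questions:
        doc_freq.update(set(tokenize(question)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector (a dict) for one question; unseen tokens are skipped."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(q1, q2, idf):
    """Similarity features for one question pair (assumed feature set, not the paper's)."""
    t1, t2 = tokenize(q1), tokenize(q2)
    s1, s2 = set(t1), set(t2)
    union = s1 | s2
    return {
        "tfidf_cosine": cosine_similarity(tfidf_vector(q1, idf), tfidf_vector(q2, idf)),
        "token_jaccard": len(s1 & s2) / len(union) if union else 0.0,
        # difflib stand-in for a character-level fuzzy ratio
        "char_ratio": SequenceMatcher(None, q1.lower(), q2.lower()).ratio(),
        # same ratio after sorting tokens, so word order is ignored
        "token_sort_ratio": SequenceMatcher(
            None, " ".join(sorted(t1)), " ".join(sorted(t2))
        ).ratio(),
    }


# Hypothetical sample questions
corpus = [
    "How can I learn Python quickly?",
    "How do I learn Python fast?",
    "What is the capital of France?",
]
idf = build_idf(corpus)
similar = pair_features(corpus[0], corpus[1], idf)
different = pair_features(corpus[0], corpus[2], idf)
print(similar)
print(different)
```

Feature dictionaries of this kind could then be fed to any of the classifiers the abstract lists. The choice of features here is an assumption and may differ from the set used in the paper.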

Copyright

Copyright © 2024 Ms. Vaishali Bajpai. This is an open access article distributed under the Creative Commons Attribution License.

Paper Details
Paper ID: IJPREMS41200043744
ISSN: 2321-9653
Publisher: IJPREMS
About IJPREMS

The International Journal of Progressive Research in Engineering, Management and Science is a peer-reviewed, open access journal that publishes original research articles in engineering, management, and applied sciences.
