Intelligent Academic Integrity System Using TF-IDF, Cosine Similarity, and Stylometric Profiling

Publisher: IEEE

Authors:

  • R Mallikka, SRM Institute of Science and Technology
  • *V Vettrivel, SRM Institute of Science and Technology Tiruchirapalli, SRM Nagar; Chennai - Trichy Hwy; Dist; Irungalur; Tamil Nadu
  • N Diwakar, SRM Institute of Science and Technology Tiruchirapalli
  • MS Pranav, SRM Institute of Science and Technology Tiruchirapalli

Abstract:

Academic dishonesty threatens educational integrity at institutions worldwide. This paper presents a Flask-based system that uses machine learning and natural language processing techniques to detect plagiarism automatically. The system combines TF-IDF vectorization with cosine similarity to identify copied content, and it adds stylometric profiling to detect impersonation and authorship inconsistencies. We integrate the OCR.space API to process handwritten submissions, so the system can analyze both typed digital documents and scanned handwritten scripts. An SQLite database manages user authentication, stores detection results, and maintains case history. The web interface lets administrators and faculty view analytics, track investigation cases, and generate detailed reports. Our experimental evaluation shows that the system improves detection accuracy while reducing the manual workload that teachers traditionally bear when checking papers. The system handles batch processing of multiple submissions and generates evidence-based reports that point to the specific flagged sections. This makes it useful for educational institutions that want to maintain rigorous academic integrity standards without burdening their teaching staff with time-consuming manual verification.
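
The similarity step named in the abstract, TF-IDF vectorization followed by cosine similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `flag_similar_pairs`), the smoothed IDF formula, and the 0.8 threshold are all assumptions chosen for illustration, and a production system would more likely rely on an established library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight dict) per document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, with raw term counts.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({term: count * idf[term] for term, count in counts.items()})
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_similar_pairs(documents, threshold=0.8):
    """Return (i, j, score) for every document pair scoring at or above threshold.

    The threshold value is hypothetical; the abstract does not state one.
    """
    vectors = tfidf_vectors(documents)
    flagged = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            score = cosine_similarity(vectors[i], vectors[j])
            if score >= threshold:
                flagged.append((i, j, score))
    return flagged


submissions = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "The French Revolution began with the storming of a Paris fortress.",
]
flagged = flag_similar_pairs(submissions)
```

Here the first two submissions are identical, so they are flagged as a pair, while the third shares too little vocabulary with either to reach the threshold. Comparing every pair is quadratic in the number of submissions, which is acceptable for a single class batch but would need indexing for larger collections.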

Keywords: Academic Dishonesty Detection, Plagiarism Detection, Stylometric Analysis, Natural Language Processing (NLP), Machine Learning, Text Similarity, TF-IDF Vectorization, Cosine Similarity, Cheating Detection System, Automated Evaluation, Educational Technolo

Published in: 2024 Asian Conference on Communication and Networks (ASIANComNet)

Date of Publication: --

DOI: -
