European Journal of Computer Science and Information Technology (EJCSIT)

EA Journals

Robust detection of LLM-generated text through transfer learning with pre-trained Distilled BERT model

Abstract

Detecting text generated by large language models (LLMs) is a growing challenge, as these models produce outputs that are nearly indistinguishable from human writing. This study explores multiple detection approaches, including a Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM) networks, a Transformer block, and a fine-tuned distilled BERT model. Leveraging BERT's contextual understanding, we train the model on diverse datasets containing authentic and synthetic texts, focusing on features such as sentence structure, token distribution, and semantic coherence. The fine-tuned distilled BERT outperforms the baseline models, achieving high accuracy and robustness across domains, with superior AUC scores and efficient computation times. By incorporating domain-specific training and adversarial techniques, the model adapts to sophisticated LLM outputs, improving detection precision. These findings underscore the efficacy of pre-trained transformer models for ensuring authenticity in digital communication, with potential applications in mitigating misinformation, safeguarding academic integrity, and promoting ethical AI use.
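To make the fine-tuning approach concrete, the following is a minimal sketch of training a distilled BERT (DistilBERT) binary classifier that separates human-written from LLM-generated text, in the spirit of the abstract. It assumes the Hugging Face transformers and datasets libraries plus scikit-learn; the file names (train.csv, test.csv), hyperparameters, and label convention are illustrative assumptions, not values taken from the paper.

# Sketch: fine-tune DistilBERT to classify text as human-written (0)
# or LLM-generated (1). File names and hyperparameters are illustrative.
import numpy as np
from datasets import load_dataset
from sklearn.metrics import roc_auc_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "distilbert-base-uncased"

# Hypothetical CSVs with a "text" column and a "label" column.
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Truncate/pad so every example has the same fixed input length.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

# Pre-trained DistilBERT with a fresh two-class classification head.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def compute_metrics(eval_pred):
    # Report AUC, the headline metric in the abstract.
    logits, labels = eval_pred
    shifted = logits - logits.max(axis=1, keepdims=True)  # stable softmax
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    return {"auc": roc_auc_score(labels, probs[:, 1])}

args = TrainingArguments(
    output_dir="llm-detector",
    num_train_epochs=3,              # illustrative; the paper's value may differ
    per_device_train_batch_size=16,
    learning_rate=2e-5,              # a typical fine-tuning rate for BERT variants
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate())  # includes "eval_auc" on the held-out split

The same labeled splits would feed the MLP, LSTM, and Transformer-block baselines described above, giving a like-for-like comparison of detection accuracy and AUC.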

Keywords: classifier, GenAI, detection, fine-tuning, large language models, machine learning, natural language processing, pre-training


This work by European American Journals is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

 


 
