Detecting text generated by large language models (LLMs) is a growing challenge as these models produce outputs nearly indistinguishable from human writing. This study explores multiple detection approaches, including a Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM) networks, a Transformer block, and a fine-tuned distilled BERT (DistilBERT) model. Leveraging DistilBERT’s contextual understanding, we train the model on diverse datasets of authentic and synthetic text, focusing on features such as sentence structure, token distribution, and semantic coherence. The fine-tuned DistilBERT outperforms the baseline models, achieving high accuracy and robustness across domains together with superior AUC scores and efficient computation times. By incorporating domain-specific training and adversarial techniques, the model adapts to increasingly sophisticated LLM outputs, improving detection precision. These findings underscore the efficacy of pretrained transformer models for ensuring authenticity in digital communication, with potential applications in mitigating misinformation, safeguarding academic integrity, and promoting ethical AI usage.
Keywords: classifier, GenAI, detection, fine-tuning, large language models, machine learning, natural language processing, pretraining
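To make the fine-tuning approach summarized above concrete, the following is a minimal sketch of the kind of pipeline the abstract describes: DistilBERT fine-tuned as a binary classifier separating human-written from LLM-generated text, implemented with the Hugging Face Transformers library. The dataset layout, label convention (0 = human, 1 = LLM-generated), hyperparameters, and toy inputs are illustrative assumptions, not the study's actual configuration.

```python
# Minimal sketch (assumed setup, not the paper's exact configuration):
# fine-tuning DistilBERT as a binary human-vs-LLM text classifier.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import (
    DistilBertForSequenceClassification,
    DistilBertTokenizerFast,
)


class TextDataset(Dataset):
    """Wraps raw strings and 0/1 labels (0 = human, 1 = LLM-generated)."""

    def __init__(self, texts, labels, tokenizer, max_length=256):
        self.enc = tokenizer(
            texts, truncation=True, padding="max_length",
            max_length=max_length, return_tensors="pt",
        )
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return {
            "input_ids": self.enc["input_ids"][idx],
            "attention_mask": self.enc["attention_mask"][idx],
            "labels": self.labels[idx],
        }


def fine_tune(texts, labels, epochs=3, lr=2e-5, batch_size=16):
    """Fine-tune DistilBERT for sequence classification with two labels."""
    tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
    model = DistilBertForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).train()

    loader = DataLoader(
        TextDataset(texts, labels, tokenizer),
        batch_size=batch_size, shuffle=True,
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    for _ in range(epochs):
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            optimizer.zero_grad()
            out = model(**batch)  # cross-entropy loss is computed internally
            out.loss.backward()
            optimizer.step()
    return tokenizer, model


if __name__ == "__main__":
    # Toy placeholder data; the study trains on much larger human/LLM corpora.
    texts = ["Sample sentence for illustration only."] * 8
    labels = [0, 1] * 4
    fine_tune(texts, labels, epochs=1)
```

In this sketch the pretrained encoder provides the contextual representations, and only a lightweight classification head plus the encoder weights are updated during fine-tuning, which is what keeps training and inference computationally inexpensive relative to full-size BERT.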