
AJSM OPEN ACCESS
Academic Journal of Sociology and Management
ISSN:3005-5040 (print) | ISSN:3005-5059 (online) | Publication Frequency: Bimonthly
A Differential Privacy-Based Mechanism for Preventing Data Leakage in Large Language Model Training
*Corresponding Author: Xingpeng Xiao, E-mail: charlsiexno9@gmail.com
Publication
Accepted: 11 March 2025; Published: 18 March 2025
Academic Journal of Sociology and Management, 2025, 3(2), 33-42.
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing tasks, yet they face significant challenges in protecting sensitive information during training. This paper presents a novel differential privacy-based mechanism for preventing data leakage in LLM training processes. The proposed system introduces a dynamic privacy budget allocation strategy integrated with adaptive noise injection mechanisms, specifically designed for transformer architectures. The mechanism implements a multi-layered protection framework that combines real-time monitoring capabilities with automated response systems. Through comprehensive experimental evaluation on models ranging from 100M to 175B parameters, our approach demonstrates superior performance in privacy protection while maintaining model utility. The system achieves a 99.2% detection rate for potential data leakages with a minimal false alarm rate of 0.8%, representing a significant improvement over traditional approaches. Performance analysis reveals that the proposed mechanism maintains model accuracy within 1.8% of non-private baselines while providing strong privacy guarantees. The implementation reduces computational overhead by 35% compared to conventional differential privacy methods. Our research establishes new benchmarks in privacy-preserving machine learning, particularly for large-scale language models, and provides a practical framework for secure AI system deployment.
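The abstract names two ingredients of the mechanism: adaptive noise injection and dynamic privacy budget allocation. The paper's actual algorithm is not reproduced on this page, so the following is only a minimal, illustrative sketch of those two ideas in the standard DP-SGD style (per-example gradient clipping followed by Gaussian noise, plus a toy step-weighted budget schedule); all function names and the schedule itself are hypothetical, not the authors' method.

```python
import math
import random

def clip_gradient(grad, clip_norm):
    """Clip a per-example gradient (a list of floats) to a maximum L2 norm,
    bounding each example's influence on the update."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return [g * scale for g in grad]

def dp_noisy_mean(per_example_grads, clip_norm, noise_multiplier, rng):
    """Sum clipped gradients, add Gaussian noise calibrated to the clip norm
    (the Gaussian mechanism), then average over the batch."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip_gradient(grad, clip_norm)):
            total[i] += g
    sigma = noise_multiplier * clip_norm
    return [(t + rng.gauss(0.0, sigma)) / len(per_example_grads) for t in total]

def budget_schedule(step, total_steps, eps_total):
    """Toy dynamic allocation: weight later steps more heavily, on the intuition
    that late-stage gradients are smaller and fixed noise hurts utility more.
    The per-step budgets sum to the overall budget eps_total."""
    weights = list(range(1, total_steps + 1))
    return eps_total * weights[step] / sum(weights)
```

A real implementation would additionally track cumulative privacy loss with a composition accountant and convert each step's epsilon share into a noise multiplier; this sketch only shows the mechanics of clipping, noising, and non-uniform budget spending.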
Keywords
Large Language Model, Differential Privacy, Data Leakage Prevention, Privacy-Preserving Machine Learning
Metadata
Pages: 33-42
References: 19
Disciplines: Management
Subjects: Human Resource Management
Cite This Article
APA Style
Xiao, X., Zhang, Y., Chen, H., Ren, W., Zhang, J. & Xu, J. (2025). A differential privacy-based mechanism for preventing data leakage in large language model training. Academic Journal of Sociology and Management, 3(2), 33-42. https://doi.org/10.70393/616a736d.323732
Acknowledgments
The authors thank the editor and anonymous reviewers for their helpful comments and valuable suggestions.
FUNDING
Not applicable.
INSTITUTIONAL REVIEW BOARD STATEMENT
Not applicable.
DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
INFORMED CONSENT STATEMENT
Not applicable.
CONFLICT OF INTEREST
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
AUTHOR CONTRIBUTIONS
Not applicable.
PUBLISHER'S NOTE
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Copyright © 2025 The Author(s). Published by Southern United Academy of Sciences. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.