Adversarial Robustness in Generative AI: Defending Against Malicious Model Inversions and Deepfake Attacks
Generative AI models are rapidly advancing creative content creation but remain vulnerable to adversarial attacks such as model inversion and deepfake generation. In this work, we investigate robust defence strategies, using the Deepfake Detection Challenge (DFDC) dataset to simulate a range of attack scenarios. We combine anomaly detection with adversarial training to harden the security of generative models. Experimental results show that this composite defence significantly reduces the success rate of malicious attacks while preserving the generative capability of the models. Our findings underscore the importance of embedding strong security properties into generative AI models to protect digital content and encourage responsible use in a rapidly evolving adversarial digital environment.
Keywords: AI ethics, DFDC dataset, adversarial robustness, adversarial training, anomaly detection, deepfake attacks, digital security, generative AI, model inversion, resilient models
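To make the composite defence concrete, the sketch below pairs PGD-style adversarial training for a deepfake detector with a simple confidence-based anomaly score in PyTorch. This is an illustrative reading of the abstract, not the paper's actual pipeline: the Detector architecture, the PGD hyperparameters (eps, alpha, steps), and the uncertainty-based anomaly_score are all assumptions introduced here for the sake of a self-contained example.

```python
# Minimal sketch of a composite defence: PGD adversarial training plus a
# confidence-based anomaly score. Model, hyperparameters, and helper names
# are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Detector(nn.Module):
    """Toy binary real/fake classifier standing in for a DFDC-scale model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2),
        )

    def forward(self, x):
        return self.net(x)

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft an L-inf bounded adversarial example via projected gradient descent."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adv_training_step(model, opt, x, y):
    """One optimisation step on a 50/50 mix of clean and adversarial batches."""
    x_adv = pgd_attack(model, x, y)
    opt.zero_grad()
    loss = (0.5 * F.cross_entropy(model(x), y)
            + 0.5 * F.cross_entropy(model(x_adv), y))
    loss.backward()
    opt.step()
    return loss.item()

def anomaly_score(model, x):
    """Score inputs by prediction uncertainty; high scores flag suspect samples."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
    return 1.0 - probs.max(dim=1).values

if __name__ == "__main__":
    model = Detector()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.rand(8, 3, 64, 64)      # stand-in for preprocessed video frames
    y = torch.randint(0, 2, (8,))     # 0 = real, 1 = fake
    print("training loss:", adv_training_step(model, opt, x, y))
    print("anomaly scores:", anomaly_score(model, x))
```

In a full pipeline the anomaly component would more likely be a dedicated detector trained on reconstruction errors or feature-space statistics; maximum-softmax confidence is used here only to keep the sketch short and runnable.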