Honggu Liu, Paolo Bestagini, Lin Huang, Wenbo Zhou, Stefano Tubaro, Weiming Zhang, Nenghai Yu
Accepted by ICIP 2023
With rapid advances in media generation technologies, creating DeepFake videos is within everyone's reach. As the widespread diffusion of DeepFakes can lead to severe consequences (e.g., defamation and the spread of fake news), detecting DeepFakes is becoming a crucial task within the forensic community. However, most existing DeepFake detectors suffer from two issues: i) they are hardly explainable, as they build upon black-box data-driven techniques rather than interpretable features; ii) they are often tailored to low-level texture features and therefore fail to generalize to low-quality DeepFake videos. In this work, we propose a video DeepFake detector that aims to address these issues. The proposed detector relies on the fact that most DeepFake generators work on a frame-by-frame basis, thus breaking the temporal consistency of facial features across frames. In particular, we noticed that facial identity features tend to be less stable over time in DeepFake videos than in original ones. We therefore propose a framework trained on time series of facial identity features. The use of high-level semantic features makes the detector interpretable and robust against low-quality DeepFake videos. Extensive experiments show that our method achieves outstanding performance on low-quality DeepFake videos and obtains promising results when evaluated on unseen datasets.
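The temporal-stability observation can be illustrated with a minimal sketch. It is not the proposed detector, which is a framework trained on time series of identity features; it only computes a hand-crafted instability score. The function name `identity_instability`, the choice of cosine distance between consecutive frames, and the assumption that per-frame identity embeddings are already available as a `(T, D)` array (for example, from some face recognition model) are all assumptions made for illustration.

```python
import numpy as np


def identity_instability(embeddings):
    """Mean cosine distance between identity embeddings of consecutive frames.

    `embeddings` is assumed to be a (T, D) array with one identity feature
    vector per video frame. Higher values mean the identity drifts more
    from frame to frame. This is an illustrative score, not the paper's model.
    """
    e = np.asarray(embeddings, dtype=float)
    if e.ndim != 2 or e.shape[0] < 2:
        raise ValueError("expected an array of shape (T, D) with T >= 2")
    # L2-normalize each frame's embedding (guarding against zero vectors).
    norms = np.linalg.norm(e, axis=1, keepdims=True)
    e = e / np.clip(norms, 1e-12, None)
    # Cosine similarity between frame t and frame t + 1.
    cos_sim = np.sum(e[:-1] * e[1:], axis=1)
    return float(np.mean(1.0 - cos_sim))


# Synthetic example: one identity vector plus small or large per-frame noise.
rng = np.random.default_rng(0)
base = rng.normal(size=128)
stable = base + 0.01 * rng.normal(size=(30, 128))   # temporally consistent
jittery = base + 0.50 * rng.normal(size=(30, 128))  # frame-to-frame drift
print(identity_instability(stable) < identity_instability(jittery))
```

On this synthetic data the noisier sequence receives the higher score. A learned model over the full time series, as the abstract describes, can exploit richer temporal patterns than a single averaged distance.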
- This work is supported by the China Scholarship Council.