open access publication

Article, 2024

Deep video inpainting detection and localization based on ConvNeXt dual-stream network

Expert Systems with Applications, ISSN 0957-4174 (print), 1873-6793 (online), Volume 247, Article 123331, DOI 10.1016/j.eswa.2024.123331

Contributors

  • Yao, Yudong (ORCID 0000-0002-7012-9307) [1]
  • Han, Tingfeng [1]
  • Gao, Xudong [1]
  • Ren, Yizhi [1]
  • Meng, Wei-Zhi (ORCID 0000-0003-4384-5786, Corresponding author) [2]

Affiliations

  1. [1] Hangzhou Dianzi University [NORA names: China; Asia, East]
  2. [2] Technical University of Denmark [NORA names: DTU Technical University of Denmark; University; Denmark; Europe, EU; Nordic; OECD]

Abstract

Currently, deep learning-based video inpainting algorithms can fill a specified video region with visually plausible content, usually leaving only imperceptible traces. Since deep video inpainting can be used to maliciously manipulate video content, there is an urgent need for an effective method to detect and localize it. In this paper, we propose a dual-stream video inpainting detection network consisting of a ConvNeXt dual-stream encoder and a multi-scale feature cross-fusion decoder. To further expose the spatial and temporal traces left by deep inpainting, we extract motion residuals and enhance them with 3D convolution and SRM filtering. In addition, we extract filtered residuals using LoG and Laplacian filtering. These residuals are then fed into ConvNeXt to learn discriminative inpainting features. To improve detection accuracy, we design a top-down pyramid decoder that deeply fuses multi-dimensional, multi-scale features and fully exploits the information at different dimensions and levels. We constructed two datasets covering state-of-the-art video inpainting algorithms and conducted extensive experiments to evaluate our approach. The experimental results demonstrate that our approach outperforms existing methods and attains competitive performance even on unseen inpainting algorithms.
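As a rough illustration of the residual pre-processing the abstract describes, the sketch below (not the authors' code) shows how the two stream inputs could be derived: motion residuals enhanced by a fixed SRM high-pass kernel, and filtered residuals from LoG and Laplacian kernels. The PyTorch framing, tensor shapes, and specific kernel values are assumptions; the 3D-convolution enhancement and the ConvNeXt encoders themselves are only indicated in comments.

```python
# Minimal sketch (assumed, not the authors' implementation) of the dual-stream
# residual inputs suggested by the abstract, using PyTorch.
import torch
import torch.nn.functional as F

# Standard 5x5 SRM "KV" high-pass kernel from the steganalysis literature.
SRM_KV = torch.tensor([[-1,  2,  -2,  2, -1],
                       [ 2, -6,   8, -6,  2],
                       [-2,  8, -12,  8, -2],
                       [ 2, -6,   8, -6,  2],
                       [-1,  2,  -2,  2, -1]], dtype=torch.float32) / 12.0

# Common 5x5 Laplacian-of-Gaussian approximation and 3x3 Laplacian kernels.
LOG_5x5 = torch.tensor([[0, 0,   1, 0, 0],
                        [0, 1,   2, 1, 0],
                        [1, 2, -16, 2, 1],
                        [0, 1,   2, 1, 0],
                        [0, 0,   1, 0, 0]], dtype=torch.float32)
LAPLACIAN_3x3 = torch.tensor([[0,  1, 0],
                              [1, -4, 1],
                              [0,  1, 0]], dtype=torch.float32)


def high_pass(frames: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Apply a fixed high-pass kernel to each channel. frames: (N, C, H, W)."""
    c = frames.shape[1]
    k = kernel.view(1, 1, *kernel.shape).repeat(c, 1, 1, 1)
    return F.conv2d(frames, k, padding=kernel.shape[-1] // 2, groups=c)


def dual_stream_inputs(clip: torch.Tensor):
    """clip: (T, C, H, W) video clip in [0, 1].

    Returns the two assumed stream inputs:
      1) motion residuals (frame differences) enhanced by SRM filtering
         (the paper additionally uses 3D convolution for enhancement),
      2) filtered residuals from LoG and Laplacian kernels.
    """
    motion = clip[1:] - clip[:-1]                  # temporal (motion) residuals
    motion = high_pass(motion, SRM_KV)             # SRM enhancement
    filtered = high_pass(clip, LOG_5x5) + high_pass(clip, LAPLACIAN_3x3)
    return motion, filtered


if __name__ == "__main__":
    clip = torch.rand(8, 3, 224, 224)              # toy 8-frame RGB clip
    motion, filtered = dual_stream_inputs(clip)
    print(motion.shape, filtered.shape)            # (7, 3, 224, 224), (8, 3, 224, 224)
    # Each stream would then feed its own ConvNeXt encoder (e.g.
    # torchvision.models.convnext_tiny) followed by the multi-scale
    # cross-fusion pyramid decoder described in the abstract.
```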

Keywords

ConvNeXt, Laplacian filter, SRM, SRM filters, accuracy, algorithm, competitive performance, content, convolution, dataset, decoding, deep fusion, detection, detection accuracy, detection network, dimensions, dual-stream network, effective method, encoding, experimental results, experiments, features, filter, fusion, information, inpainting, inpainting algorithm, inpainting method, levels, localization, LoG, method, motion residuals, multi-scale features, network, performance, region, residuals, results, temporal traces, trace, video, video content, video inpainting, video inpainting algorithm, video inpainting method, video regions, visualization, visually plausible contents

Funders

  • National Natural Science Foundation of China
  • Ministry of Education of the People's Republic of China

Data Provider: Digital Science