Authors: Yang, F (Yang, Feng); Xia, GS (Xia, Gui-Song); Liu, G (Liu, Gang); Zhang, LP (Zhang, Liangpei); Huang, X (Huang, Xin)
|
Abstract: A dynamic texture (DT) refers to a sequence of images that exhibits spatial and temporal regularities. The modeling of DTs plays an important role in many video-related vision tasks, where the main difficulty lies in how to depict the spatial and temporal aspects of DTs simultaneously. In contrast to DT modeling, tremendous progress has recently been reported on static texture modeling.
This paper addresses the problem of dynamic texture recognition by aggregating spatial and temporal texture features via an ensemble SVM scheme, thereby bypassing the difficulties of a simultaneous spatiotemporal description of DTs. More precisely, first, by treating a 3-dimensional DT video as a stack of 2-dimensional static textures, we exploit the spatial texture features of single frames to combine different aspects of spatial structure, and we randomly select several frames of the DT video in a time augmentation process. Second, in order to incorporate temporal information, the naive linear dynamic system (LDS) model is used to extract the dynamics of DTs in the temporal domain. Finally, we aggregate these spatial and temporal cues via an ensemble SVM architecture. We have experimented not only on several common dynamic texture datasets, but also on two challenging dynamic scene datasets. The results show that the proposed scheme achieves state-of-the-art performance on the recognition of dynamic textures and dynamic scenes. Moreover, our approach offers a simple and general way to aggregate any spatial and temporal features for the task of dynamic texture recognition. (C) 2015 Elsevier B.V. All rights reserved.
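The temporal step mentioned in the abstract can be illustrated with a minimal sketch. The abstract only names a "naive LDS" model and does not say how its parameters are estimated; the code below assumes the common closed-form SVD-based estimate, which may differ from the authors' procedure. The function name `fit_lds` and the parameter `n_states` are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_lds(frames, n_states=5):
    """Fit a linear dynamic system x[t+1] = A x[t], y[t] = C x[t] to a video.

    frames: array of shape (T, H, W), one grayscale frame per time step.
    Returns (A, C, X): state transition matrix, observation matrix,
    and the estimated state sequence.
    Assumption: closed-form SVD-based estimation, not confirmed by the abstract.
    """
    T = frames.shape[0]
    # Each column of Y is one vectorized frame: shape (pixels, T).
    Y = frames.reshape(T, -1).T.astype(float)
    # Remove the temporal mean of every pixel.
    Y = Y - Y.mean(axis=1, keepdims=True)
    # Truncated SVD gives an orthonormal observation basis C and states X.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n_states]                            # (pixels, n_states)
    X = np.diag(s[:n_states]) @ Vt[:n_states, :]   # (n_states, T)
    # Least-squares estimate of A from consecutive state pairs.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])       # (n_states, n_states)
    return A, C, X
```

In such a sketch, the pair (A, C) would serve as the temporal descriptor of a video, to be compared or classified alongside the per-frame spatial features; how the two are combined inside the ensemble SVM is not specified in the abstract.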
|