Over the past few years, significant progress has been made in image recognition based on deep convolutional neural networks (CNNs). This is mainly due to the strong ability of such networks to mine discriminative object pose and part information from texture and shape. Such an approach is often inappropriate for fine-grained visual classification (FGVC), which exhibits high intra-class and low inter-class variance due to occlusion, deformation, illumination, etc. Thus, an expressive feature representation describing global structural information is key to characterizing an object/scene. To this end, we propose a method that effectively captures subtle changes by aggregating context-aware features from the most relevant image regions and their impo...
In contrast to basic-level object recognition, fine-grained categorization aims to distinguish betwee...
Predicting salient regions in natural images requires the detection of objects that are present in a...
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images, which has at...
Recent scene graph generation (SGG) frameworks have focused on learning complex relationships among ...
Deep convolutional neural networks (CNNs) have shown a strong ability in mining discriminative objec...
For various computer vision tasks, finding suitable feature representations is fundamental. Fine-gra...
Fine-grained image retrieval has gradually become a hot topic in computer vision, which aims to ret...
Fine-grained recognition is one of the most difficult topics in visual recognition, which aims at di...
University of Technology Sydney. Faculty of Engineering and Information Technology. Fine-Grained Visu...
Recent experiments in computer vision demonstrate texture bias as the primary reason for supreme res...
This paper shows how a standard convolutional neural network (CNN) without recurrent connections is ...
In this paper, we propose a general framework for image classification using the attention mechanism...
Deep neural networks have reached human-level performance on many computer vision tasks. However, th...
Fine-Grained classification models can expressly focus on the relevant details useful to distinguish...
In this paper, we introduce a graph representation learning architecture for spatial image steganaly...