Semantic segmentation is a popular visual recognition task whose goal is to estimate pixel-level object class labels in images. This problem has recently been addressed by deep convolutional neural networks (DCNNs), and state-of-the-art techniques achieve impressive results on public benchmark data sets. However, training DCNNs demands a large amount of annotated data, while the segmentation annotations in existing data sets are significantly limited in both quantity and diversity due to the heavy annotation cost. Weakly supervised approaches tackle this issue by leveraging weak annotations such as image-level labels and bounding boxes, which are either readily available in existing large-scale data sets for image classification or much cheaper to obtain than pixel-wise segmentation masks.