Thanks to their event-driven nature, spiking neural networks (SNNs) are surmised to be great computation-efficient models. The spiking neurons encode beneficial temporal facts and possess excessive anti-noise properties. However, the high-quality encoding of spatio-temporal complexity and also its training optimization of SNNs are restricted by means of the contemporary problem, this article proposes a novel hierarchical event-driven visual device to explore how information transmits and signifies in the retina the usage of biologically manageable mechanisms. This cognitive model is an augmented spiking-based framework consisting of the function learning capacity of convolutional neural networks (CNNs) with the cognition capability of SNNs....