Learning machines for pattern recognition, such as neural networks or support vector machines, are usually conceived to process real–valued vectors with predefined dimensionality even if, in many real–world applications, relevant information is inherently organized into entities and relationships between them. Instead, Graph Neural Networks (GNNs) can directly process structured data, guaranteeing universal approximation of many practically useful functions on graphs. GNNs, that do not strictly meet the definition of deep architectures, are based on the unfolding mechanism during learning, that, in practice, yields networks that have the same depth of the data structures they process. However, GNNs may be hindered by the long–term dependenc...