Accurately estimating congestion for proper global adaptive routing decisions (i.e., determining whether a packet should be routed minimally or non-minimally) has a significant impact on overall performance for high-radix topologies, such as the Dragonfly topology. Prior work has focused on understanding near-end congestion, i.e., congestion that occurs at the current router, or downstream congestion, i.e., congestion that occurs in downstream routers. However, most prior work does not evaluate the impact of far-end congestion, or the congestion caused by the high channel latency between the routers. In this work, we refer to far-end congestion as phantom congestion because the congestion is not "real" congestion. Because of the long inter-router late...
Recent increases in the pin bandwidth of integrated circuits have motivated an increase in the degree...
Large-scale compute clusters are highly affected by performance variability that originates from dif...
Traffic engineering models based on end-to-end loss probabilities and delays do not scale well to fa...
Recently proposed high-radix interconnection networks [10] require global adaptive routing to achiev...
Global adaptive routing exploits non-minimal paths to improve performance on adversarial traffic pat...
Evolving technology and increasing pin-bandwidth motivate the use of high-radix routers to reduce t...
Dragonfly networks arrange network routers in a two-level hierarchy, providing a competitive cost-pe...
Dragonfly networks are appealing topologies for large-scale Datacenter and HPC networks, th...
System noise can negatively impact the performance of HPC systems, and the interconnection network i...
Adaptive routing is an efficient congestion avoidance mechanism for modern Datacenter and HPC networ...
With the growing popularity of big-data applications, Data Center Networks increasingly carry larger...
Dragonfly networks are appealing topologies for large-scale Data center and HPC networks, that provi...
Adaptive deadlock-free routing mechanisms are required to handle variable traffic patterns in dragon...
The Cray Cascade architecture uses Dragonfly as its interconnect topology and employs a globally ada...
Dragonfly networks are composed of interconnected groups of routers. Adaptive routing allows packets...