We compare and evaluate different methods to infer groups of correlated failures. These methods group failure events that occur nearly simultaneously into clusters: if several failures occur at nearly the same moment in a network, they may share the same root cause. The input to our algorithms consists of IP failure notifications, which can be provided by several sources. We consider two: IS-IS Link State Packets (LSPs) and Syslog messages. Our first results on the Abilene and GÉANT networks show that the inference methods behave differently and that IS-IS LSPs yield more accurate results than Syslog messages.
DGTRE TOTE
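The abstract does not say which clustering methods are compared. As an illustration of the general idea only, the sketch below groups failure events whose timestamps lie close together. The `FailureEvent` type, the `cluster_by_gap` function, the `max_gap` threshold, and the sample events are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FailureEvent:
    """One failure notification (hypothetical structure, not from the paper)."""
    timestamp: float  # seconds since an arbitrary epoch
    link: str         # identifier of the failed link
    source: str       # e.g. "isis" or "syslog"


def cluster_by_gap(events: Iterable[FailureEvent],
                   max_gap: float = 1.0) -> List[List[FailureEvent]]:
    """Group events so that consecutive events in a cluster are at most
    `max_gap` seconds apart. The threshold is an assumed tuning parameter."""
    clusters: List[List[FailureEvent]] = []
    for event in sorted(events, key=lambda e: e.timestamp):
        # Extend the current cluster if this event is close to the previous one.
        if clusters and event.timestamp - clusters[-1][-1].timestamp <= max_gap:
            clusters[-1].append(event)
        else:
            clusters.append([event])
    return clusters


# Made-up sample data: two near-simultaneous failures and one isolated failure.
sample_events = [
    FailureEvent(30.0, "D-E", "isis"),
    FailureEvent(0.0, "A-B", "isis"),
    FailureEvent(0.4, "A-C", "isis"),
]
sample_clusters = cluster_by_gap(sample_events, max_gap=1.0)
```

With these sample events, the failures on `A-B` and `A-C` fall into one cluster and the failure on `D-E` forms its own. A gap-based rule like this is only one possible choice. A fixed-window rule, or one that also uses topology, would group events differently.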
We demonstrate machine learning augmented Open Shortest Path First (OSPF) routing which infers Share...
Distributed systems such as grids, peer-to-peer systems, and even Internet DNS servers have grown si...
Dependencies between failures in operational networks may have a huge impact on their reliability an...
In order to evaluate the expected availability of a service, a network administrator should consider...
Techniques are described for a graph based approach to co-relate events in a network. The root cause...
To evaluate the expected availability of a backbone network service, the administrator should consid...
We introduce the ideas of monitoring paths (MPs) and monitoring cycles (MCs) for distinctive localizat...
Several works shed light on the vulnerability of networks against regional failures, which are failu...
It may be difficult to identify root causes of protocol failures or degradations in application traf...
© 2014 IEEE. As the sizes of supercomputers and data centers grow towards exascale, failures become ...
In this paper we study the correlation of node failures in time and space. Our study is based on mea...
Traditional methods for ensuring reliable transmissions in circuit-switched networks rely on the pr...
Of the major factors affecting end-to-end service availability, network component failure is perhaps...
In this paper we show how information contained in robust network codes can be used for passive infe...