The complexity of current Internet applications makes understanding network traffic a challenging task. By providing larger-scale aggregates for analysis, unsupervised clustering approaches can greatly aid in identifying new applications, attacks, and other changes in network usage patterns. In this paper, we introduce ADHIC, a new algorithm that clusters similar network traffic together without prior knowledge of protocol structures. Packet similarity is determined by comparing substrings within packets at distinguishing offsets. ADHIC is notable in that it 1) produces a hierarchical decomposition of network traffic in the form of a cluster-identifying decision tree, 2) needs only a small fraction of packets to generate ...