A shared-memory counter is a well-studied and widely-used concurrent object. It supports two operations: An Inc operation that increases its value by 1 and a Read operation that returns its current value. Jayanti, Tan and Toueg [Jayanti et al., 2000] proved a linear lower bound on the worst-case step complexity of obstruction-free implementations, from read and write operations, of a large class of shared objects that includes counters. The lower bound leaves open the question of finding counter implementations with sub-linear amortized step complexity. In this paper, we address this gap. We present the first wait-free n-process counter, implemented using only read and write operations, whose amortized operation step complexity is O(log^2 n...