The $hp$-adaptive finite element method (FEM) - where one independently chooses the mesh size ($h$) and polynomial degree ($p$) to be used on each cell - has long been known to have better theoretical convergence properties than either $h$- or $p$-adaptive methods alone. However, it is not widely used, owing at least in parts to the difficulty of the underlying algorithms and the lack of widely usable implementations. This is particularly true when used with continuous finite elements. Herein, we discuss algorithms that are necessary for a comprehensive and generic implementation of $hp$-adaptive finite element methods on distributed-memory, parallel machines. In particular, we will present a multi-stage algorithm for the unique enumerati...