This paper presents a novel method for counting people in video surveillance applications. Methods in the literature follow either a direct approach, first detecting people and then counting them, or an indirect approach, establishing a relation between some easily detectable scene features and the estimated number of people. The indirect approach is considerably more robust, but it does not easily account for factors such as perspective or groups of people with different densities. The proposed technique, while based on the indirect approach, specifically addresses these problems; furthermore, it is based on a trainable estimator that does not require an explicit formulation of a priori knowledge about the perspective and density effec...