This paper addresses the problem of tracking a moving person with a single omnidirectional camera. An appearance-based tracking system is described that uses a self-acquired appearance model and a Kalman filter to estimate the position of the person. Features corresponding to ``depth cues'' are first extracted from the panoramic images, and an artificial neural network is then trained to estimate the distance of the person from the camera. These distance estimates are combined by a discrete Kalman filter to track the person's position over time. The ground-truth information required for training the neural network and for the experimental analysis was obtained from another vision system, which uses multiple webcams and triangulation to calculate the...
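The discrete Kalman filter used to fuse the per-frame distance estimates can be sketched as follows. This is a minimal illustration, not the paper's implementation: the constant-velocity motion model, time step, and noise covariances are assumed values chosen for the example.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a discrete Kalman filter."""
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement z.
    y = z - H @ x                      # innovation
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Assumed constant-velocity model in the ground plane: state = [px, py, vx, vy].
dt = 0.1
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # only position is measured
Q = 0.01 * np.eye(4)                        # process noise (assumed)
R = 0.25 * np.eye(2)                        # measurement noise (assumed)

x = np.zeros(4)     # initial state
P = np.eye(4)       # initial covariance
for z in [np.array([0.1, 0.0]), np.array([0.2, 0.05]), np.array([0.3, 0.1])]:
    x, P = kalman_step(x, P, z, F, H, Q, R)
print(np.round(x[:2], 2))   # filtered position estimate
```

In the paper's setting, the measurement `z` would be the person's position derived from the network's distance estimate and the bearing in the panoramic image; the filter smooths these noisy per-frame estimates into a consistent trajectory.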