We describe an algorithm for generating multimodal referring expressions, based on empirical data. The main novelties are (1) a decision to point based on both the efficiency of pointing (Fitts' law) and the inefficiency of a full linguistic description, (2) the explicit tracking of the 'focus of attention', and (3) a three-dimensional notion of salience incorporating linguistic, focus and inherent salience.
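
As a rough illustration of novelties (1) and (3), the following minimal sketch compares a Fitts'-law pointing cost with the cost of a full linguistic description, and combines three salience components into one score. It is a sketch under stated assumptions, not the algorithm itself: the function names, the use of Fitts' original index of difficulty log2(2D/W), the abstract `description_cost` value, the simple "point if cheaper" rule, and the weighted-sum salience are all hypothetical choices made for this example.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts' index of difficulty, log2(2D / W), in bits.

    distance: distance from the pointing device to the target.
    width: size of the target along the axis of movement.
    """
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


def should_point(distance: float, width: float, description_cost: float) -> bool:
    """Hypothetical decision rule (an assumption of this sketch):
    point when pointing is cheaper than a full linguistic description.

    description_cost is an abstract cost on the same scale as the
    index of difficulty; how it would be measured is not specified here.
    """
    return index_of_difficulty(distance, width) < description_cost


def salience(linguistic: float, focus: float, inherent: float,
             weights: tuple = (1.0, 1.0, 1.0)) -> float:
    """Combine the three salience dimensions as a weighted sum.

    The weighted sum and the default weights are assumptions.
    """
    w_ling, w_focus, w_inh = weights
    return w_ling * linguistic + w_focus * focus + w_inh * inherent


if __name__ == "__main__":
    # A target 8 units away and 2 units wide has ID = log2(8) = 3 bits.
    print(index_of_difficulty(8, 2))
    print(should_point(8, 2, description_cost=4.0))
    print(salience(1.0, 2.0, 3.0))
```

In this sketch a close or large target yields a low index of difficulty, so pointing tends to win when the competing linguistic description would be long.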