Handwritten document understanding is a fundamental research problem in pattern recognition and it relies on the effective features. In this paper, we propose a joint feature distribution (JFD) principle to design novel discriminative features which could be the joint distribution of features on adjacent positions or the joint distribution of different features on the same location. Following the proposed JFD principle, we introduce seventeen features, including twelve textural-based and five grapheme-based features. We evaluate these features for different applications from four different perspectives to understand handwritten documents beyond OCR, by writer identification, script recognition, historical manuscript dating and localization....