Why is it so difficult for a computer to understand language automatically, even when only a limited kind of understanding, grounded in known facts, is targeted? A key reason is the great variability of language, which remains a serious challenge for computers. This is the problem I try to tackle: how can we identify similar meanings across different expressions? How can we identify fragments of meaning in a sea of text? This thesis consists of four chapters. I first consider recent developments in computational linguistics and show that the availability of large corpora has made Natural Language Processing (NLP) more functional in practice. This evolution has the potential for a major impact on theory: corpora and the automatic acquisition of knowledge from corpora (espec...