This thesis explores extensions to the recurrent neural network (RNN) architecture for natural language processing (NLP), with three aims: improving its capacity for semantic composition, investigating the potential benefits of multi-prototype word representations, and improving its overall interpretability. Although RNNs have gained a strong competitor in the form of the Transformer model, both approaches to processing natural language sequences have their own shortcomings. The thesis investigates methods for inducing sparsity in neural networks in order to learn shared sense representations, and it also addresses the problem of semantic composition in recurrent networks. The thesis introduces a novel approach for building recursive...