Through their transfer learning abilities, highly parameterized large pre-trained language models have dominated the NLP landscape for a multitude of downstream language tasks. Though linguistically proficient, the inability of these models to incorporate the learning of non-linguistic entities (numerals and arithmetic reasoning) limits their usage for tasks that require numeric comprehension or strict mathematical reasoning. However, as we illustrate in this paper, building a general-purpose language model that also happens to be proficient in mathematical reasoning is not as straightforward as training it on a numeric dataset. In this work, we develop a novel framework that enables language models to be mathematically proficient while re...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, F...
Natural Language Processing (NLP) has become one of the leading application areas in the current Art...
Math word problem (MWP) solving faces a dilemma in number representation learning. In order to avoid...
A better understanding of the emergent computation and problem-solving capabilities of recent large ...
Language models have achieved remarkable performance on a wide range of tasks that require natural l...
Language models, given their black-box nature, often exhibit sensitivity to input perturbations, lea...
Language models demonstrate both quantitative improvement and new qualitative capabilities with incr...
The nature and amount of information needed for learning a natural language, and the underlying mech...
Large language models have exhibited emergent abilities, demonstrating exceptional performance acros...
Language models typically tokenize text into subwords, using a deterministic, hand-engineered heuris...
Computational models have played a central role in the debate over language le...
We present a computational model of language learning via a sequence of intera...
Thesis (Ph.D.)--University of Washington, 2023. Language models (LMs) are at the core of almost all st...
We present an overview of the results obtained with a computational model that...
Many tasks are considered to be 'solved' in the computational linguistics literature, but the corres...