The emergence of the World Wide Web, social media, and mobile devices has made large and rich quantities of text data available. Such vast datasets have led to leaps in the performance of many statistically based methods. Given the magnitude of text data available, however, it is computationally prohibitive to train many complex Natural Language Processing (NLP) models on it all. This motivates the hypothesis that simple models trained on big data can outperform more complex models trained on small data. My dissertation provides a solution for effectively and efficiently exploiting large data in many NLP applications. Datasets are growing at an exponential rate, much faster than memory capacity. To provide a memory-efficient solution ...
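The abstract's memory-efficient solution is left unspecified by the truncation, but a standard instance of the general technique it points to is sketch-based counting. Below is a minimal, illustrative Count-Min sketch for approximate word frequencies over a text stream; it is a hypothetical sketch of the approach, not the dissertation's actual method, and the class name, hash choice, and dimensions are all assumptions.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts in O(width * depth) memory,
    independent of vocabulary size. Estimates never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item, row):
        # One independent-ish hash per row, derived from a salted digest.
        h = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(h, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._hash(item, row)] += count

    def estimate(self, item):
        # True count <= estimate; hash collisions can only inflate cells.
        return min(self.table[row][self._hash(item, row)]
                   for row in range(self.depth))

# Stream words without ever materializing the full vocabulary.
sketch = CountMinSketch()
for word in "the quick brown fox jumps over the lazy dog the".split():
    sketch.add(word)
print(sketch.estimate("the"))  # 3 (exact here; an upper bound in general)
```

The point of the structure is that its memory footprint is fixed up front, so counting scales to corpora far larger than RAM at the cost of a bounded overestimate.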
As the size of data available for processing increases, new models of computation are needed. This ...
For an extended version of this article that contains additional references and more in-depth discus...
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities...
Computational power needs have grown dramatically in recent years. This is also the case in many lan...
Data streams have emerged as a natural computational model for numerous applications of big data pro...
Substantial progress has been made in the field of natural language processing (NLP) due to the adve...
The field of streaming algorithms has enjoyed a great deal of attention from the theoretical computer science ...
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
The desired output in many machine learning tasks is a structured object, such as a tree, a clustering, ...
In light of widespread digitization endeavors and ever-growing textual data generation, developing e...
Today, computer systems need to cope with the explosive growth of data in the world. For instance, i...
In contrast to the traditional random access memory computational model where the entire input is av...
In this dissertation, we make progress on certain algorithmic problems broadly over two computationa...
Sometimes data is generated unboundedly and at such a fast pace that it is no longer possible to sto...
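When the stream truly cannot be stored, as this snippet describes, a classic way to retain a representative summary in constant memory is reservoir sampling. The sketch below is illustrative only and is not drawn from the cited work; the function name and fixed seed are assumptions for reproducibility.

```python
import random

def reservoir_sample(stream, k, rng=random.Random(0)):
    """Keep a uniform random sample of k items from a stream of
    unknown, possibly unbounded length, using O(k) memory."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i replaces a stored item with probability k / (i + 1),
            # which keeps every item equally likely to survive.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Works on any iterable, including one far too large to hold in memory.
print(reservoir_sample(range(10**6), k=5))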
Streaming algorithms, which process very large datasets received one update at a time, are a key too...
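Processing one update at a time with sublinear memory is exactly the regime of heavy-hitter algorithms such as Misra-Gries. As a concrete illustration (hypothetical, not taken from any of the works excerpted above):

```python
def misra_gries(stream, k):
    """Find candidate heavy hitters (items with frequency > n/k)
    in one pass over the stream using at most k-1 counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # Decrement all counters; drop any that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

tokens = "a b a c a b a d a".split()
print(misra_gries(tokens, k=3))  # {'a': 3}: 'a' dominates the stream
```

Every item occurring more than n/k times is guaranteed to survive in the counter set, which is why a single pass and k-1 counters suffice; a second pass can verify exact counts if needed.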