Distributed SQL Query Engines (DSQEs), such as Hive, Shark, and Impala, have become the de facto database set-up for Decision Support Systems with large database sizes. Unlike single-threaded counterparts such as MySQL, DSQEs experience inefficiencies related to the algorithm, code base, OS, and CPU micro-architecture that limit throughput despite the speedup from distributed execution. In my thesis, I present a detailed performance analysis of a DSQE called Hive, comparing it to MySQL, a single-threaded database application. Hive has difficulty converting queries into a set of MapReduce jobs for distributed execution. Hive also experiences a startup phase that is a significant overhead for short-running queries. Additionally, both Hive and...
Database engines must adapt to the underlying hardware for high-performance transaction execution. C...
Big Data systems manage and process huge volumes of data constantly generated by various technologie...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
With the decrease in cost of storage and computation of public clouds, even small and medium enterpr...
Hive table is one of the big data tables which relies on structural data. By default, it stores the ...
Apache Hadoop has provided solutions to the obstacles related to the Big Data processing. Hadoop sto...
There has been much research devoted to improving the performance of data analytics frameworks, but ...
During the last two decades, computer hardware has experienced remarkable developments. Especially C...
The continuous increase in volume, variety and velocity of Big Data exposes datacenter resource scal...
SQL-on-Hadoop systems have been gaining popularity in recent years. One popular example of SQL-on-Ha...
Recent high-performance processors employ sophisticated techniques to overlap and simultaneously exe...
Computer architectures are moving towards an era dominated by many-core machines with dozens or even...
The sheer increase in the volume of data over the last decade has triggered research in cluster comp...
In this paper we answer the question "Where does time go when a database system executes on a mode...