The ever-increasing amount of data being handled in data centers causes an intrinsic inefficiency: moving data around is expensive in terms of bandwidth, latency, and power consumption, especially given the low computational complexity of many database operations. In this paper we explore near-data processing in database engines, i.e., the option of offloading part of the computation directly to the storage nodes. We implement our ideas in Caribou, an intelligent distributed storage layer incorporating many of the lessons learned while building systems with specialized hardware. Caribou provides access to DRAM/NVRAM storage over the network through a simple key-value store interface, with each storage node providing high-bandwidth near-data...
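The abstract above describes a key-value interface with computation pushed to the storage node. As a rough illustration of why that placement reduces data movement, the following Python sketch contrasts a node-side filter with the client-side alternative. All names here (StorageNode, put/get/scan_filter) are invented for illustration; they are not Caribou's actual API.

```python
# Hypothetical sketch of near-data processing in a key-value store.
# All names (StorageNode, put/get/scan_filter) are invented for
# illustration; they are not Caribou's actual interface.
from typing import Callable, Dict, List, Optional


class StorageNode:
    """Toy in-process stand-in for one DRAM-backed key-value storage node."""

    def __init__(self) -> None:
        self._store: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._store.get(key)

    def scan_filter(self, predicate: Callable[[bytes], bool]) -> List[bytes]:
        # The predicate runs where the data lives, so only matching
        # values would cross the network; without push-down, the whole
        # dataset travels to the client just to be filtered there.
        return [v for v in self._store.values() if predicate(v)]


if __name__ == "__main__":
    node = StorageNode()
    for i in range(1000):
        node.put(f"row{i}".encode(), f"value{i}".encode())

    # 100 of the 1000 stored values end in "7"; only those are returned.
    hits = node.scan_filter(lambda v: v.endswith(b"7"))
    print(len(hits), "matching values shipped back")
```

The point of the sketch is where the predicate is evaluated: at the node, only matching values need to cross the network, whereas a client-side filter would first pull every value over the wire regardless of selectivity.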
Heterogeneity in cloud environments is a fact of life—from workload skews and network path changes, ...
Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts ...
The strength of HPC systems lies in sharing computational resources efficiently. Their challenge is to ...
Today's data center and cloud architectures decouple compute and storage resources for better scalabi...
Over the last few decades, a tremendous change toward using information technology in almost every daily...
In the last decade, data processing systems started using main memory as much as possi...
Disk-oriented approaches to online storage are becoming increasingly problematic: they do not scale ...
Massive data transfers in modern key/value stores resulting from low data-locality and data-to-code ...
The storage stack in a data center consists of all the hardware and software layers involved in proc...
The exponential growth of the dataset size demanded by modern big data applications requires innovat...
The increasing gap between the speed of the processor and the time to access the data in the disk ha...
The Apache Hadoop project provides a framework for reliable, scalable, distributed computing. The st...
The cost of running a data center is increasingly dominated by energy consumption, contributed by po...
Traditional cloud computing technologies, such as MapReduce, use file systems as the system-wide sub...
Many large computer clusters offer alternative computing elements in addition to general-purpose CP...