Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug, and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats that aims to make declarative array manipulations ...
File systems are the backbone of large-scale data processing for scientific applications. Motivated...
Datasets used in scientific and engineering applications are often modeled as dense multi-dimensiona...
Many scientific data-intensive applications perform iterative computations on array data. There exis...
Scientific experiments and large-scale simulations produce massive amounts of data. Many of these sc...
Scientific data analysis typically involves reading massive amounts of data that was generated by si...
Thesis (Ph.D.)--University of Washington, 2014. Scientists today are able to generate data at an unpre...
As high-performance computing approaches exascale, the existing I/O system design is having trouble ...
Multidimensional arrays are a fundamental data type in scientific computing and are used extensively...
Multi-dimensional arrays (also known as raster data or gridded data) play a key role in many, if not...
Data producers typically optimize the layout of data files to minimize the write time. In most cases...
Scientists today are able to generate data at an unprecedented scale and rate. For example the Sloan...
Large-scale scientific applications typically write their data to parallel file systems with organi...
Modern scientific datasets present numerous data management and analysis challenges. State-of-the-a...