In the past few years, lakehouse systems have reached unprecedented size and heterogeneity and have been embraced by many industry players. However, they are often difficult to use because they lack the declarative language and optimization possibilities of relational engines. This paper introduces RumbleML, a high-level, declarative library integrated into the RumbleDB engine and the JSONiq language. RumbleML provides a single platform for data cleaning, data preparation, training, and inference, as well as the management of models and results. It does so using a purely declarative language (JSONiq) for all these tasks, without any performance loss compared to existing platforms (e.g., Spark). The key insights of the design of RumbleML are th...
Analytics on big data may range from passenger volume prediction in transportation to customer sati...
The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, div...
Machine learning has become a key driver for technological advancement in the last decade on the bac...
This thesis integrates Snowflake into the ecosystem of RumbleDB, the query execution engine of the JSO...
Semi-structured data formats such as JSON offer the advantage of representing arbitrarily complex da...
Machine Learning methods, especially Deep Learning, had an enormous breakthrough in Natural Language...
We demonstrate MLog, a high-level language that integrates machine learning into data management sys...
Semi-structured data formats like JSON gained popularity through their ability to represent arbitrar...
Jaql is a declarative scripting language for enterprise data analysis powered by a scalable runtime ...
JSONiq is a querying language specifically tailored for JSON files, whose key capability is to deal ...
Thesis (Ph.D.), University of Washington, 2018. Artificial intelligence has become the topic of the cu...
Machine learning (ML) pipelines for model training and validation typically include preprocessing, s...
Diversity in machine learning APIs (in both software toolkits and web services) works against reali...
Improving developer productivity is an important but very difficult task that researchers from bot...
Inference of Machine Learning (ML) models, i.e. the process of obtaining predictions from trained mo...