In-database analytics is of great practical importance as it avoids the costly repeated loop data scientists have to deal with on a daily basis: select features, export the data, convert data format, train models using an external tool, reimport the parameters. It is also a fertile ground of theoretically fundamental and challenging problems at the intersection of relational and statistical data models. This paper introduces a unified framework for training and evaluating a class of statistical learning models inside a relational database. This class includes ridge linear regression, polynomial regression, factorization machines, and principal component analysis. We show that, by synergizing key tools from relational database theory...
Relational learning refers to learning from data that have a complex structure. This structure may ...
Many databases store data in relational format, with differ-ent types of entities and information ab...
A major obstacle to fully integrated deployment of many data mining algorithms is the assumption tha...
In-database analytics is of great practical importance as it avoids the costly repeated loop data sc...
Integrated solutions for analytics over relational databases are of great practical importance as th...
In this talk, I will make the case for a first-principles approach to machine learning over relation...
The primary difference between propositional (attribute-value) and relational data is the existence ...
One fundamental limitation of classical statistical modeling is the assumption that data is represen...
Abstract—In the previous work, we described the advantages of in-database machine learning. By using...
Enterprise data analytics is a booming area in the data man-agement industry. Many companies are rac...
This paper overviews factorized databases and their application to machine learning. The key observa...
This thesis focuses on some fundamental problems in machine learning that are posed as nonconvex mat...
Multidimensional statistical models are generally computed outside a relational DBMS, exporting data...
Statistical relational learning techniques have been successfully applied in a wide range of relatio...
We consider the problem of computing machine learning models over multi-relational databases. The ma...
Relational learning refers to learning from data that have a complex structure. This structure may ...
Many databases store data in relational format, with differ-ent types of entities and information ab...
A major obstacle to fully integrated deployment of many data mining algorithms is the assumption tha...
In-database analytics is of great practical importance as it avoids the costly repeated loop data sc...
Integrated solutions for analytics over relational databases are of great practical importance as th...
In this talk, I will make the case for a first-principles approach to machine learning over relation...
The primary difference between propositional (attribute-value) and relational data is the existence ...
One fundamental limitation of classical statistical modeling is the assumption that data is represen...
Abstract—In the previous work, we described the advantages of in-database machine learning. By using...
Enterprise data analytics is a booming area in the data man-agement industry. Many companies are rac...
This paper overviews factorized databases and their application to machine learning. The key observa...
This thesis focuses on some fundamental problems in machine learning that are posed as nonconvex mat...
Multidimensional statistical models are generally computed outside a relational DBMS, exporting data...
Statistical relational learning techniques have been successfully applied in a wide range of relatio...
We consider the problem of computing machine learning models over multi-relational databases. The ma...
Relational learning refers to learning from data that have a complex structure. This structure may ...
Many databases store data in relational format, with differ-ent types of entities and information ab...
A major obstacle to fully integrated deployment of many data mining algorithms is the assumption tha...