This paper analyses the complexity of rule selection for supervised learning in distributed scenarios. The selection of rules is usually guided by a utility measure such as predictive accuracy or weighted relative accuracy. Other examples are support and confidence, known from association rule mining. A common strategy for rule selection from distributed data is to evaluate rules locally on each dataset. While this works well for homogeneously distributed data, this work proves that the strategy has limitations if the local distributions are allowed to deviate from one another. Identifying the subsets for which local and global distributions deviate may be regarded as an interesting learning task in its own right, one that explicitly takes the locality of data into account...
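The utility measures named above can be sketched from the contingency counts of a single rule. This is a minimal illustration, not the paper's formalization: the `Counts` class, the function names, the two site datasets and the textbook definitions used (in particular rule accuracy as the fraction of examples the rule gets right, and weighted relative accuracy as coverage times the gain in confidence over the class prior) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Contingency counts of one rule 'body -> class' on one dataset (hypothetical helper)."""
    tp: int  # body holds and class holds
    fp: int  # body holds, class does not
    fn: int  # body does not hold, class holds
    tn: int  # neither body nor class holds

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def __add__(self, other: "Counts") -> "Counts":
        # Pooling two sites is just adding their counts.
        return Counts(self.tp + other.tp, self.fp + other.fp,
                      self.fn + other.fn, self.tn + other.tn)


def support(c: Counts) -> float:
    """Fraction of examples where body and class both hold."""
    return c.tp / c.n


def confidence(c: Counts) -> float:
    """Fraction of covered examples that belong to the class."""
    covered = c.tp + c.fp
    return c.tp / covered if covered else 0.0


def accuracy(c: Counts) -> float:
    """Assumed definition: fraction of examples the rule classifies correctly."""
    return (c.tp + c.tn) / c.n


def wracc(c: Counts) -> float:
    """Weighted relative accuracy: coverage * (confidence - class prior)."""
    coverage = (c.tp + c.fp) / c.n
    prior = (c.tp + c.fn) / c.n
    return coverage * (confidence(c) - prior)


# Two made-up sites whose class distributions under the rule body deviate.
site_a = Counts(tp=40, fp=10, fn=10, tn=40)
site_b = Counts(tp=10, fp=40, fn=40, tn=10)
pooled = site_a + site_b

for name, c in [("site A", site_a), ("site B", site_b), ("pooled", pooled)]:
    print(f"{name}: support={support(c):.2f} confidence={confidence(c):.2f} "
          f"accuracy={accuracy(c):.2f} wracc={wracc(c):+.2f}")
```

With these invented counts the rule scores well on site A, poorly on site B, and neutrally on the pooled data, which is the kind of disagreement between local and global evaluation the abstract refers to.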
Abstract: Data mining is used to extract important knowledge from large datasets, but sometimes thes...
To fill the increasing demand for explanations of decisions made by automated prediction systems, ma...
In some domains (e.g., molecular biology), data repositories are large in size, dynamic, and physic...
This paper analyses the tractability of rule selection for supervised learning in distributed scenar...
Separate-and-conquer or covering rule learning algorithms may be viewed as a technique for using loc...
Conventional rule learning algorithms aim at finding a set of simple rules, where each rule covers a...
Abstract: Association Rule Mining (ARM) is a popular and well-researched method for discovering intere...
In many areas of daily life (e.g. in e-commerce or social networks), massive amounts of data are col...
This dissertation investigates how to adapt standard classification rule learning approaches to su...
Machine-learning methods are becoming increasingly popular for automated data analysis. However, sta...
Association rule mining typically focuses on discovering global rules valid across the entire datase...
With the existence of many large transaction databases, the huge amounts of data, the high scalabili...
Most algorithms for learning and pattern discovery in data assume that all the needed data is availa...
This paper motivates and precisely formulates the problem of learning from distributed data; descri...