Abstract:- In this paper, we propose a new solution for dynamic task scheduling in a distributed environment. The key issue in scheduling tasks is that the execution time of irregular computations cannot be obtained in advance. For this reason, we propose a method based on sampling, applied to some typical data mining algorithms. We argue that a functional relationship exists among the execution time, the size of the data, and the algorithm, so the execution time of a data mining task can be deduced from the corresponding data size and algorithm. The experimental results show that almost all the algorithms exhibit quasi-linear scalability, but the slope differs from one algorithm to another. We adopt this sampling method to process the tasks...
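The sampling idea in this abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes execution time is roughly linear in data size, fits one line per algorithm from a few timed sample runs, and extrapolates to the full data size. The algorithm names, the sample timings, the names `fit_linear` and `predict_time`, and the shortest-estimate-first ordering are all hypothetical choices made for illustration.

```python
from statistics import mean


def fit_linear(sizes, times):
    """Least-squares fit of time = slope * size + intercept."""
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) samples")
    mean_size, mean_time = mean(sizes), mean(times)
    sxx = sum((x - mean_size) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("sample sizes must not all be equal")
    sxy = sum((x - mean_size) * (y - mean_time) for x, y in zip(sizes, times))
    slope = sxy / sxx
    return slope, mean_time - slope * mean_size


def predict_time(model, size):
    """Extrapolate the fitted line to a given data size."""
    slope, intercept = model
    return slope * size + intercept


# Hypothetical timings (seconds) measured on small samples of the data.
samples = {
    "algo_a": ([1000, 2000, 4000], [0.5, 1.0, 2.0]),
    "algo_b": ([1000, 2000, 4000], [2.0, 4.0, 8.0]),
}

# One model per algorithm: the slopes differ even though both are linear.
models = {name: fit_linear(s, t) for name, (s, t) in samples.items()}

full_size = 100_000
estimates = {name: predict_time(m, full_size) for name, m in models.items()}

# Assumed policy for illustration only: run the shortest estimated task first.
schedule = sorted(estimates, key=estimates.get)
print(estimates, schedule)
```

A scheduler could use such estimates in place of unknown execution times. How reliable the extrapolation is depends on how closely each algorithm follows the quasi-linear behaviour reported in the experiments.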
Dynamic scheduling of a set of algorithms is a key problem for a data analysis platform. In this paper...
Abstract. In this paper we consider concurrent execution of multiple data mining queries. If such da...
The computing-intensive data mining for inherently Internet-wide distributed data, referred to as Di...
Abstract:- Distributed data mining plays a crucial role in knowledge discovery in very large databas...
The use of information technology (IT) in scientific investigations is now commonplace, due largely ...
Abstract. Increasingly the datasets used for data mining are becoming huge and physically distribute...
Sequential sampling algorithms have recently attracted interest as a way to design scalable algorith...
The aim of this paper is to provide a description of machine learning based scheduling approach for ...
Abstract: Scheduling divisible workloads in distributed systems has been one of the interesting res...
Increasingly the datasets used for data mining are huge and physically distributed
The rapidly growing field of data mining has the potential of improving performance of existing sche...
This article presents a statistical approach to the scheduling of divisible workloads. Structured as...
In real-world dynamic heterogeneous distributed systems, allocating tasks to processors can be an i...