Classification of imbalanced data has been reported to require modifications of standard classification algorithms, and it has lately attracted considerable attention because of its practical applications in industry, banking, and finance. The aim of this paper is to examine algorithms known from the literature when two modifications are introduced: MapReduce to parallelize computations and Relief to select the most valuable attributes. Both modifications are needed in the Big Data area. Two new algorithms are also considered.
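The two modifications named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes numeric attributes already scaled to [0, 1], a Manhattan distance, a single nearest hit and nearest miss per instance, and a simple averaging of per-partition weights in the reduce step. The names `relief_weights`, `map_partition`, `reduce_weights`, and `mapreduce_relief` are hypothetical, and the toy data is made up for illustration.

```python
from functools import reduce


def relief_weights(rows, labels):
    """Relief attribute weights (assumes attributes scaled to [0, 1])."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    used = 0
    for i, (x, y) in enumerate(zip(rows, labels)):

        def nearest(same_class):
            # Candidates exclude the instance itself.
            candidates = [r for j, (r, l) in enumerate(zip(rows, labels))
                          if j != i and (l == y) == same_class]
            if not candidates:
                return None
            return min(candidates,
                       key=lambda r: sum(abs(a - b) for a, b in zip(x, r)))

        hit, miss = nearest(True), nearest(False)
        if hit is None or miss is None:
            continue  # e.g. a lone minority instance in this partition
        used += 1
        for a in range(n_features):
            # Reward attributes that differ on the miss, penalize those
            # that differ on the hit.
            weights[a] += abs(x[a] - miss[a]) - abs(x[a] - hit[a])
    return [w / used for w in weights] if used else weights


def map_partition(partition):
    """Map step (assumed design): Relief weights for one data partition."""
    rows, labels = partition
    return relief_weights(rows, labels), 1


def reduce_weights(left, right):
    """Reduce step (assumed design): sum weight vectors and counts."""
    (w1, n1), (w2, n2) = left, right
    return [a + b for a, b in zip(w1, w2)], n1 + n2


def mapreduce_relief(partitions):
    """Average the per-partition Relief weights."""
    summed, count = reduce(reduce_weights, map(map_partition, partitions))
    return [w / count for w in summed]


# Toy imbalanced data: attribute 0 separates the classes, attribute 1 is noise.
rows = [[0.0, 0.2], [0.1, 0.8], [0.05, 0.5], [0.15, 0.3],
        [0.9, 0.25], [1.0, 0.75]]
labels = [0, 0, 0, 0, 1, 1]

weights = relief_weights(rows, labels)
partitions = [
    ([rows[0], rows[1], rows[4]], [0, 0, 1]),
    ([rows[2], rows[3], rows[5]], [0, 0, 1]),
]
combined = mapreduce_relief(partitions)
print(weights, combined)
```

In this sketch the separating attribute receives a higher weight than the noisy one both on the full data and after the partitioned computation. Averaging per-partition weights is only an approximation, because nearest hits and misses are searched within each partition rather than globally.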
The volume, variety, and velocity properties of big data and the valuable information it contains ha...
MapReduce model is a typical distributed computing model, which is widely used in large-scale data p...
In many application domains such as medicine, information retrieval, cybersecurity, social media, et...
Big Data applications have been emerging in recent years, and researchers from many disciplines are ...
The class imbalance problem, one of the common data irregularities, causes the development of under-...
The “big data” term has caught the attention of experts in the context of learning from da...
Imbalanced datasets exist in many real-world domains. It is straightforward to apply classification a...
Algorithms for mitigating imbalance of the MapReduce computations are considered in this paper. Map...
The problem of classification of imbalanced datasets is a critical one. With an increase in the numb...
Classification techniques in the big data scenario are in high demand in a wide variety of applicati...
Classification is a data mining task. It aims to extract knowledge from large datasets. There are tw...
The healthcare industry has generated large amounts of data, and analyzing these has emerged as an i...
Mining with big data or big data mining has become an active research area. It is very d...