This paper studies how to reduce the amount of human supervision needed to identify splogs and authentic blogs when splog data sets are updated continuously, year by year. Following previous work on active learning, this paper empirically examines several selective sampling strategies for active learning with Support Vector Machines (SVMs) on the task of splog / authentic blog detection. As a confidence measure for SVM learning, we employ the distance from the separating hyperplane to each test instance, which has been well studied in active learning for text classification. Unlike the results of applying active learning to text classification tasks, in the task of splog / authentic blog detection of this paper, it i...
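The confidence measure named in the abstract, the distance from the separating hyperplane to an instance, can be illustrated with a minimal sketch. It assumes a linear decision function w.x + b that has already been trained. The function names, the toy weights, and the two strategy labels ("least_confident", "most_confident") are hypothetical and are not taken from the paper, which does not state here which selective sampling strategies it compares.

```python
import math


def hyperplane_distance(w, b, x):
    """Unsigned distance from instance x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / norm


def select_for_labeling(w, b, pool, k, strategy="least_confident"):
    """Return indices of k unlabeled pool instances to send to a human annotator.

    Both strategy names are illustrative assumptions:
    "least_confident" picks the instances closest to the hyperplane,
    "most_confident" picks the instances farthest from it.
    """
    # Rank pool indices by increasing distance to the hyperplane.
    ranked = sorted(range(len(pool)),
                    key=lambda i: hyperplane_distance(w, b, pool[i]))
    if strategy == "least_confident":
        return ranked[:k]
    if strategy == "most_confident":
        return ranked[-k:][::-1]
    raise ValueError("unknown strategy: " + strategy)


if __name__ == "__main__":
    # Toy 2-D model and pool (assumed values, for illustration only).
    w, b = [1.0, 0.0], 0.0
    pool = [[3.0, 1.0], [0.5, 2.0], [-0.1, 0.0], [-2.0, 5.0]]
    print(select_for_labeling(w, b, pool, k=2))
    print(select_for_labeling(w, b, pool, k=2, strategy="most_confident"))
```

In an active learning loop, the selected instances would be labeled by hand, added to the training set, and the SVM retrained before the next round of selection.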
This paper presents a method for potential topic discovery from the blogosphere. We define a potential to...
Abstract. This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers...
Supervised machine learning methods are increasingly employed in political science. Such models requ...
Weblogs, or blogs, have become an important new way to publish information, engage in discussions and...
Abstract. In order to reduce human efforts, there has been increasing interest in applying active le...
In machine learning, active learning refers to algorithms that autonomously select the data points f...
This paper focuses on spam blog (splog) detection. Blogs are highly popular, new media social commun...
The abundance of real-world data and limited labeling budget calls for active learning, which is an ...
The paper describes a probabilistic active learning strategy for support vector machine (SVM) design...
Spam blogs (splogs) have become a major problem in the increasingly popular blogosphere. Splogs are ...
My first exposure to Support Vector Machines came this spring when I heard Sue Dumais present impressi...
Abstract: Data mining extracts novel and useful knowledge from large repositories of data and has be...
Abstract. Automated text categorisation systems learn a generalised hypothesis from large numbers of...
Splog is the key challenge in accessing the blogosphere. Existing splog-filtering methods are restri...