To analyse the structure of hyperlink networks, researchers need to understand the type and nature of the webpages or websites (i.e. nodes) that comprise such networks. Information such as the generic top-level domain (e.g. com, org, gov) provides only ‘coarse-grained’ data about what these nodes are, whereas social scientists often require more detailed information about the websites under analysis. Usually this involves manually labelling, or ‘coding’, nodes into categories, using techniques similar to textual or documentary analysis. However, the size and nature of hyperlink networks often make this task time-consuming and costly. In this exploratory pilot study we investigate the use of supervised machine learning to automatica...
The World Wide Web contains an enormous amount of information, but it can be exceedingly difficult f...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Website categorization has recently emerged as a very important task in several contexts. A huge amo...
Abstract. Domain-specific internet portals are growing in popularity because they gather content fro...
The web is recognized as the largest data source in the world. The nature of such data is characteri...
In recent years, the usage of the Internet has increased tremendously, and the total number of web p...
The Internet contains a vast amount of data that is growing exponentially. To exploit this data, a W...
The Internet is rife with abuse: examples include spam, phishing, malicious advertising, DNS abuse,...
The applications and advantages of the Internet for real-time information sharing can never be over-...
This dissertation reports the results of an exploratory data analysis investigation of the relations...
Tremendous resources are spent by organizations guarding against and recovering from cybersecurity a...
We used machine learning to study policy issues and frames in political messages. With regard to fra...