This paper reports a statistical study of embedded metadata in a sample of more than 4 million HTML Web pages. It aims to determine and quantify the validity of this metadata, and in particular whether the metadata is trustworthy enough to determine the topic of a Web page. Datasets are collected by a Web crawler running both as a general and as a focused crawler. The metadata fields 'title', 'author', 'keywords', 'description', and 'language' are analyzed in detail, together with Dublin Core metadata. The study reveals problems with how metadata is created. Among the 75% of all Web pages that have interesting metadata, the field 'language' is the most trustworthy. All other metadata fields show a high degree of duplication, thus degrading the...
This paper reports on a study that examined the ability of resource authors to create acceptable met...
The currently established formats for how a Web site can publish metadata about a site's pages, the ...
This study examined the use of HTML meta tags for embedding metadata in World Wide Web resources. Wh...
Purpose – This paper aims to investigate the internet web page metadata usage behavior in terms of t...
The tremendous growth of Web resources has made information organization and retrieval more and more...
The purpose of this Master's thesis is to study metadata and its use at Web sites of six Swedis...
The World Wide Web currently has a huge amount of data, with practically no classification informati...
It has been claimed that topic metadata can be used to improve the accuracy of text searches. Here, ...
In this paper, we investigate the difference between metadata generated by users and author...
Metadata is designed to improve information organization and information retrieval effectiveness and...
The present investigation was aimed to study the scope of presence of Dublin Core metadata elements ...