Abstract. Despite the exponential growth of the WWW and the success of the Semantic Web, there is limited support today for handling the information found on the Web. In this scenario, techniques and tools that support effective information retrieval are becoming increasingly important. In this work, we present a technique for recognizing and comparing the visual structural information of Web pages. The technique is based on a classification of the set of HTML tags, guided by the visual effect of each tag on the whole structure of the page. This allows us to translate a Web page into a normalized form in which groups of HTML tags are mapped to a common canonical one. A metric to compute the distance between two different pages is also introduced...
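The idea in the abstract above (group tags by visual effect, normalize the page, then measure a distance) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `VISUAL_CLASS` table, the class names `grp`, `row`, `col` and `text`, and the helper names `normalize` and `distance` do not come from the paper. The distance is a generic sequence-similarity measure from the standard library's `difflib`, used as a stand-in for the metric the abstract mentions but does not define.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Hypothetical grouping of tags by visual effect (an assumption for
# illustration, not the classification used in the paper).
VISUAL_CLASS = {
    "table": "grp", "ul": "grp", "ol": "grp", "dl": "grp", "div": "grp",
    "tr": "row", "li": "row", "p": "row", "dt": "row", "dd": "row",
    "td": "col", "th": "col",
    "b": "text", "i": "text", "em": "text", "strong": "text",
    "span": "text", "a": "text",
}


class Normalizer(HTMLParser):
    """Collects the canonical visual class of every recognized start tag."""

    def __init__(self):
        super().__init__()
        self.sequence = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case; unknown tags are skipped.
        visual_class = VISUAL_CLASS.get(tag)
        if visual_class is not None:
            self.sequence.append(visual_class)


def normalize(html):
    """Translate a page into its sequence of canonical visual classes."""
    parser = Normalizer()
    parser.feed(html)
    parser.close()
    return parser.sequence


def distance(page_a, page_b):
    """Return a value in [0, 1]; 0 means identical normalized forms.

    Stand-in metric: one minus difflib's similarity ratio over the
    normalized sequences. The paper's own metric may differ.
    """
    matcher = SequenceMatcher(None, normalize(page_a), normalize(page_b),
                              autojunk=False)
    return 1.0 - matcher.ratio()


if __name__ == "__main__":
    bullets = "<ul><li><b>News</b></li></ul>"
    numbered = "<ol><li><em>News</em></li></ol>"
    tabular = "<table><tr><td><i>News</i></td></tr></table>"
    print(normalize(bullets))           # canonical form of the bullet list
    print(distance(bullets, numbered))  # same canonical form
    print(distance(bullets, tabular))   # differs by the extra column level
```

Under this sketch, a bulleted list and a numbered list with the same content normalize to the same sequence, so their distance is zero, while the table version differs only by its extra column level. Flattening the page into a sequence discards nesting; a tree-based comparison would preserve more of the page structure at the cost of a more involved implementation.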
This thesis describes the design and implementation of an algorithm that, using some initial hints f...
Methods for ranking World Wide Web resources according to their position in the link structure of th...
The summary-based approach is not an optimal solution in presenting a search result for a massive co...
Abstract. When we describe a Web page informally, we often use phrases like "it looks like a newspap...
A new web content structure based on visual representation is proposed in this paper. Many web appli...
In this paper, we propose a Web page archiving system that combines state-of-the-art comparison meth...
In this paper we propose an architecture that exploits the structural information of web pages for the extrac...
We present in this paper a Web page archiving approach combining image and structural techniques. Ou...
Extracting and processing information from Web pages is an important task in many areas like constru...
We present general-purpose methods for recognizing certain types of structure in HTML documents. The...
Hyperlinks among webpages are very important information and are widely used for webpage clustering ...
Abstract. The ability to create web pages is one of the basic IT skills. In the web page creation learning proc...
Though there are millions of websites on the internet, half of the ones we come across do not provid...