Abstract. The intensive research activity in analysis of social media and micro-blogging data in recent years suggests the necessity and great potential of platforms that can efficiently store, query, analyze, and visualize social media data. To support these “social media observatories” effectively, a storage platform must satisfy special requirements for loading and storage of multi-terabyte datasets, as well as efficient evaluation of queries involving analysis of the text of millions of social updates. Traditional inverted indexing techniques do not meet such requirements. As a solution, we propose a general indexing framework, IndexedHBase, to build specially customized index structures for facilitating efficient queries on an HBase d...
We study the problem of indexing continuous data streams in which data are heterogeneous in structur...
Peer-to-peer networks are becoming a common form of online data exchange. Querying data, mostly fil...
The thesis consists in the implementation of a modular, distributed and fault tolerant crawler suppo...
Abstract — Social media data analysis demonstrates two special characteristics in Big Data processin...
As data intensive applications evolve, many research projects involving Big Data require efficient e...
Social media is an increasingly popular method for people to share information and interact with eac...
The continuous growth of the internet and the popularity of social networks have created a huge amou...
There is an increasing trend of social media usage in recent years and users desire a search system ...
With the proliferation of user-generated data, many emerging applications consume this data to serve...
To answer search queries on a social network rich with user-generated content, it is desirable to gi...
We identify crucial design issues in building a distributed inverted index for a large collection of...
Search engines and database systems both play important roles as we store and organize ever increasi...
Recently, Big Data processing is becoming a necessary technique to efficiently store, manage, and an...
Part 16: Recommendation Systems. Database deployment is a complex task depending...