Metadata have always played a key role in supporting the cooperation of heterogeneous data sources. This role has become far more important with the advent of data lakes, where metadata are the only means of guaranteeing effective and efficient management of data source interoperability. For this reason, defining new models and paradigms for metadata representation and management is crucial in the data lake scenario. In this paper, we address this issue by proposing a new metadata model well suited to data lakes. Furthermore, to give an idea of its capabilities, we present an approach that leverages it to “structure” unstructured sources and to extract thematic views from heterogeneous data la...
The rise of big data has revolutionized data exploitation practices and led to...
Valuable insights are frequently only available after combining and analysing data from multiple sou...
There is currently a burst of Big Data (BD) processed and stored in huge raw data repositories, comm...
In the last years, data lakes are emerging as an effective and efficient support for information and...
For more than three decades, data warehouses have been considered the only business intelligence storag...
Over the past decade, the data lake concept has emerged as an alternative to d...
To prevent data lakes from being invisible and inaccessible to users, an effic...
In the last years, data lakes are emerging as an effective and an efficient support for information ...
Although big data has been discussed for some years, it still has many research challenges, such as ...
Data lakes have emerged as an alternative to data warehouses for the storage, ...
In addition to volume and velocity, Big data is also characterized by its variety. Variety in struct...
As the challenge of our time, Big Data still has many research hassles, especially the variety of da...
The heterogeneity of sources in Big Data systems requires new integration approaches which can handl...