At the Uppsala Monitoring Centre (UMC), individual case safety reports (ICSRs) are managed, analyzed, and processed to publish statistics on adverse drug reactions. A data processing tool built on top of the UMC’s ICSR database is used to analyze the data. Unfortunately, the current processing tool is limited by several constraints, while the amount of incoming data to be processed grows rapidly. The UMC’s processing system must therefore be improved to handle future demands. Various frameworks for parallelization can be used to improve performance. In this work, the in-memory computing framework Spark was used to parallelize one of the current data processing tasks. Local clusters for running the new imp...
Entity Resolution is a crucial task for many applications, but its naive solution has a low efficienc...
Apache Spark is an execution engine that besides working as an isolated distributed, in-memory compu...
In the last decade, data analytics have rapidly progressed from traditional disk-based processing to mod...
While cluster computing frameworks are continuously evolving to provide real-time data analysis capa...
The ever-increasing amount of data being generated worldwide, combined with the business advantages ...
Resource Description Framework (RDF) is a commonly used data model in the Semantic Web environment. ...
The sheer increase in the volume of data over the last decade has triggered research in cluster comp...
The past few years have seen a major change in computing systems, as growing data volumes and stalli...
Querying very large RDF data sets in an efficient and scalable manner requires...
The area of Big Data is commonly characterized by situations where the volumes of data are such that...
Due to the latest development in the context of Internet of Things, the amount of generated and coll...
The analysis of massive databases is a key issue for most applications today and the use of parallel...