The Gene Expression Omnibus (GEO) is a public archive containing >4 million digital samples from functional genomics experiments collected over almost two decades. The accompanying metadata describing the experiments suffer from redundancy, inconsistency and incompleteness due to the prevalence of free text and the lack of well-defined data formats and their validation. To remedy this situation, we created Genomic Metadata Integration (GeMI; http://gmql.eu/gemi/), a web application that learns to automatically extract structured metadata (in the form of key-value pairs) from the plain text descriptions of GEO experiments. The extracted information can then be indexed for structured search and used for various downstream data mining activ...
A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descri...
Providing a common data model for the metadata of several heterogeneous genomic data sources is hard,...
NCBI’s Gene Expression Omnibus (GEO) is a rich community resource containing mil...
A major challenge for functional and comparative genomics resource development is the extraction of ...
A major challenge for the development of resources for functional and comparative genomics is the ex...
The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional gen...
There is a great deal of interest in analyzing very large data sets in the biomedical sciences. This...
While exponential growth in public genomic data can afford great insights into biological processes ...