Public genomic and proteomic databases can be affected by a variety of errors. These errors may involve either the description of the data (syntactic errors) or its meaning (semantic errors). We focus on detecting semantic errors in order to verify the accuracy of the stored information. In particular, we address data constraints and functional dependencies among attributes in a given relational database. Constraints and dependencies capture the semantics that relate attributes in a database schema, and knowing them can help improve data quality and integration in database design, as well as support query optimization and dimensionality reduction. We propose a method to discover data constraints and functional depen...
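To make the notion of a functional dependency concrete, the following is a minimal sketch of how a dependency X -> Y could be checked against a table, with violating groups flagged as candidate semantic errors. It is not the method proposed in the abstract, whose details are not given here; the function name `fd_violations`, the attribute names, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Return the lhs values that map to more than one rhs value.

    A functional dependency lhs -> rhs holds when every combination of
    lhs attribute values is associated with exactly one combination of
    rhs attribute values. Any lhs value seen with several rhs values
    violates the dependency.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        seen[key].add(tuple(row[a] for a in rhs))
    # Keep only the lhs values that are associated with conflicting rhs values.
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical toy records (illustrative only, not taken from any real database).
records = [
    {"protein_id": "P1", "organism": "human", "taxon_id": 9606},
    {"protein_id": "P2", "organism": "human", "taxon_id": 9606},
    {"protein_id": "P3", "organism": "mouse", "taxon_id": 10090},
    {"protein_id": "P4", "organism": "mouse", "taxon_id": 9606},
]

# Assumed dependency for the example: organism -> taxon_id.
violations = fd_violations(records, ["organism"], ["taxon_id"])
for key, values in violations.items():
    print(key, "maps to conflicting values:", sorted(values))
```

In this toy table, the "mouse" records carry two different taxon identifiers, so the assumed dependency organism -> taxon_id is violated and those records would be candidates for manual review. A dependency that holds on nearly all rows but fails on a few is the kind of signal that can point to a semantic error rather than to a genuinely absent constraint.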
Unsupervised annotation of proteins by software pipelines suffers from very high error rates. Spurio...
The rapid growth of biological databases not only provides biologists with abundant data but also pr...
Different rule semantics have been successively defined in many contexts such...
When applying association mining to real datasets, a major obstacle is that often a huge number of ...
In biology, the accumulation of raw experimental data has been accompanied by the accumulati...
Numerous biomolecular data are available, but they are scattered in many databases and only some of ...
Real-world databases often contain syntactic and semantic errors, in spite of integrity constraints ...
The Semantic Web opens up new opportunities for data mining research. Semantic Web data is usual...
The rapid growth of life science databases demands the fusion of knowledge from heterogeneous databa...
Progress in genome research demands an adequate infrastructure to analyze the data sets. Dat...
Background: The large biological databases su...
Database mining is the process of extracting interesting and previously unknown patterns and correla...
MOTIVATION: Millions of protein sequences currently being deposited to sequence databanks will never...
Most existing databases suffer from data inconsistencies. Enhancing data quality efforts are necessa...