Sequencing technology has improved dramatically over the past few decades. Before the sequencing of complete genomes was possible, the sequencing of a gene was directly linked to the biochemical characterization of its product [1]. However, biochemical and genetic characterization has not scaled up in the same way that sequencing has. As a result, the scientific community is confronted with exponentially growing sequence databases in which roughly half of the entries are either annotated incorrectly or not annotated at all. To realize the true potential of the data generated by sequencing projects, the way the functions of those sequences are discovered and identified must be improved. One appr...