Siepel's team aligned whole-genome sequences for four groups of eukaryotic species (vertebrates, insects, worms, and yeast). The vertebrates included human, mouse, rat, chicken, and pufferfish, and the insects included three species of fruit fly and one species of mosquito. Two worm species and seven yeast species rounded out the set.
To help ease the gargantuan task of identifying conserved elements in multiple alignments of whole-genome sequences, the researchers developed a new computational tool called phastCons. In contrast to traditional tools that compute conservation levels based on sequence similarity at each nucleotide position, phastCons allows for multiple substitutions per site, accounts for unequal rates of substitutions for different nucleotides, and considers the phylogenetic relationships of the species involved.
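To illustrate why allowing for multiple substitutions per site matters, the short Python sketch below compares the raw fraction of differing positions between two aligned sequences with a distance corrected under the Jukes-Cantor model. This is not phastCons or its method: Jukes-Cantor is a much simpler textbook model that assumes equal substitution rates for all nucleotides and ignores phylogeny, and the function names here are hypothetical, chosen only for illustration.

```python
import math


def observed_difference(seq_a, seq_b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    # Skip any column where either sequence has a gap character.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p):
    """Estimated substitutions per site under Jukes-Cantor, given observed difference p."""
    # The formula is only defined while p stays below the 0.75 saturation point.
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)
```

Under this simplified model the corrected distance is always at least as large as the observed difference, because repeated changes at the same site hide some of the substitutions that actually occurred. Similarity counted position by position therefore tends to understate divergence, more so as sequences grow more distant.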
After applying phastCons to multiple alignments of each of the four groups of eukaryotic species, the researchers estimated that only 3-8% of the human genome was conserved in the other vertebrate species. On the other hand, the more compact genomes of insects were more highly conserved (37-53%), as were those of worms (18-37%) and yeast (47-68%).
The scientists also observed that the proportion of conserved sequences located outside of protein-coding regions tended to increase with genome length and with the species' general biological complexity.
Most strikingly, the researchers discovered that two-thirds or more of the conserved DNA sequences in vertebrate and insect species were located outside the exons of protein-coding genes, while non-protein-coding sequences accounted for only about 40% and 15% of the conserved elements in the genomes of worms and yeast, respectively.
"The conserved noncoding story seems to be fairly similar in vertebrates and insects, but looks quite different in worms and ye
Contact: Maria A. Smit
Cold Spring Harbor Laboratory