A software program that has been successfully annotating the genes of common bacteria since 1992 is now capable of finding genes in higher organisms. It is particularly useful for finding human genes in anonymous human DNA sequences.
Understanding the genomes of key microorganisms may shed light on human genetics because lower organisms have some genes that correspond to human genes. Scientists can also design new drugs based on knowledge of disease-causing bacteria.
The original software program, called GeneMark, uses probabilistic mathematical models to predict the locations of genes on a strand of DNA. GeneMark was developed by Dr. Mark Borodovsky, a professor of biology at the Georgia Institute of Technology. It has become the world's most-used software program for deciphering bacterial DNA and has proven itself 98 percent accurate.
Borodovsky's latest development uses GeneMark.hmm, a refined version of the original program, as its base to make more sophisticated predictions for the genomes of eukaryotic, or higher, organisms.
"Deciphering bacterial DNA is simpler than deciphering human DNA since its genes run continuously, without gaps," Borodovsky explained. "The genes of human DNA may be divided into pieces, called exons, with non-coding genetic material between the exons. These spacers in the genes, called introns, were hard to detect by a computer algorithm. Also, eukaryotic DNA is much longer, with an average gene size of 10,000 nucleotides."
Therefore, the predictions of where eukaryotic genes lie on a strand of DNA must include predictions of the boundaries between the exons, which contain the genetic information, and introns, which are the non-coding regions.
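The exon and intron layout described above can be illustrated with a small Python sketch. It is not GeneMark's algorithm and makes no predictions; it only shows what a set of predicted boundaries means once it is known. The toy sequence, the coordinates, and the function names `introns_between` and `splice` are hypothetical and chosen for illustration.

```python
# Toy gene: two exons separated by one intron (hypothetical sequence).
EXON_1 = "ATGGCC"
INTRON = "GTAAGTTTTCAG"
EXON_2 = "GATTAA"
dna = EXON_1 + INTRON + EXON_2

# Exon boundaries as half-open (start, end) intervals, sorted by position.
exons = [(0, 6), (18, 24)]


def introns_between(exons):
    """Return the non-coding gaps that lie between consecutive exons."""
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


def splice(dna, exons):
    """Join the exon segments, dropping everything between them."""
    return "".join(dna[start:end] for start, end in exons)


introns = introns_between(exons)
coding = splice(dna, exons)
print(introns)  # [(6, 18)]
print(coding)   # ATGGCCGATTAA
```

A gene finder for eukaryotic DNA has to recover the `exons` list itself from the raw sequence, which is the hard part; in a bacterial gene the list would typically hold a single uninterrupted interval.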
To create a computer program to achieve this, Borodovsky employed a probabilistic mathematical model called
Contact: Jane Sanders
Georgia Institute of Technology Research News