COULD there be forbidden sequences in the genome, ones so harmful that they are not compatible with life? One group of researchers thinks so. Unlike most genome sequencing projects, which set out to search for genes that are conserved within and between species, this one aims to identify "primes": DNA sequences and chains of amino acids so dangerous to life that they do not exist.
"It's like looking for a needle that's not actually in the haystack," says Greg Hampikian, professor of genetics at Boise State University in Idaho, who is leading the project. "There must be some DNA or protein sequences that are not compatible with life, perhaps because they bind some essential cellular component, for example, and have therefore been selected out of circulation. There may also be some that are lethal in some species, but not others. We're looking for those sequences."
To do this, Hampikian and his colleague Tim Anderson, also at Boise, have developed software that calculates all the possible sequences of nucleotides (the "letters" of DNA) up to a certain length, and then scans sequence databases such as the US National Institutes of Health's GenBank to identify the smallest sequences that aren't present. Those that don't occur in one species but do in others are termed "nullomers", while those that aren't found in any species are termed primes.
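The search described above can be illustrated with a minimal Python sketch. This is not the Boise team's software: the function names (`observed_kmers`, `absent_kmers`, `shortest_absent`, `classify`) are hypothetical, the input is assumed to be plain strings held in memory rather than GenBank records, and only the strand as given is scanned.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # the four "letters" of DNA


def observed_kmers(sequences, k):
    """Collect every length-k substring made only of A, C, G, T."""
    valid = set(NUCLEOTIDES)
    seen = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:  # skip windows with ambiguous bases such as N
                seen.add(kmer)
    return seen


def absent_kmers(sequences, k):
    """Enumerate all 4**k possible k-mers and keep those never observed."""
    seen = observed_kmers(sequences, k)
    candidates = ("".join(p) for p in product(NUCLEOTIDES, repeat=k))
    return [kmer for kmer in candidates if kmer not in seen]


def shortest_absent(sequences, max_k):
    """Return (k, missing) for the smallest k that has any absent k-mer."""
    for k in range(1, max_k + 1):
        missing = absent_kmers(sequences, k)
        if missing:
            return k, missing
    return None, []


def classify(sequences_by_species, k):
    """Split absent k-mers into primes and per-species nullomers.

    Assumption for illustration: 'primes' are absent from every species
    supplied; a species' 'nullomers' are absent from it but present in
    at least one of the others.
    """
    absent = {sp: set(absent_kmers(seqs, k))
              for sp, seqs in sequences_by_species.items()}
    primes = set.intersection(*absent.values())
    nullomers = {sp: missing - primes for sp, missing in absent.items()}
    return primes, nullomers


# Toy usage: in "ACGTACGT" every single base occurs, so the shortest
# absent sequences are two letters long.
k, missing = shortest_absent(["ACGTACGT"], max_k=3)
```

Enumerating every candidate is only practical for short lengths, since the number of possibilities grows as 4 to the power of the length; a production tool would also need to stream database records rather than hold them in memory.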
Hampikian's team is deliberately searching for the shortest absent sequences in order to minimise the possibility that sequences are missing simply due to chance. So far they have found 86 sequences, each 11 nucleotides long, that have never been reported in humans.
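A rough calculation shows why short sequences matter. The sketch below rests on assumptions that are not from the researchers: a genome of about 3 billion positions, and a naive model in which every position is an independent, uniformly random draw. Real genomes are far from uniform, so the numbers indicate scale only, and the function name `expected_absent_by_chance` is hypothetical.

```python
import math


def expected_absent_by_chance(k, genome_length):
    """Expected count of k-mers missing from a random sequence.

    Naive model (an assumption): each of genome_length positions starts
    an independent, uniformly random k-mer, so a given k-mer is missed
    at every position with probability (1 - 4**-k) ** genome_length.
    """
    possible = 4 ** k
    p_hit = 1.0 / possible
    p_never_seen = math.exp(genome_length * math.log1p(-p_hit))
    return possible * p_never_seen


HUMAN_GENOME = 3_000_000_000  # approximate size in base pairs (assumption)

for k in (11, 15, 20):
    print(k, 4 ** k, expected_absent_by_chance(k, HUMAN_GENOME))
```

Under this model, essentially no 11-letter sequence should be missing from a genome of that size by chance alone, whereas almost all 20-letter sequences would be missing simply because the genome is too short to contain them.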
They have also identified more than 60,000 primes of 15 nucleotides in length and 746 protein "peptoprime" strings of five amino acids that have never been reported in any species. "These represent the largest possible set of lethal sequences," says Hampikian, who expects the numbers to shrink as more sequence infor
Contact: Claire Bowles