In recent years, researchers have made major strides in using DNA sequence data to identify genes, which are traditionally defined as the parts of the genome that code for proteins. The protein-coding component of these genes makes up just a small fraction of the human genome: 1.5 percent to 2 percent. Evidence exists that other parts of the genome also have important functions.
However, until now, most studies have concentrated on functional elements associated with specific genes and have not provided insights about functional elements throughout the genome. The ENCODE project represents the first systematic effort to determine where all types of functional elements are located and how they are organized.
In the pilot phase, ENCODE researchers devised and tested high-throughput approaches for identifying functional elements in the genome. Those elements included genes that code for proteins; genes that do not code for proteins; regulatory elements that control the transcription of genes; and elements that maintain the structure of chromosomes and mediate the dynamics of their replication.
The collaborative study focused on 44 targets, which together cover about 1 percent of the human genome sequence, or about 30 million DNA base pairs. The targets were strategically selected to provide a representative cross section of the entire human genome. All told, the ENCODE consortium generated more than 200 datasets and analyzed more than 600 million data points.
Our results reveal important principles about the organization of functional elements in the human genome, providing new perspectives on everything from DNA transc
Contact: Geoff Spencer
NIH/National Human Genome Research Institute