Their approach uses clinical data to identify a list of genes that correspond to a particular clinical factor, such as survival time, tumor stage, or metastasis, and then applies statistical analysis to look for additional patterns in the data that point to clinically relevant subsets of genes. In many retrospective studies, patient survival time is known even though tumor subtypes are not; Bair and Tibshirani used that survival data to guide their analysis of the microarray data. They calculated the correlation of each gene in the microarray data with patient survival to generate a list of "significant" genes and then used these genes to identify tumor subtypes. Creating a list of candidate genes based on clinical data, the authors explain, reduces the chances of including genes unrelated to survival and so increases the probability of identifying gene clusters with clinical, and thus predictive, significance. Such "indicator gene lists" could identify subgroups of patients with similar gene expression profiles. The lists of subgroups, based on gene expressi
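The two-step idea described above, ranking genes by their association with survival and then grouping patients using only the selected genes, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain Pearson correlation with survival time (ignoring censored patients), an arbitrary cutoff of 0.8, and a simple two-group k-means. The gene names, expression values, survival times, and the helper functions `pearson`, `select_genes`, and `two_means` are all hypothetical and exist only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant gene carries no information
    return cov / (sd_x * sd_y)


def select_genes(expression, survival, threshold):
    """Keep genes whose absolute correlation with survival meets the cutoff.

    Assumption: the cutoff is chosen by hand here; it is not taken from the study.
    """
    return [
        gene
        for gene, values in expression.items()
        if abs(pearson(values, survival)) >= threshold
    ]


def two_means(points, iterations=10):
    """Split points into two groups with a basic k-means (k = 2).

    Deterministic start: the first point and the point farthest from it.
    """
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            for p in points
        ]
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical toy data: six patients, three genes, survival in months.
survival = [10, 12, 11, 40, 42, 45]
expression = {
    "gene_a": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],
    "gene_b": [0.8, 1.0, 0.9, 4.0, 4.1, 4.3],
    "gene_noise": [2.0, 3.0, 2.0, 3.0, 2.0, 3.0],
}

# Step 1: build the list of survival-associated genes.
selected = select_genes(expression, survival, threshold=0.8)

# Step 2: cluster patients using only the selected genes.
patients = [tuple(expression[g][i] for g in selected) for i in range(len(survival))]
labels = two_means(patients)

print("selected genes:", selected)
print("patient groups:", labels)
```

In this toy setup the gene unrelated to survival is filtered out before clustering, so it cannot influence how patients are grouped. That is the point the authors make about building the candidate list from clinical data first.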
Contact: Barbara Cohen
bcohen@plos.org
415-624-1206
Public Library of Science
13-Apr-2004