Artificial Intelligence in Medicine
Volume 44, Issue 3, Pages 221-231, November 2008

Using support vector regression to model the correlation between the clinical metastases time and gene expression profile for breast cancer

  • Shih-Hau Chiu

      Affiliations

    • Institute of Molecular Medicine & Department of Life Science, National Tsing Hua University, HsinChu, Taiwan
    • Bioresource Collection and Research Center, Food Industry Research and Development Institute, HsinChu, Taiwan
  • Chien-Chi Chen

      Affiliations

    • Bioresource Collection and Research Center, Food Industry Research and Development Institute, HsinChu, Taiwan
  • Thy-Hou Lin

      Affiliations

    • Institute of Molecular Medicine & Department of Life Science, National Tsing Hua University, HsinChu, Taiwan
    • Corresponding author at: Institute of Molecular Medicine & Department of Life Science, National Tsing Hua University, HsinChu, Taiwan. Tel.: +886 3 574 2759; fax: +886 3 571 5934.

Received 11 December 2007; received in revised form 13 May 2008; accepted 25 June 2008.

Summary 

Objective

Microarray analysis has become an important tool for studying cancer types, biological mechanisms, and diagnostic biomarkers. Several machine-learning methods have been used to construct prognostic models from microarray data sets. However, most previous studies focused on supervised classification to predict the clinical type of patients. In this study, we investigate whether the expression levels of identified significant genes can be used to predict the clinical metastases time of patients.

Materials and methods

We used a regression method to remodel the breast cancer data set published in 2002. Significant genes were ranked and selected using a wrapper method with a 10-fold cross-validation procedure, and the selected genes were used to fit a support vector regression (SVR) model. This approach can model the relationship between the expression values of the significant genes and the clinical metastases time of breast cancer.
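As an illustration only, a wrapper-style gene selection with 10-fold cross-validation around an SVR model might be sketched as follows. This is not the authors' procedure: the abstract does not specify the kernel, the ranking criterion, or the search strategy, so the linear kernel, the greedy forward search, the R² scoring, and the synthetic data (`X`, `y`) are all assumptions, and scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import KFold
from sklearn.svm import SVR

# Synthetic stand-in for an expression matrix (samples x genes).
# Only columns 0, 1 and 2 carry signal; the rest are noise (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(scale=0.1, size=80)

# Wrapper method: candidate gene subsets are scored by the cross-validated
# performance of the regression model itself, not by a model-free statistic.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
selector = SequentialFeatureSelector(
    SVR(kernel="linear"),      # kernel choice is an assumption
    n_features_to_select=3,    # hypothetical target size for this toy example
    direction="forward",       # greedy forward search (assumption)
    scoring="r2",
    cv=cv,
)
selector.fit(X, y)

# Indices of the selected columns ("genes").
selected = selector.get_support(indices=True).tolist()

# Fit the final SVR model on the selected columns only.
model = SVR(kernel="linear").fit(X[:, selected], y)
print("selected columns:", selected)
```

Forward selection is used here because it is cheap for small subsets. A real microarray data set has thousands of genes, so some pre-ranking or filtering step would probably be needed before a search of this kind becomes practical.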

Results

Forty-four significant genes were selected for building the regression model, and the corresponding cross-validated correlation coefficient is 0.82, which is considerably higher than values reported previously by others using different data sets. Moreover, two breast cancer related genes, chemokine (C-X-C motif) ligand 14 (CXCL14) and the estrogen receptor gene (ER), are in the selected gene set, and one of them has never been included in the other data sets.
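For readers unfamiliar with the metric, one common way to obtain a cross-validated correlation coefficient is to predict each sample from a model trained without it and then correlate those held-out predictions with the observed values. The sketch below assumes the Pearson correlation, a linear-kernel SVR, and synthetic data; the abstract does not state which definition the authors used, and the helper name `cross_validated_r` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.svm import SVR


def cross_validated_r(X, y, n_splits=10, seed=0):
    """Pearson r between observed targets and out-of-fold SVR predictions."""
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Each prediction comes from a model that never saw that sample.
    y_pred = cross_val_predict(SVR(kernel="linear"), X, y, cv=cv)
    return float(np.corrcoef(y, y_pred)[0, 1])


# Synthetic demonstration data (not the breast cancer data set).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(100, 5))
weights = np.array([2.0, -1.0, 0.5, 0.0, 0.0])
y_demo = X_demo @ weights + rng.normal(scale=0.5, size=100)

r_cv = cross_validated_r(X_demo, y_demo)
print(f"cross-validated r = {r_cv:.2f}")
```

Because every prediction is made on held-out data, this value is usually lower than the correlation measured on the training set, which makes it a more honest estimate of predictive performance.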

Conclusion

In this report, we have shown that the expression levels of the identified significant genes can correlate strongly with the clinical metastases time of breast cancer patients. The 44 selected genes may be used as a benchmark to evaluate the risk of recurrence of breast cancer.

Keywords: Breast cancer, Support vector regression, Feature selection, Metastases time, Microarray

PII: S0933-3657(08)00085-7

doi:10.1016/j.artmed.2008.06.005
