Artificial Intelligence in Medicine
Volume 54, Issue 2, Pages 103-114, February 2012

Improved modeling of clinical data with kernel methods

  • Anneleen Daemen

      Affiliations

    • Department of Electrical Engineering, Katholieke Universiteit Leuven, 3001 Leuven, Belgium
    • Corresponding author at: Department of Cancer & DNA Damage Responses, Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, 94720 Berkeley, CA, USA. Tel.: +1 510 486 5202.
  • Dirk Timmerman

      Affiliations

    • Department of Obstetrics and Gynecology, University Hospitals Leuven, Katholieke Universiteit Leuven, 3000 Leuven, Belgium
  • Thierry Van den Bosch

      Affiliations

    • Department of Obstetrics and Gynecology, University Hospitals Leuven, Katholieke Universiteit Leuven, 3000 Leuven, Belgium
  • Cecilia Bottomley

      Affiliations

    • Department of Obstetrics and Gynaecology, St. George's Hospital, St. George's University of London, London SW17 0RE, UK
  • Emma Kirk

      Affiliations

    • Early Pregnancy and Gynecological Unit, St. George's Hospital, St. George's University of London, London SW17 0RE, UK
  • Caroline Van Holsbeke

      Affiliations

    • Department of Obstetrics and Gynecology, University Hospitals Leuven, Katholieke Universiteit Leuven, 3000 Leuven, Belgium
    • Hospital Oost-Limburg, 3600 Genk, Belgium
  • Lil Valentin

      Affiliations

    • Malmö University Hospital, Lund University, SE 20502 Malmö, Sweden
  • Tom Bourne

      Affiliations

    • Department of Obstetrics and Gynecology, University Hospitals Leuven, Katholieke Universiteit Leuven, 3000 Leuven, Belgium
    • Hammersmith Hospital, Imperial College London, London W12 0NN, UK
  • Bart De Moor

      Affiliations

    • Department of Electrical Engineering, Katholieke Universiteit Leuven, 3001 Leuven, Belgium

Received 27 November 2009; received in revised form 22 October 2011; accepted 7 November 2011.

Abstract 

Objective

Despite the rise of high-throughput technologies, clinical data such as age, gender and medical history guide clinical management for most diseases and examinations. To improve clinical management, available patient information should be fully exploited. This requires appropriate modeling of relevant parameters.

Methods

When kernel methods are used, traditional kernel functions such as the linear kernel are often applied to the set of clinical parameters. These kernel functions, however, are poorly suited to the specific characteristics of clinical data, which mix variable types, each variable having its own range. We propose a new kernel function specifically adapted to these characteristics.
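
The abstract does not state the kernel's formula. As a minimal sketch of one kernel that fits the description (an assumption, not a statement of the authors' exact definition), each continuous or ordinal variable could contribute a similarity of (r - |a - b|) / r, each nominal variable 1 for a match and 0 otherwise, and the per-variable similarities could be averaged. The function name, argument names and the two-variable example are hypothetical.

```python
from typing import Sequence


def clinical_kernel(
    a: Sequence[float],
    b: Sequence[float],
    ranges: Sequence[float],
    nominal: Sequence[bool],
) -> float:
    """Average per-variable similarity between two patients.

    Assumed form (not taken from the abstract):
      - continuous/ordinal variable: (r - |a - b|) / r, with r its range
      - nominal variable: 1.0 if the values match, else 0.0
    Each variable contributes a value in [0, 1], so no single variable
    dominates because of its scale.
    """
    total = 0.0
    for x, y, r, is_nominal in zip(a, b, ranges, nominal):
        if is_nominal:
            total += 1.0 if x == y else 0.0
        else:
            total += (r - abs(x - y)) / r
    return total / len(a)


def linear_kernel(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product, shown for comparison."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical patients: (age in years, binary nominal flag).
# Age is assumed to span a range of 80 years in the training data.
patient_a = (30.0, 1.0)
patient_b = (70.0, 1.0)
ranges = (80.0, 1.0)       # the range entry is unused for nominal variables
nominal = (False, True)

print(clinical_kernel(patient_a, patient_b, ranges, nominal))  # 0.75
print(linear_kernel(patient_a, patient_b))                     # 2101.0
```

In this toy case the linear kernel value is driven almost entirely by age, whereas the range-scaled similarity gives both variables equal influence.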

Results

The clinical kernel function provides a better representation of patients’ similarity by equalizing the influence of all variables and taking into account the range r of each variable. Moreover, it is robust with respect to changes in r. Incorporated in a least squares support vector machine, the new kernel function results in significantly improved diagnosis, prognosis and prediction of therapy response. This is illustrated on four clinical data sets within gynecology, with an average increase in test area under the ROC curve (AUC) of 0.023, 0.021, 0.122 and 0.019, respectively. Moreover, when clinical parameters and expression data were combined in three case studies on breast cancer, results improved overall both with use of the new kernel function and when the two data types were combined in a weighted fashion, with a larger weight assigned to the clinical parameters. The increase in AUC with respect to a standard kernel function and/or unweighted data combination was at most 0.127, 0.042 and 0.118 for the three case studies, respectively.
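
The abstract does not specify how the two data types are weighted. A common way to combine data sources in kernel methods is a convex combination of their kernel matrices, and the sketch below assumes that form. The function name, the toy matrices and the weight of 0.75 are hypothetical; the source says only that the clinical parameters received the larger weight.

```python
from typing import List

Matrix = List[List[float]]


def combine_kernels(
    k_clinical: Matrix, k_expression: Matrix, w_clinical: float
) -> Matrix:
    """Convex combination of two kernel matrices over the same patients.

    Assumed form: K = w * K_clinical + (1 - w) * K_expression, 0 <= w <= 1.
    A convex combination of positive semidefinite matrices is itself
    positive semidefinite, so the result remains a valid kernel matrix.
    """
    if not 0.0 <= w_clinical <= 1.0:
        raise ValueError("w_clinical must lie in [0, 1]")
    w_expression = 1.0 - w_clinical
    return [
        [w_clinical * c + w_expression * e for c, e in zip(row_c, row_e)]
        for row_c, row_e in zip(k_clinical, k_expression)
    ]


# Toy 2x2 kernel matrices for two hypothetical patients.
k_clin = [[1.0, 0.5], [0.5, 1.0]]
k_expr = [[1.0, 0.1], [0.1, 1.0]]

# A weight above 0.5 favours the clinical kernel; 0.75 is illustrative only.
k_combined = combine_kernels(k_clin, k_expr, w_clinical=0.75)
print(k_combined)
```

Setting `w_clinical=0.5` corresponds to an unweighted combination (up to a constant factor), which gives a baseline against which a weighted setting can be compared.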

Conclusion

For clinical data consisting of variables of different types, the proposed kernel function, which takes into account the type and range of each variable, has been shown to be a better alternative for linear and non-linear classification problems.

Keywords: Machine learning, Support vector machine, Kernel function, Biostatistics, Clinical data representation, Clinical decision support system, Gynecology, Breast cancer

PII: S0933-3657(11)00144-8

doi:10.1016/j.artmed.2011.11.001
