Artificial Intelligence in Medicine
Volume 50, Issue 1, Pages 43-53, September 2010

Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction

  • Michael C. Lee

      Affiliations

    • Philips Research North America, 345 Scarborough Road, Briarcliff Manor, NY 10510-2099, USA
    • Corresponding author. Tel.: +1 914 945 6047; fax: +1 914 945 6580.
  • Lilla Boroczky

      Affiliations

    • Philips Research North America, 345 Scarborough Road, Briarcliff Manor, NY 10510-2099, USA
  • Kivilcim Sungur-Stasik

      Affiliations

    • College of Physicians and Surgeons, Columbia University, 630 West 168th Street, P&S 8, Room 503, New York, NY 10032, USA
  • Aaron D. Cann

      Affiliations

    • College of Physicians and Surgeons, Columbia University, 630 West 168th Street, P&S 8, Room 503, New York, NY 10032, USA
  • Alain C. Borczuk

      Affiliations

    • College of Physicians and Surgeons, Columbia University, 630 West 168th Street, P&S 8, Room 503, New York, NY 10032, USA
  • Steven M. Kawut

      Affiliations

    • College of Physicians and Surgeons, Columbia University, 630 West 168th Street, P&S 8, Room 503, New York, NY 10032, USA
  • Charles A. Powell

      Affiliations

    • College of Physicians and Surgeons, Columbia University, 630 West 168th Street, P&S 8, Room 503, New York, NY 10032, USA

Received 19 September 2008; received in revised form 4 April 2010; accepted 4 April 2010.

Abstract 

Objective

Accurate classification methods are critical in computer-aided diagnosis (CADx) and other clinical decision support systems. Previous research has reported on methods for combining genetic algorithm (GA) feature selection with ensemble classifier systems in an effort to increase classification accuracy. In this study, we describe a CADx system for pulmonary nodules using a two-step supervised learning system combining a GA with the random subspace method (RSM), with the aim of exploring algorithm design parameters and demonstrating improved classification performance over either the GA or RSM-based ensembles alone.

Methods and materials

We used a retrospective database of 125 pulmonary nodules (63 benign; 62 malignant) with CT volumes and clinical history. A total of 216 features were derived from the segmented image data and clinical history. Ensemble classifiers using RSM or GA-based feature selection were constructed and tested via leave-one-out validation with feature selection and classifier training executed within each iteration. We further tested a two-step approach using a GA ensemble to first assess the relevance of the features, and then using this information to control feature selection during a subsequent RSM step. The base classification was performed using linear discriminant analysis (LDA).
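The second step described above, in which GA-derived feature relevance controls how the RSM draws its feature subsets, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say how relevance is scored, so the selection-frequency measure, the function names, and the `threshold` and `floor` parameters are all hypothetical. Each returned subset would then be used to train one LDA ensemble member.

```python
import random


def relevance_from_ga(ga_subsets, n_features):
    """Assumed relevance score: fraction of GA-selected subsets containing each feature."""
    counts = [0] * n_features
    for subset in ga_subsets:
        for f in set(subset):
            counts[f] += 1
    return [c / len(ga_subsets) for c in counts]


def sample_subspace(weights, s, rng):
    """Draw s distinct feature indices, with probability proportional to weight."""
    pool = [i for i, w in enumerate(weights) if w > 0]
    if s > len(pool):
        raise ValueError("subset size exceeds number of features with nonzero weight")
    chosen = []
    for _ in range(s):
        pick = rng.choices(pool, weights=[weights[i] for i in pool], k=1)[0]
        chosen.append(pick)
        pool.remove(pick)  # sample without replacement
    return sorted(chosen)


def two_step_subspaces(ga_subsets, n_features, s, n_members,
                       threshold=0.0, floor=0.0, seed=0):
    """Return n_members feature subsets of size s for a relevance-guided RSM.

    Features whose relevance is at or below `threshold` receive weight `floor`:
    floor == 0 removes them entirely; a small floor > 0 only makes them rarer.
    """
    rng = random.Random(seed)
    relevance = relevance_from_ga(ga_subsets, n_features)
    weights = [r if r > threshold else floor for r in relevance]
    return [sample_subspace(weights, s, rng) for _ in range(n_members)]


if __name__ == "__main__":
    # Hypothetical GA output: three selected subsets over four features.
    ga_subsets = [[0, 1], [0, 2], [0, 1]]
    print(two_step_subspaces(ga_subsets, n_features=4, s=2, n_members=3, threshold=0.5))
```

Setting `floor` to zero corresponds to removing less relevant features completely, while a small positive `floor` corresponds to reducing but not eliminating their occurrence.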

Results

The RSM classifier alone achieved a maximum leave-one-out Az of 0.866 (95% confidence interval: 0.794–0.919) at a subset size of s=36 features. The GA ensemble yielded an Az of 0.851 (0.775–0.907). The proposed two-step algorithm produced a maximum Az value of 0.889 (0.823–0.936) when the GA ensemble was used to completely remove less relevant features from the second RSM step, with similar results obtained when the GA-LDA results were used to reduce but not eliminate the occurrence of certain features. After accounting for correlations in the data, the leave-one-out Az of the two-step method was significantly higher than that of either the RSM or the GA-LDA ensemble.
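Az denotes the area under the ROC curve. The abstract does not state how Az was estimated, so the following sketch assumes a simple nonparametric (Mann-Whitney) estimate computed from held-out scores, such as those collected across leave-one-out iterations. The function name and the label coding (1 for malignant, 0 for benign) are assumptions made for illustration.

```python
def empirical_auc(scores, labels):
    """Nonparametric area under the ROC curve.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical held-out scores for four nodules (1 = malignant, 0 = benign).
    print(empirical_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A fitted (for example, binormal) ROC model would generally give a somewhat different value than this empirical estimate.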

Conclusions

We have developed a CADx system for the evaluation of pulmonary nodules based on a two-step feature selection and ensemble classifier algorithm. We have shown that by combining classifier ensemble algorithms in this two-step manner, it is possible to predict the malignancy of solitary pulmonary nodules with a performance exceeding that of either of the individual steps.

Keywords: Genetic algorithms, Linear discriminant analysis, Feature selection, Random subspace, Computer-aided diagnosis, Pulmonary nodules


PII: S0933-3657(10)00054-0

doi:10.1016/j.artmed.2010.04.011
