Artificial Intelligence in Medicine
Volume 48, Issue 2, Pages 91-98, February 2010

Clustering of high-dimensional gene expression data with feature filtering methods and diffusion maps

  • Rui Xu

      Affiliations

    • Applied Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409-0249, USA
    • Corresponding author. Tel.: +1 573 341 6811; fax: +1 573 341 4532.
  • Steven Damelin

      Affiliations

    • Department of Mathematical Sciences, Georgia Southern University, Statesboro, GA 30460-8093, USA
  • Boaz Nadler

      Affiliations

    • Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel
  • Donald C. Wunsch II

      Affiliations

    • Applied Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409-0249, USA

Received 14 August 2008, Revised 24 June 2009, Accepted 30 June 2009.

PII: S0933-3657(09)00100-6

doi: 10.1016/j.artmed.2009.06.001
