Artificial Intelligence in Medicine
Volume 37, Issue 1, Pages 7-18, May 2006

Learning from imbalanced data in surveillance of nosocomial infection

  • Gilles Cohen

      Affiliations

    • Medical Informatics Service, University Hospital of Geneva, Geneva, Switzerland
    • Corresponding author. Tel.: +41 22 372 7550; fax: +41 22 320 2927.
  • Mélanie Hilario

      Affiliations

    • Artificial Intelligence Laboratory, University of Geneva, Geneva, Switzerland
  • Hugo Sax

      Affiliations

    • Department of Internal Medicine, University Hospital of Geneva, Geneva, Switzerland
  • Stéphane Hugonnet

      Affiliations

    • Department of Internal Medicine, University Hospital of Geneva, Geneva, Switzerland
  • Antoine Geissbuhler

      Affiliations

    • Medical Informatics Service, University Hospital of Geneva, Geneva, Switzerland

Received 27 July 2004, Revised 8 March 2005, Accepted 10 March 2005.

References 

  1. French GG, Cheng AF, Wong SL, Donnan S. Repeated prevalence surveys for monitoring effectiveness of hospital infection control. Lancet. 1983;2:1021–1023
  2. Garner JS, Jarvis WR, Emori TG, Horan TC, Hughes JM. CDC definitions for nosocomial infections. Am J Infect Control. 1988;16(3):128–140
  3. Kelsey MC, Emmerson AM, Enstone JE. The second national prevalence survey of infection in hospitals: methodology. J Hosp Infect. 1995;30(1):7–29
  4. Trick WE, Zagorski BM, Tokars JI. Computer algorithms to detect bloodstream infections. Emerg Infect Dis. 2004;10(9):1612–1620
  5. Brossette SE, Sprague AP, Hardin JM, Waites KB, Jones WT, Moser SA. Association rules and data mining in hospital infection control and public health surveillance. J Am Med Inform Assoc. 1998;5(4):373–391
  6. Moser SA, Jones WT, Brossette SE. Application of data mining to intensive care unit microbiologic data. Emerg Infect Dis. 1999;5(3):454–457
  7. Brossette SE, Sprague AP, Jones WT, Moser SA. A data mining system for infection control surveillance. Meth Inf Med. 2000;39(4):303–310
  8. Ma L, Tsui FC, Hogan WR, Wagner MM, Ma H. A framework for infection control surveillance using association rules. In: AMIA Annual Symposium. Washington: AMIA; 2003;p. 410–414
  9. Lamma E, Manservigi M, Mello P, Riguzzi F, Serra R, Storari S. A system for monitoring nosocomial infections. In: Brause Rüdiger W, Hanisch Ernst editors. ISMDA, LNCS. Frankfurt, Germany: Springer; 2000;p. 282–292
  10. Harbarth S, Ruef Ch, Francioli P, Widmer A, Pittet D, Swiss-Noso Network. Nosocomial infections in Swiss university hospitals: a multi-centre survey and review of the published experience. Schweiz Med Wochenschr. 1999;129:1521–1528
  11. Perner P. Data mining on multimedia data. Berlin, Heidelberg: Springer Verlag; 2002
  12. Japkowicz N. The class imbalance problem: a systematic study. Intell Data Anal J. 2002;6(5):429–449
  13. Chawla N, Bowyer K, Hall L, Kegelmeyer WP. SMOTE: Synthetic Minority Over-sampling TEchnique. J Artif Intell Res (JAIR). 2002;16:321–357
  14. Kubat M, Matwin S. Addressing the curse of imbalanced data sets: one-sided sampling. In: Fisher Douglas H editor. Proceedings of the 14th International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann; 1997;p. 179–186
  15. Domingos P. Metacost: a general method for making classifiers cost-sensitive. In: Chaudhuri S, Madigan D editors. Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM Press; 1999;p. 155–164
  16. Ali K, Manganaris S, Srikant R. Partial classification using association rules. In: Heckerman David, Mannila Heikki, Pregibon Daryl editors. Proceedings of the Third International Conference on Knowledge Discovery in Databases and Data Mining. Menlo Park, California: AAAI Press; 1997;p. 115–118
  17. Vapnik V. Statistical learning theory. New York: John Wiley & Sons; 1998
  18. Cortes C, Vapnik V. Support vector networks. Mach Learn. 1995;20(3):273–297
  19. Fletcher R. Practical methods of optimization. 2nd ed. New York: John Wiley & Sons; 1987
  20. Burges CJC. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery. 1998;2(2):121–167
  21. Cristianini N, Shawe-Taylor J. An introduction to support vector machines. Cambridge, UK: Cambridge University Press; 2000
  22. Karakoulas G, Shawe-Taylor J. Optimizing classifiers for imbalanced training sets. In: Kearns Michael J, Solla Sara A, Cohn David A editors. Advances in neural information processing systems (NIPS-99). Cambridge, MA: The MIT Press; 1999;p. 253–259
  23. Veropoulos K, Cristianini N, Campbell C. Controlling the sensitivity of support vector machines. In: Thomas Dean editor. Proceedings of the International Joint Conference on Artificial Intelligence. San Francisco, CA, USA: Morgan Kaufmann; 1999;p. 55–60
  24. Morik K, Brockhausen P, Joachims T. Combining statistical learning with a knowledge-based approach: a case study in intensive care monitoring. In: Bratko Ivan, Dzeroski Saso editors. Proceedings of the Sixteenth International Conference on Machine Learning (ICML99). San Francisco, CA, USA: Morgan Kaufmann; 1999;p. 268–277
  25. Quinlan JR. C4.5: programs for machine learning. San Mateo, CA: Morgan Kaufmann; 1993
  26. Duda R, Hart P, Stork D. Pattern classification. New York: John Wiley & Sons; 2000
  27. Freund Y, Schapire RE. Experiments with a new boosting algorithm. In: Saitta Lorenza editor. Proceedings of the 13th International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann; 1996;p. 148–156
  28. Centor RM. Signal detectability: the use of ROC curves and their analyses. Med Decis Making. 1991;11:102–106
  29. Provost F, Fawcett T, Kohavi R. The case against accuracy estimation for comparing induction algorithms. In: Shavlik Jude editor. Proceedings of the 15th International Conference on Machine Learning (ICML98). Madison, WI, USA: Morgan Kaufmann; 1998;p. 445–453
  30. Amari S, Wu S. Improving support vector machine classifiers by modifying kernel functions. Neural Networks. 1999;12(6):783–789
  31. Wu S, Amari S. Conformal transformation of kernel functions: a data-dependent way to improve support vector machine classifiers. Neural Process Lett. 2002;15(1):59–67

PII: S0933-3657(05)00085-0

doi: 10.1016/j.artmed.2005.03.002
