Artificial Intelligence in Medicine
Volume 41, Issue 3, Pages 177-196, November 2007

Evaluation of rule interestingness measures in medical knowledge discovery in databases

  • Miho Ohsaki

      Affiliations

    • Faculty of Engineering, Doshisha University, 1-3 Tataramiyakodani, Kyotanabe-shi, Kyoto 610-0321, Japan
    • Corresponding author. Tel.: +81 774 65 6468; fax: +81 774 65 6468.
  • Hidenao Abe

      Affiliations

    • Department of Medical Informatics, Shimane University, 89-1 Enya-cho, Izumo-shi, Shimane 693-8501, Japan
  • Shusaku Tsumoto

      Affiliations

    • Department of Medical Informatics, Shimane University, 89-1 Enya-cho, Izumo-shi, Shimane 693-8501, Japan
  • Hideto Yokoi

      Affiliations

    • Department of Medical Informatics, Kagawa University Hospital, 1750-1 Ikenobe, Miki-cho, Kita-gun, Kagawa 761-0793, Japan
  • Takahira Yamaguchi

      Affiliations

    • Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama-shi, Kanagawa 223-8522, Japan

Received 18 October 2006; revised 24 July 2007; accepted 24 July 2007.

PII: S0933-3657(07)00092-9

doi: 10.1016/j.artmed.2007.07.005
