Artificial Intelligence in Medicine
Volume 45, Issue 1, Pages 77-89, January 2009

Efficient discovery of risk patterns in medical data

  • Jiuyong Li

      Affiliations

    • School of Computer and Information Science, University of South Australia, Mawson Lakes, Adelaide 5095, South Australia, Australia
    • Corresponding author. Tel.: +61 8 8302 3898; Fax: +61 8 8302 3381.
  • Ada Wai-chee Fu

      Affiliations

    • Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
  • Paul Fahey

      Affiliations

    • Department of Mathematics and Computing, University of Southern Queensland, Toowoomba 4350, Queensland, Australia

Received 8 November 2007; received in revised form 30 June 2008; accepted 4 July 2008.

Summary 

Objective

This paper studies the problem of efficiently discovering risk patterns in medical data. Risk patterns are defined by a statistical metric, relative risk, which is widely used in epidemiological research.
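As a minimal illustration of the metric, the sketch below computes relative risk using its standard epidemiological definition: the risk of the outcome among records that match a pattern, divided by the risk among records that do not. The function name, its parameters, and the counts in the usage line are hypothetical and do not come from the paper.

```python
def relative_risk(pattern_cases, pattern_total, other_cases, other_total):
    """Relative risk of an outcome for records matching a pattern.

    pattern_cases / pattern_total: outcome count and size of the matching group.
    other_cases / other_total: outcome count and size of the non-matching group.
    All names here are illustrative assumptions, not the paper's notation.
    """
    if pattern_total <= 0 or other_total <= 0:
        raise ValueError("both groups must be non-empty")
    risk_pattern = pattern_cases / pattern_total
    risk_other = other_cases / other_total
    if risk_other == 0:
        # No outcomes in the comparison group: the ratio is unbounded.
        return float("inf")
    return risk_pattern / risk_other


# Hypothetical counts: 50 of 100 matching records have the outcome,
# against 25 of 100 non-matching records.
rr = relative_risk(50, 100, 25, 100)
```

A value above 1 indicates that records matching the pattern have the outcome more often than the rest of the data set.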

Methods

To avoid fruitless search in a complete exploration of risk patterns, we define the optimal risk pattern set, which excludes superfluous patterns, i.e. complicated patterns with lower relative risk than their corresponding simpler patterns. We prove that mining optimal risk pattern sets conforms to an anti-monotone property, and we propose an efficient mining algorithm based on this property. We also propose a hierarchical structure for presenting discovered patterns for easy perusal by domain experts.
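The notion of a superfluous pattern can be sketched as a simple post-hoc filter. This is not the paper's mining algorithm, which prunes the search itself using the anti-monotone property; it only illustrates the definition, assuming each pattern is a set of (attribute, value) items with a precomputed relative risk. The function name, the data layout, the example values, and the choice to keep ties are all assumptions made for illustration.

```python
def optimal_risk_patterns(patterns):
    """Drop patterns that are superfluous in the sense described above.

    patterns: dict mapping a frozenset of (attribute, value) items to its
    relative risk. A pattern is treated as superfluous when some simpler
    pattern (a proper subset of its items) has a higher relative risk.
    Ties are kept here; that choice is an assumption.
    """
    kept = {}
    for pattern, rr in patterns.items():
        superfluous = any(
            simpler < pattern and simpler_rr > rr  # "<" is proper subset
            for simpler, simpler_rr in patterns.items()
        )
        if not superfluous:
            kept[pattern] = rr
    return kept


# Hypothetical candidate patterns and relative risks.
candidates = {
    frozenset({("smoker", "yes")}): 2.5,
    frozenset({("smoker", "yes"), ("gender", "male")}): 2.1,
    frozenset({("smoker", "yes"), ("age", "60+")}): 3.4,
}
optimal = optimal_risk_patterns(candidates)
```

In this made-up example the pattern combining smoker and gender is dropped, because the simpler smoker pattern alone already has a higher relative risk, while the pattern combining smoker and age is kept because the extra condition raises the relative risk.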

Results

The proposed approach is compared with two well-known rule discovery methods, decision tree induction and association rule mining, on benchmark data sets, and is applied to a real-world problem. It discovers more and better-quality risk patterns than the decision tree approach, which is not designed for such applications and is inadequate for pattern exploration. Unlike the association rule mining approach, it does not produce a large number of uninteresting superfluous patterns, and it is also more efficient. A real-world case study shows that the method reveals risk patterns that are of interest to medical practitioners.

Conclusion

The proposed method is an efficient approach to exploring risk patterns. It quickly identifies cohorts of patients who are vulnerable to a risk outcome in a large data set. The method is useful for exploratory studies of large medical data sets to generate and refine hypotheses, and for designing medical surveillance systems.

Keywords: Relative risk, Risk pattern, Data mining, Association rule, Decision tree, Epidemiology

PII: S0933-3657(08)00090-0

doi:10.1016/j.artmed.2008.07.008
