Artificial Intelligence in Medicine
Volume 48, Issue 1, Pages 29-41, January 2010

Secure construction of k-unlinkable patient records from distributed providers

  • Bradley Malin

    • Corresponding author. Tel.: +1 615 343 9096; fax: +1 615 322 0502.

Received 25 November 2008, Revised 8 June 2009, Accepted 12 September 2009.

PII: S0933-3657(09)00135-3

doi: 10.1016/j.artmed.2009.09.002
