Volume 45, Issue 2, Pages 125-134, February 2009
Modeling adaptive kernels from probabilistic phylogenetic trees
Summary
Objective
Modeling phylogenetic interactions is an open issue in many computational biology problems. In the context of gene function prediction, we introduce a class of kernels for structured data that leverage a hierarchical probabilistic model of phylogeny among species.
Methods and materials
We derive three kernels belonging to this setting: a sufficient statistics kernel, a Fisher kernel, and a probability product kernel. The new kernels are used in the context of support vector machine learning. The kernels' adaptivity is obtained by estimating the parameters of a tree-structured model of evolution, using as observed data phylogenetic profiles that encode the presence or absence of specific genes in a set of fully sequenced genomes.
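As a rough illustration of what "adaptive" means here, the sketch below builds a Fisher kernel over binary phylogenetic profiles. It is a deliberate simplification and not the paper's method: it assumes each genome is an independent Bernoulli variable, whereas the paper uses a tree-structured model of evolution. The function names, the pseudocount smoothing, and the toy profiles are all assumptions made for this example.

```python
# Simplified sketch: Fisher kernel under an independent-Bernoulli model.
# Assumption: genomes are treated as independent; the paper instead
# estimates the parameters of a tree-structured model of evolution.

def estimate_presence_rates(profiles, pseudocount=1.0):
    """Estimate, per genome, the smoothed probability that a gene is present."""
    n_genes = len(profiles)
    n_genomes = len(profiles[0])
    return [
        (sum(p[j] for p in profiles) + pseudocount) / (n_genes + 2.0 * pseudocount)
        for j in range(n_genomes)
    ]

def fisher_kernel(x, y, theta):
    """Fisher kernel between two binary profiles x and y.

    For a Bernoulli parameter t, the score is (x_j - t) / (t * (1 - t)) and
    the Fisher information is 1 / (t * (1 - t)), so the kernel reduces to a
    sum of (x_j - t) * (y_j - t) / (t * (1 - t)) over genomes.
    """
    return sum(
        (xj - t) * (yj - t) / (t * (1.0 - t))
        for xj, yj, t in zip(x, y, theta)
    )

# Toy phylogenetic profiles: rows are genes, columns are genomes,
# 1 = gene present in that genome, 0 = absent (hypothetical data).
profiles = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]

theta = estimate_presence_rates(profiles)
gram = [[fisher_kernel(a, b, theta) for b in profiles] for a in profiles]
```

The kernel depends on the data through `theta`: re-estimating the parameters on a different set of profiles changes the similarity between the same two genes. A Gram matrix such as `gram` could then be passed to any support vector machine implementation that accepts precomputed kernels.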
Results
We report results on predicting the functional class of the proteins of the budding yeast Saccharomyces cerevisiae, which compare favorably with a standard vector-based kernel and a non-adaptive tree kernel function. A further comparative analysis assesses the impact of the different components of the proposed approach.
Conclusions
We show that the key features of the proposed kernels are their adaptivity to the input domain and their ability to deal with structured data interpreted through a graphical model representation.
Keywords: Kernels for structures, Phylogenetic trees, Fisher kernel, Probability product kernel, Gene function prediction, Bayesian networks
PII: S0933-3657(08)00124-3
doi:10.1016/j.artmed.2008.08.007
© 2008 Elsevier B.V. All rights reserved.
