Artificial Intelligence in Medicine
Volume 48, Issue 1 , Pages 43-50, January 2010

Coding of amino acids by texture descriptors

Department of Electronic, Informatics and Systems (DEIS), Università di Bologna, Via Venezia 52, 47023 Cesena, Italy

Received 12 December 2008; received in revised form 24 September 2009; accepted 3 October 2009.

Abstract 

Objective

In this paper we propose a new feature extractor for peptide/protein classification based on the calculation of texture descriptors. Representing a peptide/protein with a matrix descriptor, instead of a vector, allows us to treat the peptide/protein as an image and to use texture descriptors to represent it.

Methods and materials

A matrix descriptor, which is a square matrix whose dimension equals the length of the peptide/protein, is obtained by considering a partial ordering of the amino acids of the peptide/protein according to their values of a given physicochemical property. Each matrix descriptor is treated as a texture image, and several texture descriptors are used to obtain a compact representation that is scale invariant (i.e. independent of the length of the peptide/protein). The texture descriptors tested in this work are local binary patterns (LBP), the discrete cosine transform (DCT) and Daubechies wavelets.
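The idea above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact rule that defines the matrix entries, so the sketch assumes that entry (i, j) is 1 when residue i has a property value greater than or equal to that of residue j, and 0 otherwise. The Kyte-Doolittle hydropathy index is used only as an example property, and the function names `matrix_descriptor` and `dct2_lowfreq` are hypothetical. Keeping only the top-left k x k block of DCT coefficients yields a feature vector of fixed length k*k for any sequence of length at least k; any further normalization needed for true scale invariance is left out.

```python
import math

# Kyte-Doolittle hydropathy index, used here only as an example property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def matrix_descriptor(sequence, prop=HYDROPATHY):
    """Build an n x n matrix from a sequence of n residues.

    Assumed rule (not taken from the paper): M[i][j] is 1.0 when the
    property value of residue i is >= that of residue j, else 0.0.
    """
    values = [prop[aa] for aa in sequence]
    return [[1.0 if vi >= vj else 0.0 for vj in values] for vi in values]


def dct2_lowfreq(matrix, k):
    """Return the k x k lowest-frequency coefficients of the orthonormal
    2-D DCT-II of a square matrix, flattened row by row."""
    n = len(matrix)
    if k > n:
        raise ValueError("k must not exceed the sequence length")

    def alpha(u):
        # Orthonormal scaling factor of the DCT-II basis.
        return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

    features = []
    for u in range(k):
        for v in range(k):
            total = 0.0
            for i in range(n):
                ci = math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                for j in range(n):
                    cj = math.cos(math.pi * (2 * j + 1) * v / (2 * n))
                    total += matrix[i][j] * ci * cj
            features.append(alpha(u) * alpha(v) * total)
    return features


if __name__ == "__main__":
    # Two peptides of different length map to vectors of the same size.
    short = dct2_lowfreq(matrix_descriptor("ACDEFGHIK"), 3)
    longer = dct2_lowfreq(matrix_descriptor("ACDEFGHIKLMNPQRSTVWY"), 3)
    print(len(short), len(longer))
```

The resulting vectors could then be passed to any standard classifier, such as the support vector machine mentioned in the abstract. The direct quadruple loop is chosen for clarity on short sequences; a library DCT routine would be preferable for long proteins.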

Results and conclusion

The experimental section reports several tests, aimed at supporting our ideas, performed on the following datasets: a vaccine dataset for predicting peptides that bind human leukocyte antigens, a human immunodeficiency virus (HIV-1) protease cleavage site prediction dataset, and a membrane protein type dataset.

The experimental results confirm the usefulness of the novel descriptors: the performance obtained by our system on the three difficult datasets is quite high, indicating that the proposed method is a feasible way of extracting information from peptides and proteins. The performance obtained by each of the three texture descriptors, calculated from the matrix-based representation and coupled to a support vector machine classifier, is lower than that obtained by other vector-based descriptors based on physicochemical properties proposed in the literature. Nevertheless, the new descriptors carry different information, and our tests show that the texture descriptors and the vector-based descriptors can be combined to improve the overall performance of the system.

In particular, the proposed approach improves on the state-of-the-art results in two of the three tested problems (the HIV-1 protease cleavage site prediction dataset and the membrane protein type dataset).

Keywords: Protein classification, Peptide classification, Vaccine development, Local binary patterns, Discrete cosine transform, Support vector machine

PII: S0933-3657(09)00137-7

doi:10.1016/j.artmed.2009.10.001
