Eser Aygün | Computer / MS Thesis

My MS thesis investigated protein function prediction. As part of this work, I measured how much secondary structure information contributes to protein function prediction performance, and I applied the Oommen-Kashyap syntactic transition probability calculation to the peptide classification problem. As far as I know, this is the first published work that combines the Oommen-Kashyap method with biological sequence analysis.

Abstract:

Improvement of Protein Function Prediction Using Structural Information and Peptide Classification Using Syntactic Transition Probabilities

Biological sequence analysis deals with nucleotide and amino acid sequences, aiming to expose their evolutionary, structural and functional properties. This study has five aims: to review well-known pairwise alignment methods, to introduce the syntactic transition probability of Oommen and Kashyap as a biological sequence similarity metric, to demonstrate how structural information improves protein function prediction, to compare the syntactic transition probability of Oommen and Kashyap with standard sequence similarity metrics on two peptide classification problems, and to implement the necessary sequence analysis tools as computer software. In the first part of the experiments, the results clearly indicate that using secondary structure sequences along with amino acid sequence alignments improves molecular function prediction performance, while using predicted secondary structures does not. In the second part, syntactic transition probabilities are compared with standard global alignment scores as features fed into a machine learning classifier. The classification performance measurements show that syntactic transition probabilities are much better features than global alignment scores for peptides.
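To illustrate what "global alignment scores as features" can look like in practice, here is a minimal Python sketch. It is not the thesis software: it assumes a plain Needleman-Wunsch score with a linear gap penalty and illustrative match/mismatch/gap values, whereas real experiments would typically use an amino acid substitution matrix. The function names `global_alignment_score` and `alignment_features` are hypothetical and chosen only for this example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not the thesis settings.
    """
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def alignment_features(query, references):
    """Describe a sequence by its alignment scores against reference sequences."""
    return [global_alignment_score(query, ref) for ref in references]
```

For example, `alignment_features("ACD", ["ACD", "AD"])` yields `[3, 1]`. A vector like this, computed against a fixed set of reference sequences, is the kind of input a classifier could be trained on.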

Thesis (in Turkish): Aygun2009.pdf

Related Publications

Aygün, E.; Oommen, B.J. & Cataltepe, Z. Peptide Classification Using Optimal and Information Theoretic Syntactic Modelling. Pattern Recognition, 2010, 43, 3891
Aygün, E.; Oommen, B.J. & Cataltepe, Z. On Utilizing Optimal and Information Theoretic Syntactic Modelling for Peptide Classification. Pattern Recognition in Bioinformatics, 2009 (Presentation)
Aygün, E.; Komurlu, C.; Aydin, Z. & Cataltepe, Z. Protein Function Prediction with Amino Acid Sequence and Secondary Structure Alignment Scores. International Symposium on Health Informatics and Bioinformatics, 2008
Filiz, A.; Aygün, E.; Keskin, O. & Cataltepe, Z. Importance of Secondary Structure Elements for Prediction of GO Annotations. International Symposium on Health Informatics and Bioinformatics, 2008
Aygün, E. & Cataltepe, Z. Gene Ontology (GO) Molecular Function Prediction Based on Alignment Scores. International Symposium on Health Informatics and Bioinformatics, 2007
Cataltepe, Z.; Ayan, U. & Aygün, E. Protein Function Prediction Using Motifs, Sequence Features, Alignment Scores. Research in Computational Molecular Biology, 2007
Cataltepe, Z.; Aygün, E.; Filiz, A.; Keskin, O.; Komurlu, C. & Altunbasak, Y. Dimensionality Reduction for Protein Function Prediction. Automated Function Prediction – Biosapiens Joint Special Interest Group Meeting at ISMB/ECCB, 2007
