CryoProtect: a web server for classifying antifreeze proteins from non-antifreeze proteins

Abstract

Antifreeze protein (AFP) is an ice-binding protein that protects organisms from freezing in extremely cold environments. AFPs are found across a diverse range of species and, therefore, significantly differ in their structures. As there are no consensus sequences available for determining the ice-binding domain of AFPs, thus the prediction and characterization of AFPs from their sequence is a challenging task. This study addresses this issue by predicting AFPs directly from sequence on a large set of 478 AFPs and 9,139 non-AFPs using machine learning (e.g., random forest) as a function of interpretable features (e.g., amino acid composition, dipeptide composition, and physicochemical properties). Furthermore, AFPs were characterized using propensity scores and important physicochemical properties via statistical and principal component analysis. The predictive model afforded high performance with an accuracy of 88.28% and results revealed that AFPs are likely to be composed of hydrophobic amino acids as well as amino acids with hydroxyl and sulfhydryl side chains. The predictive model is provided as a free publicly available web server called CryoProtect for classifying query protein sequence as being either AFP or non-AFP. The data set and source code are for reproducing the results which are provided on GitHub.

Publication
Journal of Chemistry
Date
Citation
Pratiwi R, Malik AA, Schaduangrat N, Prachayasittikul V, Wikberg JES, Nantasenamat C, Shoombuatong W. CryoProtect: a web server for classifying antifreeze proteins from non-antifreeze proteins. Journal of Chemistry 2017 (2017) 9861752.