CLC number: Q55

Received: 2005-08-12

Revision Accepted: 2005-10-23


Journal of Zhejiang University SCIENCE B 2006 Vol.7 No.1 P.1~6


EHPred: an SVM-based method for epoxide hydrolases recognition and classification

Author(s):  Jia Jia, Yang Liang, Zhang Zi-zhang

Affiliation(s):  James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou 310008, China

Corresponding email(s):   zhangzz@zju.edu.cn

Key Words:  Epoxide hydrolases (EHs), Amino acid composition (AAC), Dipeptide composition (DPC), Pseudo-amino acid composition (PAAC), Support vector machines (SVM)


Jia Jia, Yang Liang, Zhang Zi-zhang. EHPred: an SVM-based method for epoxide hydrolases recognition and classification[J]. Journal of Zhejiang University Science B, 2006, 7(1): 1~6.

%0 Journal Article
%T EHPred: an SVM-based method for epoxide hydrolases recognition and classification
%A Jia Jia
%A Yang Liang
%A Zhang Zi-zhang
%J Journal of Zhejiang University SCIENCE B
%V 7
%N 1
%P 1~6
%@ 1673-1581
%D 2006
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.2006.B0001

A two-layer method based on support vector machines (SVMs) has been developed to distinguish epoxide hydrolases (EHs) from other enzymes and to classify their subfamilies using their primary protein sequences. SVM classifiers were built using three different feature vectors extracted from the primary sequences of EHs: the amino acid composition (AAC), the dipeptide composition (DPC), and the pseudo-amino acid composition (PAAC). Validated by 5-fold cross-validation, the first-layer SVM classifier differentiates EHs from non-EHs with an accuracy of 94.2% and a Matthews correlation coefficient (MCC) of 0.84. Under 2-fold cross-validation, the PAAC-based second-layer SVM further classifies EH subfamilies with an overall accuracy of 90.7% and an MCC of 0.87, compared with 80.0% for AAC and 84.9% for DPC. A program called EHPred has also been developed to help readers recognize EHs and classify their subfamilies from primary protein sequences.
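
To make the feature vectors and the evaluation metric concrete, the following is a minimal sketch rather than the EHPred implementation. It assumes the conventional definitions: AAC as the 20 residue fractions, DPC as the 400 adjacent-pair fractions, and MCC computed from a binary confusion matrix. The function names (aac, dpc, mcc) are hypothetical. PAAC is left out because it depends on physicochemical parameters and weights that the abstract does not state.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids, in a fixed order that defines vector positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    total = len(residues)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values).

    Assumption: pairs containing a non-standard symbol are skipped.
    """
    seq = seq.upper()
    pairs = [
        a + b
        for a, b in zip(seq, seq[1:])
        if a in AMINO_ACIDS and b in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    total = len(pairs)
    return [pairs.count(x + y) / total for x, y in product(AMINO_ACIDS, repeat=2)]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    toy = "MKVLAAGIC"  # made-up sequence, for illustration only
    print(len(aac(toy)), len(dpc(toy)))
    print(mcc(tp=8, tn=9, fp=1, fn=2))
```

In a two-layer setup of the kind described, vectors such as these would be passed to one SVM that separates EHs from non-EHs and then to a second SVM that assigns the subfamily. Kernel choice and parameter settings are not given in the abstract, so they are omitted here.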




