CLC number: Q756;TP181

Received: 2002-09-07

Revision Accepted: 2002-12-08

Journal of Zhejiang University SCIENCE A 2003 Vol.4 No.5 P.573~577

http://doi.org/10.1631/jzus.2003.0573


Splicing-site recognition of rice (Oryza sativa L.) DNA sequences by support vector machines


Author(s):  PENG Si-hua, FAN Long-jiang, PENG Xiao-ning, ZHUANG Shu-lin, DU Wei, CHEN Liang-biao

Affiliation(s):  Department of Control Science and Engineering, College of Information Science and Engineering, Zhejiang University, Hangzhou 310027, China

Corresponding email(s):   pengsihua@zju.edu.cn, liangbiao@zju.edu.cn

Key Words:  Support vector machines, Machine learning, Intron, Splicing site, Oryza sativa


PENG Si-hua, FAN Long-jiang, PENG Xiao-ning, ZHUANG Shu-lin, DU Wei, CHEN Liang-biao. Splicing-site recognition of rice (Oryza sativa L.) DNA sequences by support vector machines[J]. Journal of Zhejiang University Science A, 2003, 4(5): 573~577.

@article{peng2003splicing,
title="Splicing-site recognition of rice (Oryza sativa L.) DNA sequences by support vector machines",
author="PENG Si-hua and FAN Long-jiang and PENG Xiao-ning and ZHUANG Shu-lin and DU Wei and CHEN Liang-biao",
journal="Journal of Zhejiang University Science A",
volume="4",
number="5",
pages="573--577",
year="2003",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.2003.0573"
}


Abstract: 
Motivation: Recognizing splicing sites in rice (Oryza sativa L.) DNA sequences with high accuracy has proved especially difficult. We describe a new method for splicing-site recognition in rice DNA sequences.
Method: Because introns in eukaryotic organisms conform to the GT-AG rule (an intron begins with GT at its 5' end and ends with AG at its 3' end), we used support vector machines (SVM) to predict the splicing sites. We trained a model by machine learning and evaluated it on a test data set of true and pseudo splicing sites.
Results: The prediction accuracy was 87.53% at the true 5' end splicing sites and 87.37% at the true 3' end splicing sites. These results suggest that the SVM approach can achieve higher accuracy than previous approaches.
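
The GT-AG rule mentioned in the Method can be illustrated with a short sketch. Every GT dinucleotide is a candidate 5' site and every AG a candidate 3' site. Most candidates are pseudo sites, which is why a classifier such as an SVM is needed to separate them from the true ones. The Python below only collects candidate windows and turns them into numeric vectors. The function names, the flank width and the one-hot encoding are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

BASES = "ACGT"  # fixed base order for the one-hot encoding (an assumption)


def candidate_sites(seq: str, motif: str, flank: int) -> List[Tuple[int, str]]:
    """Return (position, window) for each occurrence of `motif` in `seq`.

    The window holds `flank` bases on each side of the motif; occurrences
    too close to either end of the sequence are skipped.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        lo, hi = start - flank, start + len(motif) + flank
        if lo >= 0 and hi <= len(seq):
            hits.append((start, seq[lo:hi]))
        start = seq.find(motif, start + 1)
    return hits


def one_hot(window: str) -> List[int]:
    """Encode each base as four 0/1 values; unknown bases become all zeros."""
    vec: List[int] = []
    for base in window:
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


# Toy sequence, invented for illustration only.
seq = "ACGGTAAGTTTCAGGC"
donors = candidate_sites(seq, "GT", flank=2)     # candidate 5' sites
acceptors = candidate_sites(seq, "AG", flank=2)  # candidate 3' sites
features = [one_hot(window) for _, window in donors]
```

Vectors like `features`, each labelled true or pseudo, are the kind of input an SVM library could be trained on. The choice of window size and kernel would need to follow the full paper.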
