CLC number: TP391.1; R394.3

On-line Access: 2011-04-11

Received: 2010-04-11

Revision Accepted: 2010-07-05

Crosschecked: 2011-01-31

Journal of Zhejiang University SCIENCE C 2011 Vol.12 No.4 P.263-272


Structural visualization of sequential DNA data

Author(s):  Xiao-hong Mao, Jing-hua Fu, Wei Chen, Qian You, Shiao-fen Fang, Qun-sheng Peng

Affiliation(s):  The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310013, China; State Key Lab of CAD & CG, Zhejiang University, Hangzhou 310058, China; Department of Computer and Information Science, Indiana University-Purdue University Indianapolis (IUPUI), Indianapolis, IN 46202, USA

Corresponding email(s):   chenwei@cad.zju.edu.cn

Key Words:  Genome sequence, Sequential visualization, Bio-information visualization

Xiao-hong Mao, Jing-hua Fu, Wei Chen, Qian You, Shiao-fen Fang, Qun-sheng Peng. Structural visualization of sequential DNA data[J]. Journal of Zhejiang University Science C, 2011, 12(4): 263-272.

@article{title="Structural visualization of sequential DNA data",
author="Xiao-hong Mao and Jing-hua Fu and Wei Chen and Qian You and Shiao-fen Fang and Qun-sheng Peng",
journal="Journal of Zhejiang University Science C",
volume="12",
number="4",
pages="263-272",
year="2011",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.C1000091"
}

%0 Journal Article
%T Structural visualization of sequential DNA data
%A Xiao-hong Mao
%A Jing-hua Fu
%A Wei Chen
%A Qian You
%A Shiao-fen Fang
%A Qun-sheng Peng
%J Journal of Zhejiang University SCIENCE C
%V 12
%N 4
%P 263-272
%@ 1869-1951
%D 2011
%I Zhejiang University Press & Springer
%R 10.1631/jzus.C1000091

TY - JOUR
T1 - Structural visualization of sequential DNA data
A1 - Xiao-hong Mao
A1 - Jing-hua Fu
A1 - Wei Chen
A1 - Qian You
A1 - Shiao-fen Fang
A1 - Qun-sheng Peng
JO - Journal of Zhejiang University Science C
VL - 12
IS - 4
SP - 263
EP - 272
SN - 1869-1951
Y1 - 2011
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.C1000091
ER -

Comparing and visualizing genome sequences remains challenging because of the large size of genomes. Existing approaches take advantage of the stable properties of oligonucleotides to exhibit the main characteristics of a whole genome, yet they commonly fail to show, in an adjustable way, how patterns progress along the genome. This paper presents a novel visual encoding technique that supports not only the binning process (phylogenetic analysis) but also sequential analysis of the genome. The key idea is to regard each k-nucleotide together with its reverse complement as a visual word, and to represent a long genome sequence as a list of local statistical feature vectors derived from the local frequencies of the visual words. Experimental results on a variety of examples demonstrate that the approach can visualize DNA sequences quickly and intuitively, and can help the user identify regions of difference among multiple datasets.
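
The encoding idea in the abstract can be illustrated with a short sketch. It is not the authors' implementation: the function names, the sliding-window size and step, the choice of the lexicographically smaller string to represent a k-nucleotide and its reverse complement, and the normalization to relative frequencies are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string, e.g. AAC -> GTT."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Map a k-nucleotide and its reverse complement to one 'visual word'.

    Assumption: the lexicographically smaller of the two strings is used
    as the representative.
    """
    return min(kmer, reverse_complement(kmer))


def vocabulary(k):
    """List every distinct visual word of length k in sorted order."""
    return sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})


def local_feature_vectors(seq, k=2, window=8, step=4):
    """Slide a window over seq and return one frequency vector per window.

    Each vector holds the relative frequency of every visual word inside
    the window. The window and step sizes are hypothetical parameters.
    """
    vocab = vocabulary(k)
    valid = set("ACGT")
    vectors = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        counts = Counter()
        for i in range(len(chunk) - k + 1):
            kmer = chunk[i:i + k]
            if set(kmer) <= valid:  # skip k-mers with ambiguous bases such as N
                counts[canonical(kmer)] += 1
        total = sum(counts.values())
        vectors.append([counts[w] / total if total else 0.0 for w in vocab])
    return vocab, vectors


if __name__ == "__main__":
    vocab, vectors = local_feature_vectors("ACGTACGTTTGACCA", k=2, window=8, step=4)
    print(vocab)
    for vec in vectors:
        print([round(x, 2) for x in vec])
```

Merging each k-nucleotide with its reverse complement shrinks the vocabulary (10 words instead of 16 for k = 2), and the ordered list of per-window vectors is what preserves the position of each pattern along the sequence.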
