Journal of Zhejiang University SCIENCE B 2005 Vol.6 No.5 P.408~412


Statistical properties of nucleotide clusters in DNA sequences

Author(s):  CHENG Jun, ZHANG Lin-xi

Affiliation(s):  Department of Physics, Jinhua University, Jinhua 321017, China

Corresponding email(s):   Jh_Chengjun@163.com

Key Words:  DNA sequence, Plasmodium falciparum 3D7, Nucleotide clusters, Power law

Using the complete genome of Plasmodium falciparum 3D7, which has 14 chromosomes, as an example, we examined the distribution functions for the amount of C or G and of A or T in consecutive, non-overlapping blocks of m bases. The distribution function P(S) of consecutive C-G or A-T clusters of size S conforms to the relation P(S) ∝ e^(αS). The values of the scaling exponent α_CG are much larger than those of α_AT; α_AT hardly changes across the 14 chromosomes, whereas α_CG shows a number of fluctuations. We found that the maximum A-T cluster size is much larger than the maximum C-G cluster size, which implies the existence of large A-T clusters. Our study of the width function ξ(m) of the C-G content showed that it follows a good power law, ξ(m) ∝ m^γ. The average exponent γ̄ over the 14 chromosomes is 0.931. These investigations provide some insight into the nucleotide clusters of DNA sequences and help us understand other properties of DNA sequences.
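
The two measurements described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several choices are assumptions: a cluster is taken to be a maximal run of consecutive bases from one class (C/G or A/T), the fitted exponent is reported as the decay rate of ln P(S) against S (which may differ in sign from the abstract's α), and ξ(m) is taken to be the standard deviation of the C-G count over non-overlapping blocks of m bases. All function names are hypothetical, and the demo sequence is random toy data rather than the P. falciparum genome.

```python
import math
import random
from collections import Counter
from itertools import groupby


def cluster_sizes(seq, letters):
    """Lengths of maximal runs of consecutive bases drawn from `letters`."""
    member = set(letters)
    runs = groupby(seq.upper(), key=lambda base: base in member)
    return [sum(1 for _ in run) for in_set, run in runs if in_set]


def size_distribution(sizes):
    """Normalized frequency P(S) of each observed cluster size S."""
    counts = Counter(sizes)
    total = sum(counts.values())
    return {s: c / total for s, c in sorted(counts.items())}


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def decay_rate(sizes):
    """Fit ln P(S) = c - rate * S and return rate (assumed sign convention)."""
    dist = size_distribution(sizes)
    return -slope(list(dist), [math.log(p) for p in dist.values()])


def width(seq, m, letters="CG"):
    """Assumed definition of xi(m): standard deviation of the number of
    `letters` bases per non-overlapping block of m bases."""
    member = set(letters)
    seq = seq.upper()
    counts = [sum(base in member for base in seq[i:i + m])
              for i in range(0, len(seq) - m + 1, m)]
    mean = sum(counts) / len(counts)
    return math.sqrt(sum((c - mean) ** 2 for c in counts) / len(counts))


def width_exponent(seq, block_sizes, letters="CG"):
    """Slope of ln xi(m) against ln m, i.e. gamma in xi(m) ~ m^gamma."""
    return slope([math.log(m) for m in block_sizes],
                 [math.log(width(seq, m, letters)) for m in block_sizes])


# Toy demo on an uncorrelated, A-T-rich random sequence (not real genome data).
random.seed(0)
toy = "".join(random.choices("ATCG", weights=[4, 4, 1, 1], k=20000))
rate_cg = decay_rate(cluster_sizes(toy, "CG"))
rate_at = decay_rate(cluster_sizes(toy, "AT"))
gamma = width_exponent(toy, [4, 8, 16, 32, 64])
print(rate_cg, rate_at, gamma)
```

On real data, each chromosome sequence would be passed in place of `toy`. The fit here weights every cluster size equally, including rare large sizes with noisy frequencies, so a careful analysis would likely restrict or weight the fitting range.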

