CLC number: TP391.41

Received: 2005-10-18

Revision Accepted: 2006-02-22

Journal of Zhejiang University SCIENCE A 2007 Vol.8 No.1 P.63~71


Automatic character detection and segmentation in natural scene images

Author(s):  ZHU Kai-hua, QI Fei-hu, JIANG Ren-jie, XU Li

Affiliation(s):  Department of Computer Science and Technology, Shanghai Jiao Tong University, Shanghai 200030, China

Corresponding email(s):   godzkh@gmail.com

Key Words:  Text detection and segmentation, AdaBoost, NLNiblack decomposition method, Attentional cascade

ZHU Kai-hua, QI Fei-hu, JIANG Ren-jie, XU Li. Automatic character detection and segmentation in natural scene images[J]. Journal of Zhejiang University Science A, 2007, 8(1): 63~71.

@article{title="Automatic character detection and segmentation in natural scene images",
author="ZHU Kai-hua, QI Fei-hu, JIANG Ren-jie, XU Li",
journal="Journal of Zhejiang University Science A",
volume="8",
number="1",
pages="63-71",
year="2007",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.2007.A0063"
}

%0 Journal Article
%T Automatic character detection and segmentation in natural scene images
%A ZHU Kai-hua
%A QI Fei-hu
%A JIANG Ren-jie
%A XU Li
%J Journal of Zhejiang University SCIENCE A
%V 8
%N 1
%P 63~71
%@ 1673-565X
%D 2007
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.2007.A0063

TY - JOUR
T1 - Automatic character detection and segmentation in natural scene images
A1 - ZHU Kai-hua
A1 - QI Fei-hu
A1 - JIANG Ren-jie
A1 - XU Li
JO - Journal of Zhejiang University Science A
VL - 8
IS - 1
SP - 63
EP - 71
SN - 1673-565X
Y1 - 2007
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.2007.A0063
ER -

We present a robust connected-component (CC) based method for the automatic detection and segmentation of text in real-scene images. The technique can be applied to robot vision, sign recognition, meeting processing and video indexing. First, a Non-Linear Niblack method (NLNiblack) is proposed to decompose the image into candidate CCs. These CCs are then fed into a cascade of classifiers trained by the AdaBoost algorithm, in which each classifier responds to one feature of the CC. Twelve novel features are proposed that are insensitive to noise, scale, text orientation and text language. The classifier cascade allows non-text CCs to be discarded rapidly, while more computation is spent on promising text-like CCs. The CCs that pass through the cascade are considered text components and are used to form the segmentation result. A prototype system was built, and experimental results demonstrate the effectiveness and efficiency of the proposed method.
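
The three stages named in the abstract (local-threshold decomposition, connected-component extraction, cascade filtering) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define NLNiblack or the 12 features, so the sketch substitutes the classic Niblack threshold (local mean plus k times the local standard deviation), 4-connected component labeling, and two placeholder features (`area`, `aspect_ratio`) with hand-set bounds in place of AdaBoost-trained stages. All function names, parameters and threshold values below are assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def niblack_binarize(img, radius=1, k=-0.2):
    """Classic Niblack threshold (stand-in for NLNiblack, which the abstract
    does not specify): a pixel is marked 1 if it is darker than
    local_mean + k * local_std over a (2*radius+1)-sized window."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            threshold = mean(window) + k * pstdev(window)
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out


def connected_components(binary):
    """Group foreground pixels into 4-connected components (lists of (y, x))."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def area(pixels):
    """Placeholder feature: number of pixels in the component."""
    return len(pixels)


def aspect_ratio(pixels):
    """Placeholder feature: bounding-box width divided by height."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (max(xs) - min(xs) + 1) / (max(ys) - min(ys) + 1)


# Hypothetical cascade: one feature per stage, cheapest first.
# The bounds are hand-set for illustration, not learned by AdaBoost.
STAGES = [(area, 2, 50), (aspect_ratio, 0.2, 5.0)]


def run_cascade(component, stages=STAGES):
    """Return (accepted, stages_evaluated); stop at the first failing stage."""
    for n, (feature, low, high) in enumerate(stages, start=1):
        if not low <= feature(component) <= high:
            return False, n
    return True, len(stages)


if __name__ == "__main__":
    image = [[200, 200, 200, 200],
             [200, 50, 50, 200],
             [200, 200, 200, 200]]
    for cc in connected_components(niblack_binarize(image)):
        print(cc, run_cascade(cc))
```

The early return in `run_cascade` is what makes the cascade attentional: a component that fails an inexpensive first stage never reaches the later, costlier feature computations, so most of the work is spent on text-like candidates.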
