CLC number: TP391

On-line Access: 2017-07-31

Received: 2016-02-15

Revision Accepted: 2016-06-24

Crosschecked: 2017-06-16

ORCID: Guoqiang Zhong, http://orcid.org/0000-0002-2952-6642

Frontiers of Information Technology & Electronic Engineering  2017 Vol.18 No.7 P.978-988

http://doi.org/10.1631/FITEE.1600996


Tandem hidden Markov models using deep belief networks for offline handwriting recognition


Author(s):  Partha Pratim Roy, Guoqiang Zhong, Mohamed Cheriet

Affiliation(s):  Department of Computer Science & Engineering, Indian Institute of Technology Roorkee, Roorkee 247667, India

Corresponding email(s):   proy.fcs@iitr.ac.in, gqzhong@ouc.edu.cn, mohamed.cheriet@etsmtl.ca

Key Words:  Handwriting recognition, Hidden Markov models, Deep learning, Deep belief networks, Tandem approach


Partha Pratim Roy, Guoqiang Zhong, Mohamed Cheriet. Tandem hidden Markov models using deep belief networks for offline handwriting recognition[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(7): 978-988.

@article{roy2017tandem,
title="Tandem hidden Markov models using deep belief networks for offline handwriting recognition",
author="Partha Pratim Roy and Guoqiang Zhong and Mohamed Cheriet",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="18",
number="7",
pages="978--988",
year="2017",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1600996"
}

%0 Journal Article
%T Tandem hidden Markov models using deep belief networks for offline handwriting recognition
%A Partha Pratim Roy
%A Guoqiang Zhong
%A Mohamed Cheriet
%J Frontiers of Information Technology & Electronic Engineering
%V 18
%N 7
%P 978-988
%@ 2095-9184
%D 2017
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1600996

TY  - JOUR
T1  - Tandem hidden Markov models using deep belief networks for offline handwriting recognition
A1  - Partha Pratim Roy
A1  - Guoqiang Zhong
A1  - Mohamed Cheriet
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 18
IS  - 7
SP  - 978
EP  - 988
SN  - 2095-9184
Y1  - 2017
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1600996
ER  -


Abstract:
Unconstrained offline handwriting recognition is a challenging task in document analysis and pattern recognition. In recent years, to better exploit the supervisory information hidden in document images, much effort has gone into integrating multi-layer perceptrons (MLPs) into hidden Markov models (HMMs) in either a hybrid or a tandem fashion. However, because of the weak learning capacity of MLPs, the learnt features are not necessarily optimal for the subsequent recognition task. In this paper, we propose a deep architecture-based tandem approach for unconstrained offline handwriting recognition. In the proposed model, deep belief networks (DBNs) are adopted to learn compact representations of sequential data, while HMMs are applied for (sub-)word recognition. We evaluate the proposed model on two publicly available datasets, RIMES and IFN/ENIT, which are based on the Latin and Arabic scripts, respectively, and on a Devanagari (an Indian script) dataset that we collected ourselves. Extensive experiments show the advantage of the proposed model, especially over MLP-HMM tandem approaches.
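
The tandem idea summarized in the abstract (a network turns each frame of a word image into a compact feature vector, and an HMM then scores the resulting sequence) can be illustrated with a small sketch. This is not the authors' implementation. The Bernoulli RBM trained with one-step contrastive divergence, the layer sizes, the random binary "frames", and the hand-set three-state left-to-right Gaussian HMM are all assumptions chosen only to keep the example short and runnable with NumPy.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """One Bernoulli RBM layer; a DBN stacks several of these (assumed setup)."""
    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        """One contrastive-divergence (CD-1) update on a batch of frames."""
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def train_dbn(frames, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise pre-training: each RBM is fit on the layer below."""
    rng = np.random.default_rng(seed)
    layers, x = [], frames
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        layers.append(rbm)
        x = rbm.hidden_probs(x)
    return layers

def dbn_features(layers, frames):
    """Tandem features: top-layer activations for every frame."""
    x = frames
    for rbm in layers:
        x = rbm.hidden_probs(x)
    return x

def log_gauss(x, mean, var):
    """Diagonal-Gaussian log densities, shape (n_frames, n_states)."""
    diff = x[:, None, :] - mean[None, :, :]
    return -0.5 * (np.log(2 * np.pi * var)[None] + diff**2 / var[None]).sum(axis=2)

def hmm_log_likelihood(obs, log_pi, log_A, mean, var):
    """Forward algorithm in the log domain for a Gaussian-emission HMM."""
    log_b = log_gauss(obs, mean, var)
    alpha = log_pi + log_b[0]
    for t in range(1, len(obs)):
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

# Toy usage: random binary frames stand in for sliding-window image features.
rng = np.random.default_rng(1)
frames = (rng.random((200, 16)) > 0.5).astype(float)
layers = train_dbn(frames, layer_sizes=[12, 4])
feats = dbn_features(layers, frames[:30])  # one "word" of 30 frames

# Hand-set 3-state left-to-right HMM; means come from splitting the word in three.
with np.errstate(divide="ignore"):  # log(0) = -inf marks forbidden transitions
    log_pi = np.log(np.array([1.0, 0.0, 0.0]))
    log_A = np.log(np.array([[0.8, 0.2, 0.0],
                             [0.0, 0.8, 0.2],
                             [0.0, 0.0, 1.0]]))
mean = np.stack([seg.mean(axis=0) for seg in np.array_split(feats, 3)])
var = np.full_like(mean, 0.1)
ll = hmm_log_likelihood(feats, log_pi, log_A, mean, var)
print("word log-likelihood:", ll)
```

In a recognizer built along these lines, one HMM per (sub-)word or character model would be trained on such features, and the model with the highest log-likelihood would give the recognition result. The HMM parameters are set by hand here only so that the scoring step can be shown without a training loop.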

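Chinese title (translated): Tandem hidden Markov models incorporating deep belief networks and their application to offline handwriting recognition

Chinese summary (translated): In document analysis and pattern recognition, unconstrained offline handwriting recognition is a highly challenging research topic. In recent years, to fully exploit the supervisory information hidden in document images, many studies have tried to embed multi-layer perceptrons into hidden Markov models in either a hybrid or a tandem form. However, because of the limited learning capacity of multi-layer perceptrons, the learnt features are not necessarily optimal for the subsequent recognition task. In this paper, we propose a deep architecture-based tandem method for unconstrained offline handwriting recognition. In the proposed model, deep belief networks are used to learn compact representations of sequential data, and hidden Markov models are used for (sub-)word recognition. We validate the proposed model on two public datasets, RIMES and IFN/ENIT, which are based on Latin and Arabic, respectively; we also validate it on the Devanagari dataset, which is based on an Indian script. Extensive experiments demonstrate the advantage of the proposed model, especially over tandem multi-layer perceptron-hidden Markov model methods.

Chinese keywords (translated): Handwriting recognition; Hidden Markov models; Deep learning; Deep belief networks; Tandem approach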
