CLC number: TP391.1

On-line Access: 2015-06-04

Received: 2014-10-15

Revision Accepted: 2015-03-12

Crosschecked: 2015-05-07


Frontiers of Information Technology & Electronic Engineering  2015 Vol.16 No.6 P.457-465


Topic modeling for large-scale text data

Author(s):  Xi-ming Li, Ji-hong Ouyang, You Lu

Affiliation(s):  College of Computer Science and Technology, Jilin University, Changchun 130012, China

Corresponding email(s):   liximing86@gmail.com, ouyj@jlu.edu.cn

Key Words:  Latent Dirichlet allocation (LDA), Topic modeling, Online learning, Moving average

Xi-ming Li, Ji-hong Ouyang, You Lu. Topic modeling for large-scale text data[J]. Frontiers of Information Technology & Electronic Engineering, 2015, 16(6): 457-465.

@article{title="Topic modeling for large-scale text data",
author="Xi-ming Li, Ji-hong Ouyang, You Lu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="16", number="6", pages="457-465", year="2015",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1400352"
}

%0 Journal Article
%T Topic modeling for large-scale text data
%A Xi-ming Li
%A Ji-hong Ouyang
%A You Lu
%J Frontiers of Information Technology & Electronic Engineering
%V 16
%N 6
%P 457-465
%@ 2095-9184
%D 2015
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1400352

T1 - Topic modeling for large-scale text data
A1 - Xi-ming Li
A1 - Ji-hong Ouyang
A1 - You Lu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 16
IS - 6
SP - 457
EP - 465
SN - 2095-9184
Y1 - 2015
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1400352
ER -

This paper develops a novel online algorithm, moving average stochastic variational inference (MASVI), which uses the results of previous iterations to smooth out noisy natural gradients. We analyze the convergence of the proposed algorithm and conduct experiments on two large-scale collections that contain millions of documents. Experimental results indicate that, compared with stochastic variational inference (SVI) and stochastic gradient Riemannian Langevin dynamics (SGRLD), our algorithm achieves a faster convergence rate and better performance.
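
The abstract does not give the MASVI update rule, so the following is only a minimal sketch of the general idea it describes: averaging the current noisy gradient with those from recent iterations before taking a step. Everything here is an assumption made for illustration, including the toy objective f(x) = 0.5 * x^2, the function names `moving_average` and `minimize`, the window size, and the step-size schedule (t + tau)^(-kappa), which is the form commonly used for stochastic variational inference. It should not be read as the authors' algorithm.

```python
import random
from collections import deque


def moving_average(values, window):
    """Average each value with up to `window - 1` values that precede it."""
    recent = deque(maxlen=window)  # keeps only the last `window` items
    smoothed = []
    for v in values:
        recent.append(v)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


def minimize(x0, steps, window, tau=10.0, kappa=0.7, noise=1.0, seed=0):
    """Minimize the toy objective f(x) = 0.5 * x**2 from noisy gradients.

    The exact gradient is x; Gaussian noise stands in for minibatch noise.
    Each step follows the mean of the last `window` noisy gradients.
    With window=1 this reduces to plain stochastic gradient descent.
    """
    rng = random.Random(seed)
    recent = deque(maxlen=window)
    x = x0
    for t in range(1, steps + 1):
        recent.append(x + rng.gauss(0.0, noise))  # noisy gradient estimate
        direction = sum(recent) / len(recent)     # smoothed gradient
        rho = (t + tau) ** (-kappa)               # decreasing step size
        x -= rho * direction
    return x


if __name__ == "__main__":
    print("plain SGD      :", minimize(5.0, 2000, window=1))
    print("moving average :", minimize(5.0, 2000, window=10))
```

Running the script prints the final iterate for a window of 1 and a window of 10 under the same random seed. The trade-off in this sketch is that a larger window reduces the variance of each step but also mixes in gradients computed at older iterates.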

Overall, I liked the idea introduced by the paper, as well as the large empirical case study. Scaling up topic models without loss of precision is indeed an important area.






