JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

Accepted manuscript available online (unedited version)

Cohort-based personalized query auto-completion

Author(s): Dan-yang Jiang, Hong-hui Chen
Affiliation(s): Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
Corresponding email(s): danyangjiang@nudt.edu.cn, chenhonghui@nudt.edu.cn
Key Words: Query auto-completion, Cohort-based retrieval, Topic models


Dan-yang Jiang, Hong-hui Chen. Cohort-based personalized query auto-completion[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.1800010

@article{title="Cohort-based personalized query auto-completion",
author="Dan-yang Jiang, Hong-hui Chen",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/FITEE.1800010"
}

%0 Journal Article
%T Cohort-based personalized query auto-completion
%A Dan-yang Jiang
%A Hong-hui Chen
%J Frontiers of Information Technology & Electronic Engineering
%P 1246-1258
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1800010

TY - JOUR
T1 - Cohort-based personalized query auto-completion
A1 - Dan-yang Jiang
A1 - Hong-hui Chen
JO - Frontiers of Information Technology & Electronic Engineering
SP - 1246
EP - 1258
SN - 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1800010
ER -


Abstract: Query auto-completion (QAC) facilitates query formulation by predicting completions for a given query prefix. Most web search engines use behavioral signals to customize query completion lists for users. To be effective, such personalized QAC models rely on access to sufficient context about each user’s interests and intentions. Hence, they often suffer from data sparseness problems. For this reason, we propose the construction and application of cohorts to address context sparsity and to enhance QAC personalization. We build an individual’s interest profile by learning his/her topic preferences through topic models and then aggregate users who share similar profiles. As conventional topic models are unable to learn cohorts automatically, we propose two cohort topic models that handle topic modeling and cohort discovery in the same framework. We present four cohort-based personalized QAC models that employ four different cohort discovery strategies. Our proposals use cohorts’ contextual information together with query frequency to rank completions. We perform extensive experiments on the publicly available AOL query log and compare the ranking effectiveness with that of models that discard cohort contexts. Experimental results suggest that our cohort-based personalized QAC models can solve the sparseness problem and yield significant relevance improvements over competitive baselines.
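The pipeline outlined in the abstract (topic-based interest profiles, cohorts of similar users, ranking by cohort context plus query frequency) can be illustrated with a minimal sketch. It is not the authors' cohort topic models. It assumes each user's topic distribution has already been learned, and it substitutes a simple greedy cosine-similarity grouping and a linear interpolation weight `lam`. The names `build_cohorts`, `rank_completions`, `threshold`, and `lam`, and the toy data, are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two topic-preference vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_cohorts(profiles, threshold=0.9):
    """Greedy grouping (an assumption, not the paper's method): a user joins
    the first cohort whose seed profile is similar enough, else starts one."""
    seeds = []       # seed profile vector of each cohort, indexed by cohort id
    cohort_of = {}   # user -> cohort id
    for user, vec in profiles.items():
        for cid, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                cohort_of[user] = cid
                break
        else:
            cohort_of[user] = len(seeds)
            seeds.append(vec)
    return cohort_of


def rank_completions(prefix, user, log, cohort_of, lam=0.5):
    """Rank completions of `prefix` by mixing global relative frequency
    (weight lam) with relative frequency inside the user's cohort."""
    cohort = cohort_of.get(user)
    global_counts = Counter(q for _, q in log if q.startswith(prefix))
    cohort_counts = Counter(
        q for u, q in log
        if cohort is not None and cohort_of.get(u) == cohort and q.startswith(prefix)
    )
    g_total = sum(global_counts.values()) or 1
    c_total = sum(cohort_counts.values()) or 1
    scores = {
        q: lam * global_counts[q] / g_total + (1 - lam) * cohort_counts[q] / c_total
        for q in global_counts
    }
    return sorted(scores, key=lambda q: (-scores[q], q))


# Toy data (hypothetical): two topics, three users, a tiny query log.
profiles = {"u1": [0.9, 0.1], "u2": [0.8, 0.2], "u3": [0.1, 0.9]}
log = [
    ("u2", "java tutorial"), ("u2", "java tutorial"),
    ("u3", "jaguar car"), ("u3", "jaguar car"),
    ("u3", "jaguar car"), ("u3", "jaguar car"),
    ("u1", "java download"),
]
cohort_of = build_cohorts(profiles)
print(rank_completions("ja", "u1", log, cohort_of))
print(rank_completions("ja", "u1", log, cohort_of, lam=1.0))  # frequency only
```

In this toy log, user u1 has issued only one query, so individual context is sparse. Because u1 shares a cohort with u2, the cohort term lifts "java tutorial" above the globally most frequent "jaguar car", while the frequency-only setting (`lam=1.0`) keeps "jaguar car" first.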

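Cohort-based personalized query auto-completion

Abstract (translated from the Chinese): Query auto-completion (QAC) helps users formulate queries by predicting the complete queries that correspond to a query prefix. Most web search engines use users' behavioral information to provide personalized query completion lists. To improve the rate of successful suggestions, personalized QAC methods need a large amount of contextual information about users' search interests and search intentions. Consequently, these methods are often limited by the sparsity of user data. This paper proposes using the search records of similar users (cohorts) to address the sparsity of user data and to improve the performance of personalized QAC methods. First, each user's topic interests are obtained through a topic model to build the user's interest profile, and users with similar interest profiles are then grouped into cohorts. Because conventional topic models cannot identify cohorts automatically, two cohort topic models are proposed that incorporate topic modeling and cohort identification in a single model framework. Based on different cohort identification methods, four cohort-based personalized QAC methods are provided. The proposed methods rank completed queries using the cohorts' contextual information and query frequency. Extensive experiments are conducted on the public AOL query dataset, and ranking performance is compared with that of methods that do not use cohort contextual information. Experimental results show that the proposed cohort-based personalized QAC methods effectively address the user data sparsity problem and substantially improve ranking accuracy relative to the baseline methods.

Keywords: Query auto-completion; Cohort-based information retrieval; Topic models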


