CLC number: TP391.4

On-line Access: 2017-04-12

Received: 2016-05-15

Revision Accepted: 2016-10-15

Crosschecked: 2017-03-29

Yuan-ping Nie


Frontiers of Information Technology & Electronic Engineering  2017 Vol.18 No.4 P.535-544


Attention-based encoder-decoder model for answer selection in question answering

Author(s):  Yuan-ping Nie, Yi Han, Jiu-ming Huang, Bo Jiao, Ai-ping Li

Affiliation(s):  College of Computer, National University of Defense Technology, Changsha 410073, China

Corresponding email(s):   yuanpingnie@nudt.edu.cn

Key Words:  Question answering, Answer selection, Attention, Deep learning

Yuan-ping Nie, Yi Han, Jiu-ming Huang, Bo Jiao, Ai-ping Li. Attention-based encoder-decoder model for answer selection in question answering[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(4): 535-544.

One of the key challenges in question answering is bridging the lexical gap between questions and answers, because a question and its correct answer may not share any words. Machine translation models have been shown to help bridge this gap between question-answer pairs. In this paper, we introduce an attention-based deep learning model for the answer selection task in question answering. The proposed model employs a bidirectional long short-term memory (LSTM) encoder-decoder, an architecture that has been demonstrated to be effective on machine translation tasks, to bridge the lexical gap between questions and answers. Our model also uses a step attention mechanism, which allows the question to focus on a certain part of the candidate answer. Finally, we evaluate our model on a benchmark dataset, and the results show that our approach outperforms existing approaches. Integrating our model significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.
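The idea of letting a question focus on part of a candidate answer can be illustrated with a minimal sketch. The abstract does not give the equations of the step attention mechanism, so the code below assumes plain dot-product attention instead, and it uses hand-made toy vectors in place of bidirectional LSTM hidden states. All function names (`attend`, `cosine`, and so on) are hypothetical and chosen only for this illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(question_vec, answer_states):
    """Generic dot-product attention (an assumption, not the paper's step attention).

    Returns the attention weights over answer positions and the
    weighted sum of the answer states (the context vector).
    """
    scores = [dot(question_vec, h) for h in answer_states]
    weights = softmax(scores)
    dim = len(answer_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, answer_states))
        for i in range(dim)
    ]
    return weights, context


def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


# Toy stand-ins for encoder outputs: one question vector and
# three answer-token states (values invented for illustration).
question_vec = [1.0, 0.0]
answer_states = [[0.0, 1.0], [2.0, 0.0], [0.0, -1.0]]

weights, context = attend(question_vec, answer_states)
score = cosine(question_vec, context)
```

In this toy case the second answer position aligns with the question vector and therefore receives the largest weight. A real system would learn the representations and the scoring function during training and would rank candidate answers by the resulting score.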


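Summary: One of the major challenges in question answering is bridging the semantic gap between questions and answers. Machine translation models have been shown to be effective in bridging this gap. This paper proposes an attention-based deep neural network model for the answer selection task in question answering systems. The model adopts an encoder-decoder architecture based on bidirectional long short-term memory (LSTM); encoder-decoder models have been shown to achieve outstanding results in machine translation. We also apply an attention mechanism in the model to improve its performance. We validate the effectiveness of the approach on a public dataset, and integrating the model significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.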


