JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

Accepted manuscript available online (unedited version)

Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension

Author(s): Xiang-zhou Huang, Si-liang Tang, Yin Zhang, Bao-gang Wei
Affiliation(s): College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Corresponding email(s): huangxiangzhou@zju.edu.cn, siliang@zju.edu.cn, yinzh@zju.edu.cn
Key Words: Machine reading comprehension, Neural networks, Joint training, Data augmentation


Xiang-zhou Huang, Si-liang Tang, Yin Zhang, Bao-gang Wei. Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.1900571

@article{FITEE.1900571,
title="Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension",
author="Xiang-zhou Huang and Si-liang Tang and Yin Zhang and Bao-gang Wei",
journal="Frontiers of Information Technology \& Electronic Engineering",
year="in press",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1900571"
}

%0 Journal Article
%T Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension
%A Xiang-zhou Huang
%A Si-liang Tang
%A Yin Zhang
%A Bao-gang Wei
%J Frontiers of Information Technology & Electronic Engineering
%P 1346-1355
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1900571

TY  - JOUR
T1  - Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension
A1  - Xiang-zhou Huang
A1  - Si-liang Tang
A1  - Yin Zhang
A1  - Bao-gang Wei
JO  - Frontiers of Information Technology & Electronic Engineering
SP  - 1346
EP  - 1355
SN  - 2095-9184
Y1  - in press
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1900571
ER  -


Abstract: Opinion question machine reading comprehension (MRC) requires a machine to answer opinion questions by analyzing the corresponding passages. In traditional MRC tasks, the answer to every question is a segment of text in the corresponding passage. Opinion question MRC is more challenging because the answer may not appear in the passage and must instead be deduced from multiple sentences. In this study, a novel neural network framework is proposed to address this problem. It uses a new hybrid embedding training method that combines text features. Furthermore, extra attention and output layers that generate auxiliary losses are introduced to jointly train the stacked recurrent neural networks. To deal with the imbalance of the dataset, the irrelevancy between question and passage is used for data augmentation. Experimental results show that the proposed method achieves state-of-the-art performance. The method was the biweekly champion in the opinion question MRC task of Artificial Intelligence Challenger 2018 (AIC2018).
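The abstract does not spell out how question-passage irrelevancy is turned into extra training data. One plausible reading, offered here only as a minimal sketch and not as the authors' procedure, is to pair a question with a passage drawn from a different sample and give the new pair the under-represented "cannot be determined" label. The `Sample` type, the label names, and the function `augment_with_irrelevant_pairs` are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    question: str
    passage: str
    label: str  # hypothetical label set: "yes", "no", "uncertain"


def augment_with_irrelevant_pairs(samples, n_new, minority_label="uncertain", seed=0):
    """Build n_new extra samples by mismatching questions and passages.

    Assumption: a passage taken from a different sample is irrelevant to the
    question, so the pair can be labeled with the minority class. A real
    pipeline would likely also check that the two samples do not share a topic.
    """
    if len(samples) < 2:
        return []  # no mismatched pair can be formed
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_new):
        # Draw two distinct indices so the passage never comes from
        # the same sample as the question.
        i, j = rng.sample(range(len(samples)), 2)
        augmented.append(
            Sample(samples[i].question, samples[j].passage, minority_label)
        )
    return augmented


data = [
    Sample("q1", "p1", "yes"),
    Sample("q2", "p2", "no"),
    Sample("q3", "p3", "yes"),
]
extra = augment_with_irrelevant_pairs(data, n_new=5)
```

The generated samples would be appended to the training set only, so that the class balance of the evaluation data is left unchanged.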

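Application of hybrid word embedding and joint training of stacked recurrent neural networks in opinion question machine reading comprehension

Xiang-zhou Huang, Si-liang Tang, Yin Zhang, Bao-gang Wei
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Summary: Opinion question machine reading comprehension requires a computer to answer questions by analyzing the corresponding passages. In traditional machine reading comprehension tasks, the answer is a segment of text in the relevant passage. The answer to an opinion question may not appear in the relevant passage and must be inferred from multiple sentences, which makes the corresponding machine reading comprehension task more challenging. For this task, a novel neural network based solution is proposed, which uses a hybrid word embedding training method that incorporates text features. In addition, extra attention networks and output layers are introduced to produce multiple auxiliary loss functions for jointly training the stacked recurrent neural networks. To address the imbalanced sample distribution of the dataset, the irrelevancy between question and passage is introduced for data augmentation. Experimental results verify the effectiveness of the proposed method. The solution was the biweekly champion of the opinion question machine reading comprehension track of AIC2018.

Key words: Machine reading comprehension; Neural networks; Joint training; Data augmentation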


