CLC number: TP391.1
On-line Access: 2020-09-09
Received: 2019-10-19
Revision Accepted: 2020-03-16
Crosschecked: 2020-08-10
Xiang-zhou Huang, Si-liang Tang, Yin Zhang, Bao-gang Wei. Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.1900571
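Application of hybrid word embedding and joint training of stacked recurrent neural networks to opinion question machine reading comprehension
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract: Opinion question machine reading comprehension requires a computer to answer questions by analyzing the corresponding passages. In traditional machine reading comprehension tasks, the answer is a span of text in the relevant passage. The answer to an opinion question, by contrast, may not appear in the passage and must be inferred from multiple sentences, which makes the task more challenging. For this task, a novel neural-network-based solution is proposed that uses a hybrid word embedding training method incorporating text features. In addition, extra attention networks and output layers are introduced to produce multiple auxiliary loss functions for joint training of the stacked recurrent neural network. To address the imbalanced sample distribution in the dataset, irrelevance between questions and passages is introduced for data augmentation. Experimental results verify the effectiveness of the proposed method. The solution won the biweekly championship of the opinion question machine reading comprehension track of AIC2018.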
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE