Journal of Zhejiang University SCIENCE A 2007 Vol.8 No.1 P.79~87


Using LSA and text segmentation to improve automatic Chinese dialogue text summarization

Author(s):  LIU Chuan-han, WANG Yong-cheng, ZHENG Fei, LIU De-rong

Affiliation(s):  Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, China

Corresponding email(s):   uuchliu@163.com

Key Words:  Automatic text summarization, Latent semantic analysis (LSA), Text segmentation, Dialogue style, Coherence, Question-answer pairs

Automatic summarization of Chinese dialogue-style text is a relatively new research area. In this paper, latent semantic analysis (LSA) is first used to extract semantic knowledge from a given document. All question paragraphs are then identified, and an automatic text segmentation approach analogous to TextTiling is used to improve the precision with which question paragraphs are correlated with answer paragraphs. Finally, “important” sentences are extracted from the generic content and from the question-answer pairs to generate a complete summary. Experimental results showed that the approach is highly efficient and significantly improves the coherence of the summary without compromising informativeness.
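
The segmentation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it compares plain term-frequency vectors rather than LSA vectors, it places a boundary wherever the similarity between adjacent paragraphs drops below a fixed threshold (TextTiling itself uses depth scores over a smoothed similarity curve), and it assumes the text has already been split into whitespace-separated words, which for Chinese would require a word segmenter. The function names, the threshold value, and the sample dialogue are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gap_similarities(paragraphs):
    """Similarity across each gap between adjacent paragraphs."""
    vecs = [Counter(p.lower().split()) for p in paragraphs]
    return [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]


def segment(paragraphs, threshold=0.1):
    """Group paragraph indices into segments.

    A boundary is placed at every gap whose similarity is below
    `threshold` (an assumed value, not taken from the paper).
    """
    segments, current = [], [0]
    for i, sim in enumerate(gap_similarities(paragraphs)):
        if sim < threshold:
            segments.append(current)
            current = []
        current.append(i + 1)
    segments.append(current)
    return segments


# Hypothetical pre-tokenized dialogue: two question-answer pairs.
paragraphs = [
    "how do i reset my password",
    "you can reset the password from the account page",
    "what does shipping cost",
    "shipping cost depends on the weight",
]
print(segment(paragraphs))
```

In this toy input each question shares vocabulary with its own answer but none with the following question, so the only gap that falls below the threshold is the one between the two pairs. Constraining answers to the same segment as their question is one way such a segmentation could help pair them up.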

