CLC number: H313; TP391

On-line Access: 2017-03-10

Received: 2016-04-06

Revision Accepted: 2016-08-23

Crosschecked: 2017-02-20


Frontiers of Information Technology & Electronic Engineering  2017 Vol.18 No.3 P.362-372


Corpus-based research on English word recognition rates in primary school and word selection strategy

Author(s):  Wen-yan Xiao, Ming-wen Wang, Zhen Weng, Li-lin Zhang, Jia-li Zuo

Affiliation(s):  School of Computer Information Engineering, Jiangxi Normal University, Nanchang 330022, China

Corresponding email(s):   wyxiao@jxnu.edu.cn, mwwang@jxnu.edu.cn, 1091013334@qq.com, 1006806747@qq.com, 44124148@qq.com

Key Words:  Corpus, Primary English, Recognition rate, Word frequency, Coverage rate

Wen-yan Xiao, Ming-wen Wang, Zhen Weng, Li-lin Zhang, Jia-li Zuo. Corpus-based research on English word recognition rates in primary school and word selection strategy[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(3): 362-372.

%0 Journal Article
%T Corpus-based research on English word recognition rates in primary school and word selection strategy
%A Wen-yan Xiao
%A Ming-wen Wang
%A Zhen Weng
%A Li-lin Zhang
%A Jia-li Zuo
%J Frontiers of Information Technology & Electronic Engineering
%V 18
%N 3
%P 362-372
%@ 2095-9184
%D 2017
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601118

Acquiring vocabulary is essential to learning English because it underpins listening, speaking, reading, and writing. In this paper, we use web crawler technology to build an English webpage corpus (EWC) and derive a word frequency list from it. By comparing the EWC word list with that of the British National Corpus (BNC), we find that word frequency distributions are time-sensitive, so a frequency list such as the BNC's reflects the period in which its texts were collected. We also estimate primary school students’ English word recognition rates by comparing the vocabulary lists of current Chinese primary school English textbooks with the word frequency lists of several corpora: EWC, BNC, SUBTLEX-US, and the Subtitle Corpus of Children’s BBC (CBBC). The results show that the word recognition rates of primary school children are relatively low in both the general register and specific registers. Motivated by these experimental results, we propose word-selection strategies for compiling English textbooks for Chinese primary school students.
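The recognition rate described above can be illustrated with a minimal sketch. This page does not give the paper's exact formula, so the sketch assumes the rate is token coverage: the share of running words in a corpus whose word form appears in a known vocabulary list. The function names, the simple tokenizer, and the toy data are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return alphabetic word tokens.

    Assumption: a simple regex tokenizer; the paper's own tokenization
    and any lemmatization are not described on this page.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(text):
    """Return a Counter mapping each word form to its count in the text."""
    return Counter(tokenize(text))


def recognition_rate(freq, known_words):
    """Share of running tokens whose word form is in known_words.

    Assumption: 'recognition rate' is treated here as token coverage.
    """
    total = sum(freq.values())
    if total == 0:
        return 0.0
    known = sum(count for word, count in freq.items() if word in known_words)
    return known / total


# Hypothetical toy data standing in for a corpus and a textbook word list.
corpus_text = "The cat sat on the mat. The dog saw the cat."
textbook_vocabulary = {"the", "cat", "dog", "on"}

freq = frequency_list(corpus_text)
rate = recognition_rate(freq, textbook_vocabulary)
print(freq.most_common(2))
print(f"recognition rate: {rate:.2f}")
```

In this toy case 8 of the 11 running words are in the vocabulary list, so the rate is about 0.73. Counting tokens rather than distinct word types weights frequent words more heavily, which is one reason a short list of high-frequency words can cover a large share of running text.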


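Summary: Vocabulary is one of the foundations of language learning and a prerequisite for developing learners' listening, speaking, reading, and writing skills. Which words the selected texts should cover is therefore a basic question in textbook compilation. To address it, we use web crawling and related techniques to build an English webpage corpus (EWC) and analyze its word frequencies. Comparing the word frequencies of EWC with those of the British National Corpus (BNC), we find that word frequency distributions are time-sensitive to some degree. By comparing the vocabulary lists of current Chinese primary school English textbooks with the EWC, BNC, SUBTLEX-US, and CBBC word frequency lists, we estimate the English word recognition rates of primary school students in general reading. The analysis shows that their recognition rates are relatively low in both the general register and specific registers. Based on these quantitative analyses, we propose word-selection strategies for compiling primary school English textbooks in China.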

