A Corpus-Based Analysis of Lexical Characteristics Across English News Categories for L2 Pedagogical Use

Rattavit Loesnopchaimongkhon
Chanapa Phrommopakorn
Pancheewa Chernchom
Piyapong Laosrirattanachai

Abstract

News articles are widely regarded as valuable resources for vocabulary acquisition. However, they encompass diverse categories, each catering to specific learner needs. This study analysed the vocabulary of 3,000 news articles across 12 categories, focusing on lexical profiling, lexical level, variation, density, and CEFR level to support L2 learners. The results showed that the Health category had the highest General Service List word coverage (81.01%), while Technology featured the most Academic Word List terms (8.23%). Fashion contained the largest proportion of specialised vocabulary (18.31%) and exhibited the highest lexical variation (51.29%). High-frequency words dominated all categories (91–94.79%), while Fashion included the most mid-frequency (5.84%) and low-frequency (2.36%) words. Lexical density was highest in the Environment category (57.85%) and lowest in Sports (53.2%). The CEFR analysis indicated that A1 and A2 words comprised the majority (76.66% and 10.44%, respectively), while categories such as Fashion and Nutrition included the highest proportions of C1-C2 words (6.32% and 6.53%, respectively). These findings suggest that categories such as Health and Sports are suitable for beginner learners, while Fashion and Nutrition offer more complex vocabulary for advanced learners. This study highlights the unique lexical characteristics of news categories, providing educators and learners with guidance on selecting authentic materials to enhance vocabulary learning.
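As an illustration of the kinds of measures reported above, the following is a minimal sketch of how word-list coverage, lexical variation, and lexical density might be computed for a single text. It is not the authors' procedure. It assumes lexical variation is a type-token ratio and lexical density is the share of content words among all running words. The tokenizer, the `FUNCTION_WORDS` set, and all function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption,
# not the list used in the study).
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "are", "for"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage(tokens, word_list):
    """Percentage of running tokens that appear in word_list."""
    if not tokens:
        return 0.0
    return 100 * sum(t in word_list for t in tokens) / len(tokens)


def lexical_variation(tokens):
    """Type-token ratio as a percentage: distinct words / running words."""
    if not tokens:
        return 0.0
    return 100 * len(set(tokens)) / len(tokens)


def lexical_density(tokens, function_words=FUNCTION_WORDS):
    """Percentage of tokens that are content words (not in function_words)."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in function_words]
    return 100 * len(content) / len(tokens)


if __name__ == "__main__":
    sample = "The doctor said the new diet is good for the heart"
    toy_list = {"the", "doctor", "said", "new", "is", "good", "for", "heart"}
    toks = tokenize(sample)
    print(f"coverage:  {coverage(toks, toy_list):.2f}%")
    print(f"variation: {lexical_variation(toks):.2f}%")
    print(f"density:   {lexical_density(toks):.2f}%")
```

In practice, coverage figures depend on how word families and proper nouns are handled, and raw type-token ratios are sensitive to text length, so values from a sketch like this are not comparable with the percentages reported in the abstract.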

Article Details

How to Cite
Loesnopchaimongkhon, R., Phrommopakorn, C., Chernchom, P., & Laosrirattanachai, P. (2025). A Corpus-Based Analysis of Lexical Characteristics Across English News Categories for L2 Pedagogical Use. Journal of Studies in the English Language, 20(2), 36–65. https://doi.org/10.64731/jsel.v20i2.279647
Section
Research Articles

References

Abyaad, R., Kabir, M. R., & Hasan, S. (2020). A novel approach to categorize news articles from headlines and short text. Proceedings of the 2020 IEEE Region 10 Symposium (TENSYMP), Bangladesh, 2020, 162–165. https://doi.org/10.1109/TENSYMP50017.2020.9230675

Anthony, L. (2024). AntWordProfiler (Version 2.2.1) [Computer Software]. Tokyo, Japan: Waseda University. https://www.laurenceanthony.net/software/AntWordProfiler

Astika, G. (2015). Profiling the vocabulary of news texts as capacity building for language teachers. Indonesian Journal of Applied Linguistics, 4(2), 123–134. https://doi.org/10.17509/ijal.v4i2.689

August, D., Carlo, M., Dressler, C., & Snow, C. (2005). The critical role of vocabulary development for English language learners. Learning Disabilities Research & Practice, 20(1), 50–57. https://doi.org/10.1111/j.1540-5826.2005.00120.x

Baker, P. (2006). Using corpora in discourse analysis. Bloomsbury Publishing.

Baranowska, K. (2020). Learning most with least effort: Subtitles and cognitive load. ELT Journal, 74(2), 105–115. https://doi.org/10.1093/elt/ccz060

Bates, E., Bretherton, I., & Snyder, L. S. (1988). From first words to grammar: Individual differences and dissociable mechanisms. Cambridge University Press.

Benigno, V., & de Jong, J. (2019). Linking vocabulary to the CEFR and the Global Scale of English: A psychometric model. In A. Huhta, G. Erickson, & N. Figueras (Eds.), Development in language education: A memorial volume in honour of Sauli Takala (pp. 8–29). Jyväskylä University Printing House.

Bleyer, W. G. (1916). Types of news writing. Houghton Mifflin.

Chung, M. (2009). The newspaper word list: A specialised vocabulary for reading newspapers. JALT Journal, 31(2), 159–182. https://doi.org/10.37546/JALTJJ31.2-2

Cobb, T. (2022). Vocabprofile. [Computer program]. http://www.lextutor.ca/vp/

Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34(2), 213–238. https://doi.org/10.2307/3587951

Coxhead, A. (2018). Vocabulary and English for Specific Purposes research: Quantitative and qualitative perspectives. Routledge. https://doi.org/10.4324/9781315146478

Coxhead, A., & Byrd, P. (2007). Preparing writing teachers to teach the vocabulary and grammar of academic prose. Journal of Second Language Writing, 16(3), 129–147. https://doi.org/10.1016/j.jslw.2007.07.002

Coxhead, A., & Demecheleer, M. (2018). Investigating the technical vocabulary of Plumbing. English for Specific Purposes, 51, 84–97. https://doi.org/10.1016/j.esp.2018.03.006

Coxhead, A., & Hirsh, D. (2007). A pilot science word list for EAP. Revue Française de linguistique appliquée, 12(2), 65–78. https://doi.org/10.3917/rfla.122.0065

Coxhead, A., & Walls, R. (2012). TED Talks, vocabulary, and listening for EAP. TESOLANZ Journal, 20(1), 55–67.

Council of Europe. (2020). Common European Framework of Reference for Languages: Learning, teaching, assessment–Companion volume. Strasbourg.

Coyle, Y., & Gracia, R. G. (2014). Using songs to enhance L2 vocabulary acquisition in preschool children. ELT Journal, 68(3), 276–285. https://doi.org/10.1093/elt/ccu015

Dang, T. N. Y., & Long, X. (2023). Online news as a resource for incidental learning of core academic words, academic formulas, and general formulas. TESOL Quarterly, 58(1), 32–62. https://doi.org/10.1002/tesq.3208

Dang, T. N. Y., & Lu, C. (2024). Learning academic vocabulary through reading online news. International Review of Applied Linguistics in Language Teaching, 1–21. https://doi.org/10.1515/iral-2023-0206

Dang, T. N. Y., & Webb, S. (2014). The lexical profile of academic spoken English. English for Specific Purposes, 33, 66–76. https://doi.org/10.1016/j.esp.2013.08.001

Davis, G. M. (2017). Songs in the young learner classroom: A critical review of evidence. ELT Journal, 71(4), 445–455. https://doi.org/10.1093/elt/ccw097

Davis, B. H., & Brewer, J. P. (1997). Electronic discourse: Linguistic individuals in virtual space. SUNY Press.

Graves, K. (2008). The language curriculum: A social contextual perspective. Language Teaching, 41(2), 147–181. https://doi.org/10.1017/S0261444807004867

Ha, H. T. (2022a). Vocabulary demands of informal spoken English revisited: What does it take to understand movies, TV programs, and soap operas? Frontiers in Psychology, 13, Article 831684. https://doi.org/10.3389/fpsyg.2022.831684

Ha, H. T. (2022b). Lexical profile of newspapers revisited: A corpus-based analysis. Frontiers in Psychology, 13, Article 800983. https://doi.org/10.3389/fpsyg.2022.800983

Hsu, W. (2018). Voice of America News as voluminous reading material for mid-frequency vocabulary learning. RELC Journal, 50(3), 408–421. https://doi.org/10.1177/0033688218764460

Hu, M., & Nation, I. S. P. (2000). Unknown vocabulary density and reading comprehension. Reading in a Foreign Language, 13(1), 403–430.

Hulstijn, J. H., Alderson, J. C., & Schoonen, R. (2010). Developmental stages in second-language acquisition and levels of second-language proficiency: Are there links between them? In I. Bartning, M. Martin, & I. Vedder (Eds.), Communicative proficiency and linguistic development: Intersections between SLA and language testing research (pp. 11–20). Eurosla.

Indarti, D. (2017). Lexical richness of the Jakarta Post opinion articles: Comparison between native and non-native writers. Wanastra, 4(2), 138–142. https://doi.org/10.31294/w.v9i2.2550

Johansson, V. (2008). Lexical diversity and lexical density in speech and writing: A developmental perspective. Working papers, 53, 61–79.

Kaspar, K., & Fuchs, L. A. M. (2021). Who likes what kind of news? The relationship between characteristics of media consumers and news interest. SAGE Open, 11(1), 1–12. https://doi.org/10.1177/21582440211003089

Kembaren, F. R., & Aswani, A. N. (2022). Exploring lexical density in the New York Times. Journal of English Language, Literature, and Teaching, 7(2), 110–119. https://doi.org/10.32528/ellite.v7i2.8795

Kyongho, H., & Nation, I. S. P. (1989). Reducing the vocabulary load and encouraging vocabulary learning through reading newspapers. Reading in a Foreign Language, 6, 323–335.

Laosrirattanachai, P., & Laosrirattanachai, P. (2023). Analysis of vocabulary use and move structures of the World Health Organization Emergencies press conferences on Coronavirus Disease: A corpus–based investigation. LEARN Journal: Language Education and Acquisition Research Network, 16(1), 121–146. https://so04.tci-thaijo.org/index.php/LEARN/article/view/263436

Laosrirattanachai, P., & Laosrirattanachai, P. (2025a). Unveiling the distinction of near synonymy: A corpus-based analysis on attempt, endeavor, strive, and try. PASAA, 70, 132–163.

Laosrirattanachai, P., & Laosrirattanachai, P. (2025b). Tracing tourism business research trends in Scopus-indexed journals using corpus-based and judgement-based approaches. Humanities, Arts and Social Sciences Studies, 25(1), 32–53. https://doi.org/10.69598/hasss.25.1.268122

Laufer, B., & Nation, I. S. P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics, 16(3), 307–322. https://doi.org/10.1093/applin/16.3.307

Li, Z., Li, J. Z., Zhang, X., & Reynolds, B. L. (2024). Mastery of listening and reading vocabulary levels in relation to CEFR: Insights into student admissions and English as a medium of instruction. Languages, 9(7), 239. https://doi.org/10.3390/languages9070239

Liu, C. Y. (2021). Examining the implementation of academic vocabulary, lexical density, and speech rate features on OpenCourseWare and MOOC lectures. Interactive Learning Environments, 31(8), 4924–4939. https://doi.org/10.1080/10494820.2021.1987274

Madarbakus-Ring, N., & Benson, S. (2024). TED Talks and the textbook: An in-depth lexical analysis. Languages, 9(10), 309. https://doi.org/10.3390/languages9100309

Meebangsai, D., Pongtin, P., Kitipoontanakorn, P., & Laosrirattanachai, P. (2023). Investigating proficiency of academic English in student writing: A comparative case study on vocabulary utilization in student research article writing vis–à–vis national and international research. PASAA, 67, 66–100. https://doi.org/10.58837/CHULA.PASAA.67.1.3

Moore, T., Morton, J., Hall, D., & Wallis, C. (2015). Literacy practices in the professional workplace: Implications for the IELTS reading and writing tests. IELTS Research Reports Online Series, 46. https://ielts.org/researchers/our-research/research-reports/literacy-practices-in-the-professional-workplace-implications-for-the-ielts-reading-and-writing-tests

Na Ayutthaya, J. A., Kunthonjinda, K., Somwang, K., & Laosrirattanachai, P. (2022). Making beverage service word list for English for Specific Purposes classroom. rEFLections, 29(2), 325–343. https://doi.org/10.61508/refl.v29i2.259524

Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? The Canadian Modern Language Review, 63(1), 59–82. https://doi.org/10.3138/cmlr.63.1.59

Nation, I. S. P. (2016). Making and using word lists for language learning and testing. John Benjamins. https://doi.org/10.1075/z.208

Nation, I. S. P. (2017). The BNC/COCA Level 6 word family lists (Version 1.0.0) [Data file]. http://www.victoria.ac.nz/lalsstaff/paul-nation.aspx

Nation, I. S. P. (2018, April 10). Resources. https://www.wgtn.ac.nz/lals/resources

Nation, I. S. P. (2022). Learning vocabulary in another language (3rd ed.). Cambridge University Press. https://doi.org/10.1017/9781009093873

Nation, I. S. P., & Crabbe, D. (1991). A survival language learning syllabus for foreign travel. System, 19(3), 191–201. https://doi.org/10.1016/0346-251X(91)90044-P

Nation, P., & Waring, R. (1997). Vocabulary size, text coverage and word lists. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 6–19). Cambridge University Press.

Nasseri, M., & Thompson, P. (2021). Lexical density and diversity in dissertation abstracts: Revisiting English L1 vs. L2 text differences. Assessing Writing, 47, Article 100511. https://doi.org/10.1016/j.asw.2020.100511

Reynolds, B. L., Xie, X., & Pham, Q. H. P. (2022). Incidental vocabulary acquisition from listening to English teacher education lectures: A case study from Macau higher education. Frontiers in Psychology, 13, 1–18. https://doi.org/10.3389/fpsyg.2022.993445

Sayer, P., & Ban, R. (2014). Young EFL students’ engagements with English outside the classroom. ELT Journal, 68(3), 321–329. https://doi.org/10.1093/elt/ccu013

Schmitt, N. (2008). Instructed second language vocabulary learning. Language Teaching Research, 12(3), 329–363. https://doi.org/10.1177/1362168808089921

Schmitt, N., Jiang, X., & Grabe, W. (2011). The percentage of words known in a text and reading comprehension. Modern Language Journal, 95(1), 26–43. https://doi.org/10.1111/j.1540-4781.2011.01146.x

Schmitt, N., & Schmitt, D. (2014). A reassessment of frequency and vocabulary size in L2 vocabulary teaching. Language Teaching, 47(4), 484–503. https://doi.org/10.1017/S0261444812000018

Stubbs, M. (1986). Lexical density: A computational technique and some findings. In M. Coulthard (Ed.), Talking about text (pp. 27–48). University of Birmingham.

Tegge, F. (2018). Pop songs in the classroom: Time-filler or teaching tool? ELT Journal, 72(3), 274–284. https://doi.org/10.1093/elt/ccx071

Teng, F. (2015). EFL vocabulary learning through reading BBC news: An analysis based on the Involvement Load Hypothesis. English as a Global Language Education (EaGLE) Journal, 1(2), 63–90. https://doi.org/10.6294/EaGLE.2015.0102.03

Thornbury, S., & Slade, D. (2006). Conversation: From description to pedagogy. Cambridge University Press.

Treffers-Daller, J., Parslow, P., & Williams, S. (2018). Back to basics: How measures of lexical diversity can help discriminate between CEFR levels. Applied Linguistics, 39(3), 302–327. https://doi.org/10.1093/applin/amw009

Ure, J. (1971). Lexical density and category differentiation. In G. E. Perren & J. L. M. Trim (Eds.), Applications of linguistics (pp. 443–452). Cambridge University Press.

van Zeeland, H., & Schmitt, N. (2013). Lexical coverage in L1 and L2 listening comprehension: The same or different from reading comprehension? Applied Linguistics, 34(4), 457–479. https://doi.org/10.1093/applin/ams074

Vuković-Stamatović, M., & Čarapić, D. (2024). Vocabulary profile, lexical density and speech rate in science podcasts: How appropriate are science podcasts for EAP and EST listening? Ibérica, 47, 201–226. https://doi.org/10.17398/2340-2784.47.201

Wingrove, P. (2017). How suitable are TED talks for academic listening? Journal of English for Academic Purposes, 30, 79–95. https://doi.org/10.1016/j.jeap.2017.10.010

West, M. (1953). A general service list of English words. Longman.