A Corpus Study of Language Simplification and Grammar in Graded Readers

Azrifah Zakaria
Willy A Renandya
Vahid Aryadoust

Abstract

Studies on graded readers used in extensive reading have tended to focus on vocabulary. This study set out to investigate the linguistic profile of graded readers, taking into account both grammar and lexis. A corpus of 90 readers was tagged according to the variables in Biber’s Multidimensional (MD) analysis, using the Multidimensional Analysis Tagger (MAT). These variables were analysed using latent class cluster analysis to determine whether the graded readers could be grouped by similarity in linguistic features. While the MAT analysis revealed more similarities than differences within the corpus, latent class clustering produced an optimal 3-class model. Post-hoc concordance analyses showed that graded readers may be categorised into three classes of complexity: beginner, transitional, and advanced. The findings of the study suggest that the selection of reading materials for extensive reading should take into consideration grammatical complexity as well as lexis. The linguistic profiles compiled in this study detail the grammatical structures, and the lexical items associated with them, that teachers may expect their students to encounter when reading graded readers. In addition, the profiles may be of benefit to teachers seeking to supplement extensive reading with form-focused instruction.

Article Details

How to Cite
Zakaria, A., Renandya, W. A., & Aryadoust, V. (2023). A Corpus Study of Language Simplification and Grammar in Graded Readers. LEARN Journal: Language Education and Acquisition Research Network, 16(2), 130–153. Retrieved from https://so04.tci-thaijo.org/index.php/LEARN/article/view/266938
Section
Research Articles
Author Biographies

Azrifah Zakaria, English Language and Literature, National Institute of Education, Nanyang Technological University, Singapore

A PhD candidate at the National Institute of Education, Nanyang Technological University, Singapore. Her research interests are in corpus linguistics, language assessment, and computer-assisted language learning. Besides teaching, she has previously worked in early childhood education and intervention.

Willy A Renandya, English Language and Literature, National Institute of Education, Nanyang Technological University, Singapore

A language teacher educator with extensive teaching experience in Asia. He currently teaches applied linguistics courses at the National Institute of Education, Nanyang Technological University. He has given numerous keynote presentations at international ELT conferences and has published extensively in the area of second language education.

Vahid Aryadoust, English Language and Literature, National Institute of Education, Nanyang Technological University, Singapore

An Associate Professor of language assessment at the National Institute of Education, Nanyang Technological University. Vahid has published his research in Language Testing, Language Assessment Quarterly, Assessing Writing, Educational Assessment, Educational Psychology, and Computer Assisted Language Learning, among other journals. He has also (co)authored multiple book chapters and books published by Routledge, Cambridge University Press, Springer, Cambridge Scholars Publishing, and Wiley Blackwell, among others. He teaches graduate courses on oracy, language assessment, and research methods.

References

Allan, R. (2016). Lexical bundles in graded readers: To what extent does language restriction affect lexical patterning? System, 59, 61–72. https://doi.org/10.1016/j.system.2016.04.005

Aryadoust, V. (2020). Measureable dimensions of visual mental imagery and their relationship with listening comprehension: Evidence from forensic arts and latent class analysis. Imagination, Cognition and Personality, 39(3), 291–319. https://doi.org/10.1177/0276236619829879

Bamford, J., & Day, R. R. (2004). Extensive reading activities for teaching language. Cambridge University Press.

Berber Sardinha, T., Veirano Pinto, M., & Berserik, F. (Eds.). (2015). Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber. John Benjamins Publishing Company.

Berber Sardinha, T., & Veirano Pinto, M. (2019). Dimensions of variation across American television registers. International Journal of Corpus Linguistics, 24(1), 3–32. https://doi.org/10.1075/ijcl.15014.ber

Biber, D. (1986). Spoken and written textual dimensions in English: Resolving the contradictory findings. Language, 62(2), 384–414. https://doi.org/10.2307/414678

Biber, D. (1995). Variation across speech and writing. Cambridge University Press. (Original work published 1988).

Biber, D. (1989). A typology of English texts. Linguistics, 27(1), 3–43. https://doi.org/10.1515/ling.1989.27.1.3

Biber, D., & Gray, B. (2011). Grammatical change in the noun phrase: The influence of written language use. English Language and Linguistics, 15(2), 223–250. https://doi.org/10.1017/S1360674311000025

Burnham, K., & Anderson, D. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261–304. https://doi.org/10.1177/0049124104268644

Chiu, C.-Y., Douglas, J. A., & Li, X. (2009). Cluster analysis for cognitive diagnosis: Theory and applications. Psychometrika, 74(4), 633–665. https://doi.org/10.1007/s11336-009-9125-0

Crossley, S. A., Allen, D. B., & McNamara, D. S. (2011). Text readability and intuitive simplification: A comparison of readability formulas. Reading in a Foreign Language, 23(1), 84–101. http://nflrc.hawaii.edu/rfl/April2011/articles/crossley.pdf

Crossley, S. A., Allen, D., & McNamara, D. S. (2012). Text simplification and comprehensible input: A case for an intuitive approach. Language Teaching Research, 16(1), 89–108. http://dx.doi.org/10.1177/1362168811423456

Crossley, S. A., Yang, H. S., & McNamara, D. S. (2014). What’s so simple about simplified texts? A computational and psycholinguistic investigation of text comprehension and text processing. Reading in a Foreign Language, 26(1), 92–113. http://nflrc.hawaii.edu/rfl/April2014/articles/crossley.pdf

Dor, D. (2005). Toward a semantic account of that-deletion in English. Linguistics, 43(2), 345–. https://doi.org/10.1515/ling.2005.43.2.345

Ellis, N. C. (2002). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24(2), 143–188. https://doi.org/10.1017/S0272263102002024

Friginal, E., Pearson, P., Di Ferrante, L., Pickering, L., & Bruce, C. (2013). Linguistic characteristics of AAC discourse in the workplace. Discourse Studies, 15(3), 279–298. https://doi.org/10.1177/1461445613480586

Graesser, A., McNamara, D., Cai, Z., Conley, M., Li, H., & Pennebaker, J. (2014). Coh-Metrix measures text characteristics at multiple levels of language and discourse. The Elementary School Journal, 115(2), 210–229. https://doi.org/10.1086/678293

Gray, B. (2013). More than discipline: Uncovering multi-dimensional patterns of variation in academic research articles. Corpora, 8(2), 153–181. https://doi.org/10.3366/cor.2013.0039

Härdle, W., & Simar, L. (2015). Applied multivariate statistical analysis (4th ed.). Springer.

Jeon, E. Y., & Day, R. R. (2016). The effectiveness of ER on reading proficiency: A meta-analysis. Reading in a Foreign Language, 28(2), 246–265. http://www.nflrc.hawaii.edu/rfl/October2016/articles/jeon.pdf

Kano, M. (2015). Revealing factors affecting learners’ sense of “difficulty” in extensive reading through reader corpora. Procedia – Social and Behavioral Sciences, 198, 211–217. https://doi.org/10.1016/j.sbspro.2015.07.438

Latent Gold (Version 4.5) [Computer software]. Available at: https://www.statisticalinnovations.com/

Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners’ vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15–30. https://nflrc.hawaii.edu/rfl/April2010/articles/laufer.pdf

Martinez, R., & Schmitt, N. (2012). A phrasal expressions list. Applied Linguistics, 33(3), 299–320. https://doi.org/10.1093/applin/ams010

McDonough, K., & Trofimovich, P. (2016). Structural priming and the acquisition of novel form-meaning mappings. In T. Cadierno, S. Eskildsen, & A. Barraja-Rohan (Eds.), Usage-based perspectives on second language learning. De Gruyter Mouton.

Nakanishi, T. (2015). A meta-analysis of extensive reading research. TESOL Quarterly, 49(1), 6–37. https://doi.org/10.1002/tesq.157

Nation, I. S. P. (2007). The four strands. Innovation in Language Learning and Teaching, 1(1), 2–13. https://doi.org/10.2167/illt039.0

Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern Language Review, 63, 59–82. https://doi.org/10.3138/cmlr.63.1.59

Nation, I. S. P., & Waring, R. (2020). Teaching extensive reading in another language. Routledge.

Nini, A. (2019). The Multi-Dimensional Analysis Tagger. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 67–94). Bloomsbury Academic. Advance online publication. https://niniandrea.files.wordpress.com/2019/06/pre-print-the-multidimensional-analysis-tagger.pdf

Nini, A. (2020). Multidimensional Analysis Tagger (Version 1.3.1) [Computer software]. Available at: https://sites.google.com/site/multidimensionaltagger/versions

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modelling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535–569. https://doi.org/10.1080/10705510701575396

Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the English language. Longman.

Vermunt, J. K., & Magidson, J. (2002). Latent class cluster analysis. In J. Hagenaars & A. McCutcheon (Eds.), Applied latent class analysis. Cambridge University Press.

Vermunt, J. K., & Magidson, J. (2005a). Latent Gold 4.0 user’s guide. Statistical Innovations Inc. Retrieved from https://www.statisticalinnovations.com/wp-content/uploads/LGusersguide.pdf

Vermunt, J. K., & Magidson, J. (2005b). Technical guide for Latent GOLD Choice 4.0: Basic and advanced. Statistical Innovations Inc. Retrieved from https://www.statisticalinnovations.com/wp-content/uploads/LGCtechnical.pdf

Vrieze, S. I. (2012). Model selection and psychological theory: A discussion of the differences between the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Psychological Methods, 17(2), 228–243. https://doi.org/10.1037/a0027127

Wan-a-rom, U. (2008). Comparing the vocabulary of different graded-reading schemes. Reading in a Foreign Language, 20(1), 43–69. https://nflrc.hawaii.edu/rfl/April2008/wanarom/wanarom.pdf

Wordsmith Tools (Version 4.0) [Computer software]. Lexical Analysis Software and Oxford University Press. Available at: https://www.lexically.net/wordsmith/version4/