Corpus Linguistics and Cinematic Discourse: Lexical Bundles in Mainstream Film Scripts

Runze Xu
Raksangob Wijitsopon


Hollywood blockbuster films have long attracted not only mass audiences but also scholarly attention. In line with current applied linguistics interest in telecinematic discourse, the present study draws on concepts and techniques from corpus linguistics to describe the language of American mainstream film scripts. The concept of lexical bundles was used to identify linguistic patterns characteristic of the scripts of American mainstream films, that is, films produced by entertainment conglomerates and popular in the U.S. Results show that American mainstream film scripts are characterized mainly by spoken formulaic expressions. However, descriptive expressions, such as place-referential and action-related lexical bundles, also feature prominently in the register. Further qualitative analysis reveals that these common multi-word expressions contribute functionally to film scripts by creating conflict in plots, supporting characterization, and building engagement with audiences.


Article Details

How to Cite
Xu, R., & Wijitsopon, R. (2023). Corpus Linguistics and Cinematic Discourse: Lexical Bundles in Mainstream Film Scripts. LEARN Journal: Language Education and Acquisition Research Network, 16(1), 545–574.
Research Articles
Author Biographies

Runze Xu, English as an International Language Program, Graduate School, Chulalongkorn University, Thailand

An MA student in the English as an International Language (EIL) Program, Graduate School, Chulalongkorn University. His research interests are telecinematic stylistics and corpus linguistics.

Raksangob Wijitsopon, Department of English and Corpus Linguistics for Digital Humanities Research Unit, Faculty of Arts, Chulalongkorn University, Thailand

An Associate Professor in the Department of English and Head of the Corpus Linguistics for Digital Humanities Research Unit, Faculty of Arts, Chulalongkorn University. Her research interests are corpus linguistics, stylistics, and discourse analysis.

