Detecting Latent Topics for Categorizing Educational Dissertations of Leading Universities in the United States and Thailand: Structural Topic Modeling Analysis
Abstract
This research aimed to examine and compare the latent topics of educational dissertations from leading universities in Thailand and the U.S. The dataset was acquired through web scraping and comprised the titles and abstracts of educational dissertations from the five-year period 2019 to 2023. Specifically, we gathered 435 records in Thai from the Chulalongkorn University Institutional Repository (CUIR) and 363 records in English from the databases of the top 10 U.S. universities. The dataset was cleaned and preprocessed using Python's Natural Language Toolkit (NLTK) library. We then applied topic modeling, focusing on the Latent Dirichlet Allocation (LDA) method, the BERTopic library, and Uniform Manifold Approximation and Projection (UMAP). The findings show clear differences between the topics explored in educational dissertations in Thailand and those at top-tier U.S. universities. In Thailand, dissertation topics predominantly revolved around the development of teaching methodologies, the enhancement of student learning, and the resolution of teacher development issues. In contrast, dissertations from U.S. institutions covered a broader spectrum of themes, including area-based education, the development of ethnic and racial identity in students, and strategies for promoting equality among learners with diverse backgrounds. These outcomes point to different academic emphases and research priorities in the two educational landscapes.
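The preprocessing and LDA steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the study reports using NLTK for preprocessing, whereas this example relies only on the Python standard library, a tiny hand-written stop-word list, and a collapsed Gibbs sampler, which is one common way to fit LDA and may differ from the inference method actually used. The function names `tokenize` and `fit_lda`, the hyperparameter values, and the four sample abstracts are all hypothetical and were made up for illustration.

```python
import random
import re

# Tiny illustrative stop-word list (an assumption; the study used NLTK's resources).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to a list of token lists with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions; each row sums to 1.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # The three most frequently assigned words for each topic.
    top_words = [
        [vocab[i] for i in sorted(range(vocab_size), key=lambda i: -topic_word[k][i])[:3]]
        for k in range(num_topics)
    ]
    return theta, top_words


# Made-up sample abstracts, loosely echoing the themes named in the abstract.
abstracts = [
    "Development of a teaching model to enhance student learning",
    "Teacher development and learning enhancement in schools",
    "Ethnic and racial identity development of students",
    "Promoting equality for learners with diverse racial backgrounds",
]
docs = [tokenize(a) for a in abstracts]
theta, top_words = fit_lda(docs)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

With a corpus this small the resulting topics carry no meaning; the sketch only shows the shape of the computation. In practice, an established implementation and a language-appropriate tokenizer (Thai text in particular cannot be split on whitespace and letters as done here) would be the more reasonable choice.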
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The Journal of Information and Learning is operated by the Office of Academic Resources, Prince of Songkla University. All articles published in the journal are protected by Thai copyright law. This copyright covers the exclusive rights to share, reproduce, and distribute the article, including in electronic forms, reprints, translations, photographic reproductions, or similar. Copyright in each work is held by its authors as well as by the Office of Academic Resources. The Journal reserves the right to edit the language of papers accepted for publication for clarity and correctness, as well as to make formal changes to ensure compliance with the journal's guidelines. All authors must take public responsibility for the content of their paper.