A Verb-Centric Approach to Constructing Thai Concept Frames from an Agricultural Corpus for Computational Language Processing

Mukda Suktarachan
Puttachart Potibal
Natchanan Natpratan

Abstract

This paper presents the construction of Thai concept frames for language processing in the agricultural domain, using a verb-centric approach. Frame elements from Fillmore (1974) and Larson (1984) were selected to represent the scenarios needed in the domain. The source data comprised 5,784 Thai sentences, which were preprocessed with automatic word segmentation, automatic POS tagging, manual shallow parsing, and manual semantic role labeling. The 106 most frequently used verbs were chosen for observation. For each verb, frame elements and syntactic-semantic relations were then extracted from the source data with an automatic capturing tool. As a result, 962 case frames were constructed, and the 5,784 annotated sentences were formulated as syntactic-semantic constraints that can be readily applied to language processing.
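As an illustration only, the capture step described in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' tool: the `Chunk` structure, the function name `capture_case_frames`, the role labels ("Agent", "Patient", "Location"), and the English sample sentences are all hypothetical placeholders. It assumes each sentence has already been segmented, shallow-parsed, and labeled with semantic roles, and that a case frame can be approximated by the ordered pattern of (phrase type, role) pairs around a verb.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One shallow-parsed chunk of an annotated sentence (hypothetical format)."""
    text: str
    phrase: str  # shallow-parse label, e.g. "NP", "VP", "PP"
    role: str    # semantic role label; "Verb" marks the head verb


def capture_case_frames(sentences, top_n):
    """Group (phrase, role) patterns by head verb for the top_n most frequent verbs."""
    verb_counts = Counter()
    frames = defaultdict(Counter)
    for chunks in sentences:
        verbs = [c.text for c in chunks if c.role == "Verb"]
        if len(verbs) != 1:
            continue  # this sketch only handles sentences with one head verb
        verb = verbs[0]
        # The case frame is approximated by the ordered pattern of arguments.
        pattern = tuple((c.phrase, c.role) for c in chunks if c.role != "Verb")
        verb_counts[verb] += 1
        frames[verb][pattern] += 1
    top_verbs = [verb for verb, _ in verb_counts.most_common(top_n)]
    return {verb: dict(frames[verb]) for verb in top_verbs}


# Hypothetical English stand-ins for annotated Thai sentences.
sentences = [
    [Chunk("farmers", "NP", "Agent"), Chunk("plant", "VP", "Verb"),
     Chunk("rice", "NP", "Patient")],
    [Chunk("farmers", "NP", "Agent"), Chunk("plant", "VP", "Verb"),
     Chunk("rice", "NP", "Patient"), Chunk("in fields", "PP", "Location")],
    [Chunk("pests", "NP", "Agent"), Chunk("destroy", "VP", "Verb"),
     Chunk("crops", "NP", "Patient")],
]

case_frames = capture_case_frames(sentences, top_n=1)
for verb, patterns in case_frames.items():
    for pattern, count in patterns.items():
        print(verb, pattern, count)
```

Counting patterns per verb, rather than storing every sentence, keeps distinct frames separate from their frequencies, which is one plausible way to turn annotated sentences into reusable constraints.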

Article Details

Section
Research Articles

References

Potibal, P., et al. (2557 BE [2014]). Research project on constructing Thai concept frames from an agricultural corpus as a knowledge base for automatic text extraction in a Thai question-answering system (in Thai). Bangkok: National Electronics and Computer Technology Center, National Science and Technology Development Agency.

Royal Institute. (2553 BE [2010]). Dictionary of Linguistics Terms (Applied Linguistics) (in Thai). Bangkok: Rung Silp Printing (1977).

Panupong, V. (2532 BE [1989]). The Structure of Thai: Grammatical System (10th ed.) (in Thai). Bangkok: Ramkhamhaeng University.

Burchardt, A., et al. (2009). FrameNet for the semantic analysis of German: Annotation, representation and automation (preprint). In H. C. Boas (Ed.), Multilingual FrameNets in Computational Lexicography: Methods and Applications. Mouton de Gruyter.

Chafe, W. (1970). Meaning and the Structure of Language. Chicago: The University of Chicago Press.

Chomsky, N. (1981). Lectures on Government and Binding. Mouton.

Cook, W. A. (1989). Case Grammar Theory. Washington, DC: Georgetown University Press.

Lavric, E., et al. (2008). The Kicktionary: Combining corpus linguistics and lexical semantics for a multilingual football dictionary. In The Linguistics of Football (Language in Performance 38) (pp. 11-23). Tübingen: Gunter Narr.

Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. Bradford Books.

Fillmore, C. (1968). The Case for Case. In E. Bach and R. T. Harms (Eds.), Universals in Linguistic Theory. New York: Holt, Rinehart and Winston.

Fillmore, C. (1971). Some Problems for Case Grammar. In R. J. O'Brien (Ed.) (pp. 35-56).

Fillmore, C. (1976). Frame semantics and the nature of language. Annals of the New York Academy of Sciences: Conference on the Origin and Development of Language and Speech, 280, 20-32.

Fillmore, C. (1977). The need for a frame semantics in linguistics. In H. Karlgren (Ed.), Statistical Methods in Linguistics. Scriptor.

Fillmore, C. (1982). Frame semantics. In Linguistics in the Morning Calm (pp. 111-137). Seoul, South Korea: Hanshin Publishing.

Fillmore, C. (1985). Frames and the semantics of understanding. Quaderni di Semantica, 6, 222-254.

Fillmore, C. and C. Baker. (2001). Frame Semantics for Text Understanding. Proceedings of WordNet and Other Lexical Resources Workshop, Pittsburgh. NAACL.

Fillmore, C. and C. Baker. (2010). A frames approach to semantic analysis. In B. Heine and H. Narrog (Eds.), Oxford Handbook of Linguistic Analysis (pp. 13-341).

Fillmore, C. J., Petruck, M. R. L., Ruppenhofer, J., and Wright, A. (2003). FrameNet in action: The case of attaching. International Journal of Lexicography, 16, 297-332.

Gruber, J. (1965). Studies in lexical relations. Massachusetts Institute of Technology.

Parunak, H. V. D. (1995). Case Grammar: A Linguistic Tool for Engineering Agent-Based Systems. ITI Technical Memorandum. Industrial Technology Institute, Ann Arbor. Retrieved August 3, 2016, from https://www.iti.org/~van/casegram

Jackendoff, R. (1985). The Role of Thematic Relations in Linguistic Theory. Symposium on Thematic Roles. Brandeis University.

Kawtrakul, A., et al. (2004). MINECoP: An Integrated Visualization Tool for Corpus Mining. IJCNLP 2004, Hainan Island, China.

Larson, M. (1984). Meaning-based translation: A guide to cross-language equivalence. Lanham, MD: University Press of America.

Leenoi, D., S. Jumpathong and T. Supnithi. (2010). Building Thai FrameNet through a Combination Approach. IALP 2010, 277-280.

Ohara, K. H., et al. (2003). The Japanese FrameNet Project: A Preliminary Report. In Proceedings of Pacific Association for Computational Linguistics (PACLING'03) (pp. 249-254). Halifax, Canada, August 2003.

Schmidt, T. (2009). Kicktionary: The multilingual electronic dictionary of football language. Retrieved from https://www.kicktionary.de/index.html

Subirats, C. and Petruck, M. (2003). Surprise: Spanish FrameNet. International Congress of Linguists, Workshop on Frame Semantics, Prague, Czech Republic, July 2003.

Sudprasert, S. and Kawtrakul, A. (2003). Thai Word Segmentation based on Global and Local Unsupervised Learning. NCSEC 2003, Chonburi, Thailand.

Torrent et al. (2014). Copa 2014 FrameNet Brasil: A frame-based trilingual electronic dictionary for the Football World Cup. COLING 2014 (pp. 10-14). Dublin, Ireland, August 23-29.