Tambov
All-Russian academic journal
“Issues of Cognitive Linguistics”

CLUSTERING AS AN INSTRUMENT FOR ENGLISH VERBS CLASSIFICATION (based on the English Verbs of Putting)


Authors:  R.R. Airapetyan (Markaryan), O.A. Alimuradov

Affiliation:  Pyatigorsk State Linguistic University

Abstract

The article discusses the classification of the Verbs of Putting by means of cluster analysis and other mathematical instruments, together with definitional and component semantic analyses as means of formalizing verb semantics.
The choice of methods reflects the growing interest among modern researchers in applying mathematical instruments to linguistics. In particular, we have applied the clustering method described in a number of works to B. Levin’s verb classification.
We describe in detail the clustering of the verb group under study and interpret the results. We also propose a metric of similarity between classifications, which allows a formalized comparison of the results produced by different approaches to verb classification. We conclude that the difference between methodological approaches (component and definitional analyses vs. diathesis alternations) yields distinctly different results.
The chosen methodology therefore plays a significant role in obtaining results relevant to the aims of the research.

Keywords:  verb, classification, clustering, semantic component, cluster, definitional analysis, component analysis

References
Ayrapetyan, R.R. Prototipicheskaya model' glagolov gruppy «Verbs of Putting», osnovannaya na komponentnom analize semantiki. Vestnik Pyatigorskogo gosudarstvennogo lingvisticheskogo universiteta, 2013, 4, 53-59.
Filippov, A.K. Interpretatsiya distributsiy glagol'nykh kontekstov v kachestve manifestatsii struktury leksiko-semanticheskikh grupp raznykh tipov: na primere gruppy glagolov polozheniya v prostranstve i glagolov myshleniya: dis. ... kand. filol. nauk. SPb., 2011.
Baeza-Yates, R.A. Introduction to data structures and algorithms related to information retrieval. Information Retrieval: Data Structures and Algorithms. URL: http://ru.scribd.com/doc/13742235/Information-Retrieval-Data-Structures-Algorithms-William-B-Frakes
Brew, Ch., & Schulte im Walde, S. (2002). Spectral clustering for German verbs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 117-124). Philadelphia, PA. 
Carroll, G., & Rooth, M. (1998). Valence induction with a head-lexicalized PCFG. In Proceedings of the 3rd Conference on Empirical Methods in Natural Language Processing (pp. 36-45).
Clark, S., & Curran, J.R. (2007). Formalism-independent parser evaluation with CCG and DepBank. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (pp. 248-255). 
Clear, J.H. (1993). The British national corpus. In The Digital Word: Text-based Computing in the Humanities (pp. 163-187). Cambridge, MA, USA: MIT Press. 
Cruse, D.A. (1986). Lexical Semantics. Cambridge, England: Cambridge University Press. 
Deodhare, D., Sharma, G., Srivastava, A., & Sharma, A. Semantically Driven Soft-clustering of Documents using Lexical Chains. In Proceedings of ICON-2010: 8th International Conference on Natural Language Processing. URL: http://www.academia.edu/1979496/Semantically_Driven_Soft-clustering_of_Documents_using_Lexical_Chains
Dorr, B. (1997). Large-scale dictionary construction for foreign language tutoring and interlingual machine translation. Machine Translation, 12 (4), 271-325.
Genkin, A., Lewis, D.D., & Madigan, D. Large-scale Bayesian logistic regression for text categorization. URL: http://sydney.edu.au/engineering/it/~comp5318/survey/logisticregression.pdf
Graff, D. (2003). English Gigaword. Linguistic Data Consortium, Philadelphia.
Jain, A.K., & Dubes, R.C. (1988). Algorithms for Clustering Data. NJ: Prentice-Hall, Inc.
Joanis, E. (2002). Automatic verb classification using a general feature space: Master’s thesis. University of Toronto. 
Joanis, E., & Stevenson, S. (2003). A general feature space for automatic verb classification. In Proceedings of the 10th Conf. of the EACL (pp. 163-170).
Joanis, E., Stevenson, S., & James, D. (2006). A general feature space for automatic verb classification. Natural Language Engineering, 14 (03), 337-367.
King, B. (1967). Step-wise clustering procedures. J. Am. Stat. Assoc., 69.
Kogan, J., Nicholas, C., & Teboulle, M. Clustering Large and High Dimensional Data. URL: http://www.csee.umbc.edu/nicholas/clustering/tutorial.pdf
Korhonen, A., Krymolowski, Y., & Marx, Z. Clustering polysemic subcategorization frame distributions semantically. URL: http://aclweb.org/anthology//P/P03/P03-1009.pdf
Korhonen, A., & Briscoe, T. (2004). Extended lexical-semantic classification of English verbs. In Workshop on Computational Lexical Semantics (pp. 38-45). Boston, Massachusetts, USA: Association for Computational Linguistics. 
Langacker, R. (1976). Semantic Representations and the Linguistic Relativity Hypothesis. In Foundations of Language (pp. 307-357).
Lapata, M., & Brew, C. (2004). Verb class disambiguation using informative priors. Computational Linguistics, 30 (2), 45-73.
Levin, B. (1993). English Verb Classes and Alternations: A Preliminary Investigation. Chicago, IL: University of Chicago Press. 
Li, J., & Brew, C. Which Are the Best Features for Automatic Verb Classification. URL: http://www.aclweb.org/anthology/P/P08/P081050.pdf
Li, J., & Brew, C. Disambiguating Levin verbs using untagged data. URL: http://www.ling.ohio-state.edu/~jianguo/papers/LiBrew.pdf
Li, J., & Brew, C. Which are the best features for automatic verb classification. URL: http://aclweb.org/anthology//P/P08/P08-1050.pdf
Medelyan, O. (2007). Computing Lexical Chains with Graph Clustering. In Proceedings of the ACL 2007. Student Research Workshop (pp. 85-90). Prague. 
Merlo, P., & Stevenson, S. (2001). Automatic verb classification based on statistical distribution of argument structure. Computational Linguistics, 27,  373-408.
Murtagh, F. (1984). A survey of recent advances in hierarchical clustering algorithms which use cluster centers. Computer Journal, 26, 354-359.
Nagy, G. (1968). State of the art in pattern recognition. In Proceedings of the Institute of Electrical and Electronics Engineers 56 (pp. 836-862).
O’Seaghdha, P.G., & Marin, J.W. (1997). Mediated semantic phonological priming: Calling distant relatives. Journal of Memory and Language, 36 (2), 226-252.
Prasada, S., & Pinker, S. (1993). Generalisation of regular and irregular morphological patterns. Language and Cognitive Processes, 8 (1), 1-56.
Radford, A. (1997). Syntactic theory and the structure of English: A minimalist approach. Cambridge, England: Cambridge University Press.
Schulte im Walde, S., & Brew, C. (2002). Inducing German semantic verb classes from purely syntactic subcategorisation information. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 223-230). Philadelphia, PA. 
Schulte im Walde, S. (2003). Experiments on the choice of features for learning verb classes. In Proceedings of EACL (pp. 315-322). 
Schulte im Walde, S. (2000). Clustering verbs semantically according to their alternation behavior. In Proceedings of the 18th International Conference on Computational Linguistics (pp. 747-753).
Sneath, P.H.A., & Sokal, R.R. (1973). Numerical Taxonomy. London, UK: Freeman.
Tishby, N., Pereira, F.C., & Bialek, W. (1999). The information bottleneck method. In Proceedings of the 37th Annual Allerton Conference on Communication, Control and Computing (pp. 368-377).

Pages:  96-109
