Semantic patterns: Presenting a quantitative usage-based method for the description of grammatical and lexical meaning.
Dylan Glynn (Université Paris 8, Saint-Denis)
Jan 28, 2020
This talk presents what is commonly called the Behavioural Profile Approach. This approach is a simple methodology that grew out of early research in Cognitive Linguistics in Europe in the 1980s and 1990s (Dirven et al. 1982; Rudzka-Ostyn 1989, 1995; Schmid 1993; Geeraerts et al. 1994, 1999; Gries 1999). Traditional approaches to corpus data are restricted to formal patterns of relative association; this is as true of collocations and collostructions as it is of latent semantic and high-dimensional (vector) analysis. In contrast, early cognitive linguists decided to adopt corpus methods but to apply manual semantico-pragmatic analysis to the data. In other words, the approach looks for patterns in meaning rather than in form.
The method enjoys some popularity (q.v. Gries & Stefanowitsch 2006; Stefanowitsch & Gries 2006; Glynn & Fischer 2010; Glynn & Robinson 2014) and has since been developed independently in various disciplines and for various purposes (q.v. Glynn 2014). Within the cognitive framework, it has especially been applied to lexical polysemy, lexical near-synonymy and morpho-syntactic alternations, but also to non-observable language functions such as stance taking and mirativity, or language-encoded concepts such as ANGER or FEMININITY. For the sake of simplicity, the presentation will focus on the polysemy of a simple spatial preposition in English (over) (q.v. Glynn 2015). However, frequent asides will be made on the method's applicability to other linguistic phenomena such as syntax and morphology.
The method is entirely bottom-up and usage-based, and it produces non-discrete quantitative descriptions of language that can be tested for predictive accuracy. This not only offers a way of countering the subjective nature of the analysis but also permits the descriptive and predictive adequacy of competing functional explanations of language to be quantitatively compared.
The Behavioural Profile Approach
A. Methodological steps
a. If the object of study is formal, the method adopts standard corpus techniques save that, rather than examining an entire corpus, a subsample of occurrences is extracted.
b. If the object of study is functional, manual tokenisation is needed. This highly subjective and laborious task is a fundamental limiting factor but is appropriate in certain circumstances.
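As an illustration of step (a), the extraction of a subsample can be sketched as follows. This is a minimal sketch, not the procedure used in the studies cited: the helper `sample_occurrences`, the toy corpus lines and the sample size are all hypothetical, and a real study would work from a proper concordance rather than a list of sentences.

```python
import random
import re

def sample_occurrences(lines, pattern, k, seed=0):
    """Return up to k randomly chosen lines that contain the target form."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    hits = [line for line in lines if regex.search(line)]
    rng = random.Random(seed)  # fixed seed so the subsample can be reproduced
    return rng.sample(hits, min(k, len(hits)))

# Invented example sentences standing in for corpus lines.
corpus_lines = [
    "The plane flew over the city.",
    "She lives over the road.",
    "The meeting is over.",
    "He put a blanket over the child.",
    "They walked under the bridge.",
    "We talked it over yesterday.",
]

# Draw three occurrences of the word form "over" for manual annotation.
subsample = sample_occurrences(corpus_lines, r"\bover\b", k=3)
for line in subsample:
    print(line)
```

Fixing the random seed is a design choice made here so that the same subsample can be handed to several annotators.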
Each occurrence is manually annotated for a range of features that are believed to explain the use of the object of study. The idea of features harks back to Structuralist research on semantic components, such as Apresjan (1974), Lutzeier (1981) and Lipka (1992). Importantly, it is the use of the linguistic form that is analysed, not its meaning. Indeed, in situations where multiple forms are considered, the form can be hidden from the analyst to make sure this is the case. Various heuristics have been developed to assist in this stage, such as the use of Likert scales, Kappa scores and the suppression of the linguistic form under investigation, but also the application of NSM, Frame Semantics and even traditional tests such as substitution. The repeated analysis and manual annotation of contextualised examples results in a large array of semantico-pragmatic features of use. This multidimensional table represents a detailed description of the use (behaviour) of a linguistic form or function.
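To make the mention of Kappa scores concrete, the sketch below computes Cohen's kappa for two annotators who have coded the same occurrences for one feature. It is only an illustration: the feature ("landmark contact"), its values and the ten annotations are invented, and Cohen's kappa is assumed here as one common choice among several agreement measures.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same occurrences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of occurrences given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' marginal proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical annotations of ten occurrences of "over" for an invented
# feature "landmark contact".
annotator_1 = ["contact", "contact", "none", "none", "contact",
               "none", "contact", "none", "none", "contact"]
annotator_2 = ["contact", "none", "none", "none", "contact",
               "none", "contact", "none", "contact", "contact"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

In this toy case the annotators agree on eight of ten occurrences, while chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.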
Because of its complexity, patterns (usage-based structure) cannot be identified in the resulting table by inspection; for this, multivariate statistics are needed. Originally, the Cognitive Linguistic tradition drew heavily on the variationist sociolinguistic tradition in its use of multivariate statistics, especially in the form of Logistic Regression and various classification and dimension reduction techniques. Today, researchers within the Behavioural Profile Approach employ the full gamut of quantitative tools and arguably represent part of the vanguard in the application of statistics, especially categorical methods, to the social sciences.
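The following sketch shows, under strong simplifying assumptions, how an annotated table can be modelled with logistic regression and checked for predictive accuracy. Everything in it is hypothetical: the two features (motion, contact), the two senses of "over" (across, above) and the counts are invented, and the model is fitted with a plain gradient-descent loop rather than a statistics package. Accuracy is computed on the same data used for fitting, which a real study would replace with held-out data or cross-validation.

```python
import math

# Hypothetical annotated occurrences of "over": each tuple is
# (motion feature, contact feature, sense assigned by the analyst).
# Features, senses and counts are invented for illustration only.
occurrences = (
    [("motion", "no_contact", "across")] * 4
    + [("motion", "no_contact", "above")] * 1
    + [("motion", "contact", "across")] * 2
    + [("motion", "contact", "above")] * 1
    + [("static", "no_contact", "across")] * 1
    + [("static", "no_contact", "above")] * 4
    + [("static", "contact", "above")] * 2
)

def encode(motion, contact):
    """Turn two categorical features into [intercept, motion, contact]."""
    return [1.0, float(motion == "motion"), float(contact == "contact")]

def predict_proba(weights, x):
    """Predicted probability of the sense 'across' for one occurrence."""
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, learning_rate=0.5, epochs=3000):
    """Fit a binary logistic regression by batch gradient descent."""
    weights = [0.0] * len(X[0])
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, target in zip(X, y):
            error = predict_proba(weights, x) - target
            for j, value in enumerate(x):
                gradient[j] += error * value
        weights = [w - learning_rate * g / len(X)
                   for w, g in zip(weights, gradient)]
    return weights

X = [encode(m, c) for m, c, _ in occurrences]
y = [1.0 if sense == "across" else 0.0 for _, _, sense in occurrences]

weights = fit_logistic(X, y)
predictions = [predict_proba(weights, x) >= 0.5 for x in X]
accuracy = sum(p == bool(t) for p, t in zip(predictions, y)) / len(y)
# Baseline: always guess the more frequent sense.
baseline = max(sum(y), len(y) - sum(y)) / len(y)

print("weights (intercept, motion, contact):", [round(w, 2) for w in weights])
print("accuracy:", round(accuracy, 2), "baseline:", round(baseline, 2))
```

Comparing the model's accuracy with the majority-sense baseline gives a first indication of whether the annotated features explain the choice of sense; competing sets of features can be compared in the same way.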
B. Theoretical Assumptions
1. This method assumes the garden-path or usage-based model of grammar, in which grammar (langue / competence) is largely the result of usage (not the other way round).
2. Structure in language lies primarily in the function rather than the form. In other words, grammatical rules are not formal but functional (conceptual).
3. This method assumes that language structures tend to be non-discrete. This is largely a result of assumptions 1 and 2.
4. In situations where a discrete formal explanation (descriptively and predictively adequate) of a language phenomenon is possible, this method is not needed.
C. Pros and Cons
Although there are many pros and cons with any method, two of the most important issues here are that the analysis / annotation is manual and that it is subjective.
Relative to corpus linguistics and machine learning, manual annotation means a small sample size and therefore low representativity.
But relative to discourse analysis and traditional functional linguistics, it means a large sample size and therefore high representativity.
Human error and bias can greatly affect the results.
But direct analysis of the object of study (its meaning) can be more precise and less "noisy" than indirect methods that rely on formal patterns. Moreover, for a functional linguist, these formal patterns are only indirect indices of structure, and their interpretation in functional terms is necessarily highly subjective.
Apresjan, J. 1974. Лексическая Семантика. Синонимические средства языка [Lexical semantics: Synonymous foundations of language]. Moscow: Nauka.
Dirven, R., Goossens, L., Putseys, Y., & Vorlat, E. 1982. The Scene of Linguistic Action and its Perspectivization by speak, talk, say, and tell. Amsterdam: John Benjamins.
Geeraerts, D., Grondelaers, St., & Bakema, P. 1994. The Structure of Lexical Variation: Meaning, naming, and context. Berlin: Mouton de Gruyter.
Geeraerts, D., Grondelaers, St., & Speelman, D. 1999. Convergentie en divergentie in de Nederlandse woordenschat. Amsterdam: Meertens Instituut.
Glynn, D. 2014. Polysemy and synonymy: Corpus method and cognitive theory. In D. Glynn & J. Robinson (eds.), Corpus Methods for Semantics: Quantitative studies in polysemy and synonymy, 7–38. Amsterdam: John Benjamins.
Glynn, D. 2015. Semasiology and onomasiology: Empirical questions between meaning, naming and context. In J. Daems, et al. (eds.), Change of paradigms – New Paradoxes: Recontextualizing Language and Linguistics, 47–79. Berlin: Mouton de Gruyter.
Glynn, D. & Fischer, K. 2010. Quantitative Cognitive Semantics: Corpus-driven approaches. Berlin: Mouton de Gruyter.
Glynn, D. & Robinson, J. 2014. Corpus Methods for Semantics: Quantitative studies in polysemy and synonymy. Amsterdam: John Benjamins.
Gries, St. Th. 1999. Particle movement: A cognitive and functional approach. Cognitive Linguistics 10: 105–145.
Gries, St. Th. 2003. Multifactorial Analysis in Corpus Linguistics: A study of particle placement. London: Continuum.
Gries, St. Th. & Stefanowitsch, A. 2006. Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis. Berlin: Mouton de Gruyter.
Lipka, L. 1992. An Outline of English Lexicology. Tübingen: Max Niemeyer.
Lutzeier, P. 1981. Wort und Feld. Wortsemantische Fragestellungen mit besonderer Berücksichtigung des Wortfeldbegriffs. Tübingen: Max Niemeyer.
Rudzka-Ostyn, B. 1989. Prototypes, schemas, and cross-category correspondences: The case of ask. In D. Geeraerts (ed.), Prospects and Problems of Prototype Theory, 613–661. Berlin: Mouton de Gruyter.
Rudzka-Ostyn, B. 1995. Metaphor, schema, invariance: The case of verbs of answering. In L. Goossens, et al. (eds.), By Word of Mouth: Metaphor, metonymy, and linguistic action from a cognitive perspective, 205–244. Amsterdam & Philadelphia: John Benjamins.
Schmid, H.-J. 1993. Cottage, idea, start: Die Kategorisierung als Grundprinzip einer differenzierten Bedeutungsbeschreibung. Tübingen: Max Niemeyer.
Stefanowitsch, A. & Gries, St. Th. 2006. Corpus-Based Approaches to Metaphor and Metonymy. Berlin: Mouton de Gruyter.