Vocalization

From emcawiki
Encyclopedia of Terminology for CA and IL: Vocalization
Author(s): Elisabeth Reber (University of Bonn & University of Würzburg, Germany)
To cite: Reber, Elisabeth. (2023). Vocalization. In Alexandra Gubina, Elliott M. Hoey & Chase Wesley Raymond (Eds.), Encyclopedia of Terminology for Conversation Analysis and Interactional Linguistics. International Society for Conversation Analysis (ISCA). DOI: []


In the seminal work by Erving Goffman, the term ‘vocalization’ is defined as a type of “non-words” or “interjections” (Goffman 1978: 810). These vocalizations include, e.g., “revulsion sounds” (Eeuw!), “strain grunts”, “pain cries” (Ouch!, Oww!), and “audible glees” (Oooooo!, Wheee!). Filled pauses (e.g., ah, uh, um) are classified as “subvocalizations” (Goffman 1978: 806). Vocalizations and subvocalizations of this kind serve as “response cries”, i.e., conventionalized, ritualized non-verbal displays which speakers can deploy strategically for socio-communicative functions, independently of their inner states (Goffman 1978: 805-807).

Early studies in Conversation Analysis have followed up on Goffman’s work. The study of vocalizations in social interaction has provided empirical evidence for their situated sequential positioning and specific interactional and sequence-organizing functions (Schegloff 1982 on the continuers uh huh, mm hmm and yeah; see also Heritage 1984 on the change-of-state token oh; Jefferson 1984 on the acknowledgment tokens mm hmm, yeah). It is subject to debate whether vocalizations can be treated as full turns at talk (Keevallik 2014; Reber & Couper-Kuhlen 2020; Schegloff 1982).

The use of the term ‘vocalization’ sometimes overlaps with that of, e.g., ‘response token’, ‘particle’, ‘vocal tract sound’, ‘liminal signs’ or ‘sound object’; see Dingemanse (2020), Keevallik & Ogden (2020), and Reber (2012) for discussion. The term ‘response token’ includes vocalizations serving a responsive function, e.g., the continuers Mm hm and Uh huh (Gardner 2001). Interjections, e.g., oh, have been treated as kinds of particles (e.g., Heritage 1984; Thompson et al. 2015), but in linguistics this classification is controversial (e.g., Ameka 1992). The notions ‘vocal tract sound’, ‘liminal signs’, and ‘sound object’ are indicative of diverse positions with respect to the potential conventionalization and arbitrariness of vocalizations, and more specifically as to whether non-lexical signs, like lexemes, can be described as arbitrary form-function pairings. The ‘vocal tract sound’ (rather than, e.g., the non-lexical sound) roughly corresponds to non-lexical vocalizations in Goffman’s sense but marks a position which “[chooses] to be agnostic about the linguistic status of the sounds produced in the human vocal tract” (Keevallik & Ogden 2020: 2). Crucially, ‘vocal tract sounds’, in contrast to words, do not involve “more or less arbitrary form-function packages” (Keevallik & Ogden 2020: 2). In comparison, the term ‘sound object’ (Reber & Couper-Kuhlen 2010; Reber 2009, 2012) is informed by the empirical observation that both lexical and non-lexical vocalizations can be produced with “a rather fixed bundle of prosodic-phonetic properties” (Reber 2012: 76) and context-specific interactional and social functions, i.e., recurrent pairings of form and function. In this vein, sound objects comprise a continuum of “spoken communicative signs” including nonlinguistic and lexical resources at the two extreme poles (Reber & Couper-Kuhlen 2020: 184).
The notion of ‘liminal signs’ attempts to stress that there “are signs that derive interactional utility from being ambiguous with regard to conventionality, intentionality, and accountability” (Dingemanse 2020: 191). Liminal signs comprise vocal resources (e.g., coughs, sighs, and inbreaths) as well as visual resources (e.g., “winks, nosewrinkles, and ‘thinking’ facial expressions”; Dingemanse 2020: 191). Recent work has revisited the study of non-lexical vocalizations in embodied interaction and addressed questions pertaining to their conventionalization. It has been shown that non-lexical vocalizations may be composed of “syllables that express simultaneous body movement rather than abstract lexical content”, e.g., moans or qa qu qa: in dance classes to demonstrate the steps (Keevallik 2021; see also Hofstetter 2020; Keevallik & Ogden 2020). The meaning of these vocalizations is highly indexical; they may be invented spontaneously and be only partly conventionalized. They can be functional in depicting bodily movements and tend to perform response cries in Goffman’s (1978) sense.

The following excerpt (1), taken from a board game interaction (Hofstetter 2020: 46), demonstrates a conventionalized vocalization oh (line 9) and, as Hofstetter argues, a relatively unconventionalized vocalization, a turn-initial moan (line 11). Both are produced in a slot after player AD says Blitz, ending the round (line 3). The analysis centers on the context-specific uses of moans, which show variation in their sound form but systematic sequential positioning as well as interactional and social functions. (The conventionalized uses of affect-laden oh in terms of form and function are well-researched; e.g., Couper-Kuhlen 2009; Reber 2012.)


(1) [180815 Dutch Blitz_0:8:03] (Hofstetter 2020: 46)



The varying sound forms of moans comprise:

  • “multivocalic, using central and back, and open to midclose vowels (the loudest and most prolonged vowels in any given moan were produced close to one of [ɐ],[ɑ], or [o̞])”
  • “prolonged duration, creaky voice, and low and/or falling pitch contour.” (Hofstetter 2020: 45).

Sequential position:

  • “a sound produced after a game event” (Hofstetter 2020: 62)

Interactional function:

  • “receipts the event as complete and valid” (Hofstetter 2020: 62)

Social function:

  • “expression of suffering” (Hofstetter 2020: 62)

Turn expansion:

  • “often followed by a downgraded, lexical utterance that restates the reaction in a way that displays resistance to trouble or willingness to proceed” (Hofstetter 2020: 62)


Additional Related Entries:


Cited References:

Couper-Kuhlen, E. (2009). A sequential approach to affect: The case of ‘disappointment’. In Haakana, M., Laakso, M. & Lindström, J. (Eds.), Talk in Interaction: Comparative Dimensions (pp. 94–123). Finnish Literature Society (SKS).

Dingemanse, M. (2020). Between sound and speech: Liminal signs in interaction. Research on Language and Social Interaction, 53, 188–196.

Gardner, R. (2001). When Listeners Talk: Response Tokens and Listener Stance. John Benjamins.

Goffman, E. (1978). Response cries. Language, 54, 787–815.

Heritage, J. (1984). A change-of-state token and aspects of its sequential placement. In Atkinson, J. M. & Heritage, J. (Eds.), Structures of Social Action: Studies in Conversation Analysis (pp. 299–345). Cambridge University Press.

Hofstetter, E. (2020). Nonlexical “moans”: Response cries in board game interactions. Research on Language and Social Interaction, 53, 42–65.

Jefferson, G. (1984). Notes on a systematic deployment of the acknowledgement tokens “yeah” and “mm hm”. Papers in Linguistics, 17, 197–216.

Keevallik, L. (2014). Turn organization and bodily-vocal demonstrations. Journal of Pragmatics, 65, 103–120.

Keevallik, L. (2021). Vocalizations in dance classes teach body knowledge. Linguistics Vanguard, 7(4), 20200098.

Keevallik, L. & Ogden, R. (2020). Sounds on the margins of language at the heart of interaction. Research on Language and Social Interaction, 53, 1–18.

Schegloff, E. (1982). Discourse as an interactional achievement: Some uses of ‘uh huh’ and other things that come between sentences. In Tannen, D. (Ed.), Analyzing Discourse: Text and Talk (pp. 71–93). Georgetown University Press.

Reber, E. (2009). Zur Affektivität in englischen Alltagsgesprächen. In Buss, M., Habscheid, St., Jautz, S., Liedtke, F. & Schneider, J. G. (Eds.), Theatralität des sprachlichen Handelns. Eine Metaphorik zwischen Linguistik und Kulturwissenschaften (pp. 193–215). Fink-Verlag.

Reber, E. (2012). Affectivity in Interaction: Sound objects in English. John Benjamins.

Reber, E. & Couper-Kuhlen, E. (2010). Interjektionen zwischen Lexikon und Vokalität: Lexem oder Lautobjekt? In Deppermann, A. & Linke, A. (Eds.), Sprache intermedial: Stimme und Schrift, Bild und Ton (pp. 69–96). de Gruyter.

Reber, E. & Couper-Kuhlen, E. (2020). On ‘whistle’ sound objects in English everyday conversation. Special Issue ‘Sounds on the Margins of Language’ (L. Keevallik & R. Ogden, Eds.), Research on Language and Social Interaction, 53, 164–187.


Additional References:

Brandenberger, Ch. & Hottiger, Ch. (2018). Sharing perception when using hands-on exhibits in science centres: The case of vocal depiction. Revue Tranel, 68, 59–68.

Li, X. (2020). Click-initiated self-repair in changing the sequential trajectory of actions-in-progress. Research on Language & Social Interaction, 53, 90–117.

Mondada, L. (2020). Audible sniffs: Smelling-in-interaction. Research on Language and Social Interaction, 53, 140–163.

Ogden, R. (2020). Audibly not saying something with clicks. Research on Language and Social Interaction, 53, 66–89.

Pehkonen, S. (2020). Response cries inviting an alignment: Finnish 'huh huh'. Research on Language & Social Interaction, 53, 19–41.

Tolins, J. (2013). Assessment and direction through nonlexical vocalizations in music instruction. Research on Language and Social Interaction, 46, 47–64.

