Correlations between prosodic units and voice expression levels in speakers’ attitude (Massimo Moneglia – Università di Firenze)
The paper presents the results of corpus-based research on the Italian spoken corpus C-ORAL-ROM (Cresti & Moneglia, 2005), plus supplementary data from psychiatric interviews, and focuses on the expression of speakers’ attitudes through prosodic cues in these corpus data.
The research takes advantage of the annotation of terminal and non-terminal prosodic breaks in the corpus, which is motivated independently of the domain under investigation. In this framework, terminal breaks are assumed to mark utterance boundaries. Utterances are assumed to correspond to a speech act, i.e. the minimal linguistic entity with a pragmatic value. Non-terminal breaks are assumed to mark the information units of the utterance, i.e. the minimal linguistic entities determined by a semantic program and characterized by an information function (Cresti, 2000).
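As an illustration of how such an annotation yields the two levels of analysis, the following minimal sketch segments a transcript into utterances and information units. It assumes a transcription convention in which "//" marks a terminal break and "/" a non-terminal break; the function name `segment` and the sample string are hypothetical and are not taken from the corpus.

```python
def segment(transcript: str) -> list[list[str]]:
    """Split a transcript into utterances, each a list of information units.

    Assumed convention (illustrative only): "//" is a terminal break,
    "/" is a non-terminal break.
    """
    utterances = []
    # Terminal breaks delimit utterances.
    for chunk in transcript.split("//"):
        # Non-terminal breaks delimit information units inside an utterance.
        units = [unit.strip() for unit in chunk.split("/") if unit.strip()]
        if units:
            utterances.append(units)
    return utterances


# Hypothetical example: two utterances, the first with two information units.
sample = "yesterday / I saw Maria // she was happy //"
for index, units in enumerate(segment(sample), start=1):
    print(index, units)
```

Each inner list corresponds to one utterance, so properties assumed to hold at utterance level and properties assumed to hold at information-unit level can be annotated on separate layers.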
Speakers’ attitudes are divided into two main classes according to independently motivated neuropsychological features (Damasio, 2000):
Emotions: a collection of automatic, unconscious and uncontrolled neural responses to a stimulus, forming a subcortical representation in an organism;
Expression of affective states: the intentional activation of a sensory-motor schema in connection with a representation formed by a subject in relation to an interlocutor.
The paper details the main correlations between attitudes and the prosodic entities marked in the corpus and focuses on their prosodic, semantic and pragmatic characteristics.
The vocal features of emotions are properties of the utterance, since utterance boundaries correlate specifically with the macro-prosodic properties characterizing the expression of emotions (high and low activation; Scherer, 2003; Magno Caldognetto, 2002). The emotional features of an utterance can be appreciated only in the context of the dialogic turn and are therefore relational. Emotional features are qualities shared by all information units and all words in the utterance. From this point of view, the utterance is subject to a coherence requirement with respect to emotion.
From a semantic point of view, emotions belong to a restricted set of semantic types (a closed tag set of universal, social and basic emotions) and show a high inter-subjective recognition rate.
From a pragmatic point of view, the expression of emotions does not modify the pragmatic value of the speech act, whose performance is not a function of the expressed emotional attitude; i.e. the same speech act type can be performed regardless of the emotional values it expresses.
From a quantitative point of view, emotions are sporadic in the corpus and emerge only when the speech performance is connected to strong stimuli. In brief, the expression of emotion is not a means of reaching ordinary communicative goals in human communication and does not characterize the spoken performance.
The expression of affective attitudes by the speaker is, on the contrary, a property of the information unit. A single utterance can be marked by more than one affective attitude.
From an acoustic point of view, this marking concerns one word (or a co-articulated word sequence) and is realized through a perceptually relevant prosodic variation in the syllabic structure (emphasis).
From a semantic point of view, the value associated with the expression of affective attitudes is not well defined and, although different hearers generally agree on the existence of such marking, they are unable to define its nature precisely (open tag set, low inter-rater agreement on value assignment).
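The contrast between agreeing that a marking exists and agreeing on its value can be quantified with a chance-corrected agreement coefficient. The sketch below computes Cohen's kappa for two raters; the source does not state which agreement measure was used, so this choice is an assumption, and the function name `cohen_kappa` and the toy labels are hypothetical and do not come from the corpus.

```python
from collections import Counter


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same items.

    Illustrative only: the agreement measure used in the study is not
    specified in the text.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (invented for illustration): presence of marking vs. its value.
presence_a = ["marked", "marked", "unmarked", "marked"]
presence_b = ["marked", "marked", "unmarked", "marked"]
value_a = ["irony", "doubt", "none", "surprise"]
value_b = ["doubt", "irony", "none", "doubt"]
print(cohen_kappa(presence_a, presence_b), cohen_kappa(value_a, value_b))
```

With an open tag set, each rater may introduce labels the other never uses, which lowers agreement on value assignment even when agreement on the presence of marking is high.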
From a pragmatic point of view, when the expression of affective states concerns the information units bearing the illocutionary value of the utterance, it can interact with the assignment of illocutionary value to the utterance. For instance, the expressive or assertive nature of a speech act can be called into question.
From a quantitative point of view, the prosodic marking of wordings expressing affective values is a pervasive characteristic of spoken communication. Since it is connected to the lexical choice made by the speaker, this prosodic marking specifies the speaker’s attitude toward the content of the information unit or, more generally, the speaker’s point of view on the represented content. In this sense the expression of affective attitude is strictly linked to the modalization of the utterance; however, while modal categories are restricted in number and defined in the grammar of the language, the set of prosodic “modalizations” is open.
When two wordings in a continuous string are marked by different affective attitudes through prosodic cues, the hearer tends to perceive two separate prosodic units. This is consistent with the idea that information units are the minimal domain of modalization and that one information unit may have only one modal value (compositionality of modality within the locution; Tucci, 2009).