Correlations between prosodic units and voice
expression levels in speakers’ attitude (Massimo Moneglia
– Università di Firenze)
The paper presents the results of corpus-based
research on the Italian spoken corpus C-ORAL-ROM (Cresti & Moneglia, 2005), plus
some supplementary data from psychiatric interviews, and focuses on the
expression of speakers’ attitudes through prosodic cues in these corpus data.
The research benefits from the annotation of terminal and
non-terminal prosodic breaks in the corpus, which is independently motivated
with respect to the domain under investigation. In this framework, terminal breaks
are assumed to mark utterance boundaries. Utterances are assumed to correspond to
a speech act, i.e. the minimal linguistic entity with a pragmatic value.
Non-terminal breaks are assumed to mark the information units of the utterance, i.e.
the minimal linguistic entity determined by a semantic program which is also
characterized by an information function (Cresti, 2000).
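The two-level segmentation described above can be illustrated with a minimal sketch. It assumes, for illustration only, that terminal breaks are transcribed as "//" and non-terminal breaks as "/"; the function name segment and the sample string are hypothetical and are not taken from the corpus or its tools.

```python
TERMINAL = "//"      # assumed symbol for a terminal break (utterance boundary)
NON_TERMINAL = "/"   # assumed symbol for a non-terminal break (information unit boundary)

def segment(transcript):
    """Split a transcript into utterances, each given as a list of information units."""
    utterances = []
    # Terminal breaks are split first, so "//" is never read as two "/" symbols.
    for chunk in transcript.split(TERMINAL):
        units = [unit.strip() for unit in chunk.split(NON_TERMINAL)]
        units = [unit for unit in units if unit]  # drop empty fragments
        if units:
            utterances.append(units)
    return utterances

# Invented sample string, not corpus data.
example = "well / I told him yesterday // he did not answer //"
print(segment(example))
```

Each inner list corresponds to one utterance (one speech act), and each string within it to one information unit, which is the level at which affective marking is discussed in this paper.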
Speakers’ attitudes are divided into two main
classes according to independently motivated neuropsychological features (Damasio,
2000):
Emotions: a collection of automatic, unconscious and uncontrolled neural responses
to a stimulus, forming a subcortical representation in an organism;
Expression of affective states: the intentional activation
of a sensory-motor schema in connection with a representation achieved by a
subject in its relation with an interlocutor.
The paper details the main correlations between
attitudes and the prosodic entities marked in the corpus and focuses on their
prosodic, semantic and pragmatic characteristics.
The vocal features of emotions are properties of the
utterance, since utterance boundaries correlate specifically with the macro-prosodic
properties characterizing the expression of emotions (high and low activation; Scherer,
2003; Magno Caldognetto, 2002). Emotional features of an utterance can be
appreciated only in the context of the dialogic turn and are therefore
relational. Emotional features are qualities shared by all information
units and all words in the utterance. From this point of view, the utterance
is subject to a coherence requirement
with respect to emotion.
From a semantic point of view, emotions belong to a
restricted set of semantic types (a closed tag set of Universal, Social and Basic emotions)
and they show a high inter-subjective recognition rate.
From a pragmatic point of view, the expression of
emotions does not modify the pragmatic value of the speech act, whose
performance is not a function of the expressed emotional attitude; i.e. the same
speech act type can be performed regardless of the emotional values it expresses.
From a quantitative point of view, emotions are
sporadic in the corpus and emerge only when the speech performance is connected
to strong stimuli. In brief, the expression of emotion is not a means of
reaching ordinary communicative goals in human communication and does not
characterize the spoken performance.
The expression of affective attitudes by the
speaker is, on the contrary, a property of the information unit. A single
utterance can be marked by more than one affective attitude.
From an acoustic point of view, this marking concerns
one word (or a co-articulated word sequence) and is realized through a perceptually
relevant prosodic variation in the syllabic structure (emphasis).
From a semantic point of view, the value associated with
the expression of affective attitudes is not well defined and, although
different hearers generally agree on the existence of such a marking, they are unable
to define its nature precisely (open tag set, low inter-rater agreement on
value assignment).
From a pragmatic point of view, when the expression of
affective states concerns the information units bearing the illocutionary value of
the utterance, it can interact with the assignment of illocutionary value to
the utterance. For instance, the expressive or assertive nature of a speech act
can be called into question.
From a quantitative point of view, the prosodic marking
of wordings expressing affective values is a pervasive feature of spoken
communication. Since it is connected to the lexical choice made by the
speaker, this prosodic marking specifies the speaker’s attitude toward
the content of the information unit or, more generally, the speaker’s point of view on
the represented content. In this sense the expression of affective attitude is
strictly linked to the modalization of the utterance; however, while modal
categories form a restricted set defined in the grammar of the language, the set of
prosodic “modalizations” is an open set.
When two wordings in a continuous string are marked by
different affective attitudes through prosodic cues, the hearer tends to perceive
two separate prosodic units. This is consistent with the idea that information
units are the minimal domain of modalization and that one information unit may
have only one modal value (compositionality of modality within the locution,
Tucci 2009).