25 Aug

Films

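Warm film tone: no villains and no obstacles
The photographer and the Kazakh girl
Theme, narrative structure
Narrative structure, character arc
Long takes
Character relationships: friendship
Taiwanese local cinema
Cultural exploration, encounters, friendship
Two people, walking
Showcasing traditional culture
Coming into this world to sing
Moving the flock of sheep
The story of gold prospectors in China (Xinjiang)
Films based on novels
Adapted from true events
Voltando Para Casa / Coming Home (归来) (trailer)
Coming Home (归来)
Hou Hsiao-hsien: Flowers Of Shanghai (海上花)
Be There or Be Square (不见不散, 1998)
Hou Hsiao-hsien, Flowers of Shanghai (海上花)
Documentary sample: Supreme Revenge | FRONTLINE
Taiwanese road movie: 少年阿霸士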

Jia Ling, Hi, Mom (你好,李焕英): time travel as a narrative method

31 May

Acoustic and linguistic factors affecting perceptual dissimilarity judgments of voices

Abstract

The human voice is a complex acoustic signal that conveys talker identity via individual differences in numerous features, including vocal source acoustics, vocal tract resonances, and dynamic articulations during speech. It remains poorly understood how differences in these features contribute to perceptual dissimilarity of voices and, moreover, whether linguistic differences between listeners and talkers interact during perceptual judgments of voices. Here, native English- and Mandarin-speaking listeners rated the perceptual dissimilarity of voices speaking English or Mandarin from either forward or time-reversed speech. The language spoken by talkers, but not listeners, principally influenced perceptual judgments of voices. Perceptual dissimilarity judgments of voices were always highly correlated between listener groups and forward/time-reversed speech. Representational similarity analyses that explored how acoustic features (fundamental frequency mean and variation, jitter, harmonics-to-noise ratio, speech rate, and formant dispersion) contributed to listeners’ perceptual dissimilarity judgments, including how talker- and listener-language affected these relationships, found the largest effects relating to voice pitch. Overall, these data suggest that, while linguistic factors may influence perceptual judgments of voices, the magnitude of such effects tends to be very small. Perceptual judgments of voices by listeners of different native language backgrounds tend to be more alike than different.

I. INTRODUCTION

The human voice is a complex auditory stimulus that conveys a variety of information about a talker, most prominently who they are (their identity) and what they are saying (their linguistic message). Voices are a ubiquitous social and communicative signal, and substantial literature has made forays into understanding how the various acoustic properties of the voice contribute to listeners’ perceptual representation of a talker (e.g., Schweinberger and Zaske, 2018). However, little is known about whether the perceptual representation of voices, including the acoustic features that underlie talker identity, is affected by the language understood by listeners, the language spoken by talkers, or the interaction between them.

In the present work, we evaluate the hypothesis that listeners’ perceptual space for voices is affected by their lifelong linguistic experiences (Fleming et al., 2014). From birth, listeners are inundated with experiences of voices speaking their native language and have extensive practice recognizing talkers of the same. However, listeners’ experience with voices speaking in a foreign language is considerably less, and may be effectively nonexistent for certain foreign languages. Do different cultural experiences shape listeners’ expertise with voice acoustics in such a way that they gain heightened perceptual sensitivity to subjective distinctions among the voices of individuals speaking in their native language compared to a foreign language? Effects of cultural experience have similarly been noted for perceptual sensitivity to own- vs other-race faces (Meissner and Brigham, 2001), native vs foreign phonetic contrasts (Werker and Tees, 1984), and pathological vs healthy voice qualities (Kreiman et al., 1990), among many other domains of perceptual expertise. Ultimately, we are interested in whether experience-related differences in perceptual dissimilarity judgments of voices can help us discern the cognitive foundations of the language-familiarity effect in talker identification; that is, why listeners are more accurate at learning to identify talkers in their native language than in a foreign one (Goggin et al., 1991; Perrachione and Wong, 2007).

Previous research on voice processing has attempted to delimit the perceptual space for voices. Early work in this domain was based on highly subjective and qualitative judgments about vocal qualities, such as whether a talker sounds, for example, “harsh,” “shrill,” “monotonous,” or “nasal” (reviewed in Kreiman et al., 2005). Contemporary clinical assessments of voice quality are born of this heritage with voices rated on scales such as roughness, breathiness, strain, pitch, and loudness (Kempster et al., 2009). While the validity and reliability of clinical voice assessments are perhaps the most carefully scrutinized qualitative descriptors of the voice (e.g., Zraick et al., 2011; Karnell et al., 2007), they serve the specific purpose of helping clinicians identify the perceptual correlates of vocal pathology, rather than the perceptual correlates of individuals’ vocal identity. More sophisticated efforts to identify the perceptual features that give rise to a talker’s unique vocal identity come from studies looking for structure in listeners’ perceptual similarity ratings (e.g., Baumann and Belin, 2010; Remez et al., 2007) and their relationship to voice acoustics. However, it does not strictly follow that, just because some acoustic dimensions are related to subjective judgments of voice dissimilarity, these same features need to be the ones that listeners use when recognizing a voice as familiar or when identifying a talker as a particular individual (Levi, 2018; Fecher and Johnson, 2018; Van Lancker and Kreiman, 1987; Perrachione et al., 2014).

The question of perceived voice dissimilarity has recently been applied to studying the language-familiarity effect in talker identification (Fleming et al., 2014). In this extensively replicated phenomenon (reviewed in Perrachione, 2018), the ability to identify talkers is more accurate when listening to one’s native language than a foreign or less-familiar language. What makes the identity of native-language voices more memorable? A variety of factors have been suggested, including experience-specific prototypes for voices (Goggin et al., 1991), memories for voices abstracted from memories for speech (McLaughlin et al., 2015), and increased sensitivity to between-talker phonetic variation in one’s native language (Perrachione et al., 2011). In their recent report, Fleming and colleagues (2014) suggested that the interaction between the languages spoken by listeners and talkers extends to listeners’ dissimilarity judgments of voices—that is, their subjective, qualitative rating of how alike two voices sound—and, furthermore, that this language-familiarity effect in perceptual dissimilarity judgments was present even when voices had been time-reversed, rendering them incomprehensible. The authors of that study took this difference to mean that listeners are sufficiently sensitive to the phonological features of their native language, such that even in time-reversal, where the ability to identify wordforms is effectively eliminated, sufficient non-lexical but phonological information is preserved to facilitate native-language talker dissimilarity judgments.

However, this claim stipulates, but does not demonstrate, the persistence of language-specific phonological features in time-reversed speech. Whether such features persist or not is an unanswered empirical question, but there are many reasons to think that, if they do, they exist in a much more impoverished form than originally suggested. First, time-reversal does not preserve a language’s phonological structure. When speech is time-reversed, the statistical relationships among phonemes and their order are demolished. For example, the time-reversed order of segments in the sentence, “A rod is used to catch pink salmon” [one of the Harvard Sentences (IEEE, 1969), which are regularly used in talker identification experiments] is [nmæs kŋɪp ʃtɛk ət dzuj zɪ dɑr ə], which contains numerous instances of segmental sequences that are phonologically unattested in English. Time-reversal of speech also destroys the temporal organization of subtle phonetic features, such as voice onset time, that contribute to the perception of voice identity (Ganugapati and Theodore, 2019). For instance, the time-reversed version of “pink” in the sentence above will subject a listener to a physiologically impossible sequence of voicing: aspiration, burst, and then silence.

Thus, instead of the persistence of salient language-specific phonological or phonetic features in time-reversed speech, the observation of a language-familiarity effect for time-reversed talker dissimilarity ratings may instead implicate systematic differences between talkers of different languages in low-level acoustic factors that are sufficiently independent of speech content to be preserved in time-reversed stimuli. Listeners of the two languages therefore must have, through their lifelong experience with languages like either English or Mandarin, gained increased sensitivity to the relevant low-level acoustic features found in their native language, while losing sensitivity to the acoustic feature space of the other language. That is, listeners of one language putatively must not have the necessary experience-related perceptual sensitivity to access the distinguishing, non-linguistic, low-level acoustic differences between speakers of the other language.

It is not unreasonable to suspect that speakers of different languages will evince different low-level acoustic features broadly across their speech such that those differences will be preserved under time-reversal. English and Mandarin in particular are likely to differ on such basic acoustic dimensions as mean voice fundamental frequency and fundamental frequency variability, owing to the presence of syllable-level lexical tone contours in the latter but not the former (Shih, 1988). Similarly, Mandarin and English differ in their prosodic organization, such that Mandarin is a syllable-timed language, whereas English is stress-timed (e.g., Mok, 2008), leading to differences in not only the duration of syllables but also their relative amplitude, both of which are low-level, non-segmental features that would be preserved in time-reversal. Listeners of different language backgrounds may also be differentially sensitive to acoustic correlates of voice quality (Keating and Esposito, 2007; Kreiman et al., 2010); Mandarin and English make differential use of creaky voice to signal the third (dipping) tone in Mandarin (Davison, 1991) and either utterance finality or an allophone of /t/ in English (Slifka, 2007). Thus, if language-specific phonological features are not the source of listeners’ biases in perceiving talker differences in time-reversed speech, perhaps these language-based differences derive from different sensitivity to low-level acoustic features.

Finally, the results of the Fleming and colleagues (2014) report also imply two additional hypotheses: that the language-familiarity effect in perceptual dissimilarity judgments should be stronger for time-forward voices (where additional language-specific acoustic and phonetic details are present, thus giving listeners a stronger perceived difference in their native language), and that listeners of different language backgrounds should rely on different low-level acoustic features when making perceptual dissimilarity judgments of voices.

Our aims for this report are, therefore, to replicate and extend the results of Fleming and colleagues (2014) in several ways. First, we attempted a veridical replication of the prior report by repeating their methods and statistical analyses as closely as possible, while using new stimuli and new participants. Second, we wanted to explore how this perceptual dissimilarity space differed between time-reversed voices (a fairly unnatural stimulus) and time-forward ones. Third, we wanted to understand whether listeners’ native language affected their dissimilarity judgments when two recordings came from the same speaker (cf. Lavan et al., 2018; Lavan et al., 2019a; Lavan et al., 2019b), not just when they came from different speakers. Fourth, we aimed to look at not just whether listeners’ native language backgrounds led to differences in their perception of talker dissimilarity, but also whether there were any fundamental similarities in the perceptual space for talkers among listeners of different language backgrounds. Finally, we wanted to explore the acoustic-phonetic factors that may affect perception of voice dissimilarity, and see how these factors differ (1) across talkers’ languages, (2) across listeners’ languages, (3) in forward vs time-reversed speech, and (4) in any interaction between these levels.

In this study, we asked native English- and Mandarin-speaking listeners to rate the perceptual dissimilarity of pairs of voices speaking either English or Mandarin, which were played either forward or time-reversed. We also made acoustic measurements on these voices for both vocal source features [fundamental frequency (f0) mean and variation, and voice quality (jitter and harmonics-to-noise ratio, HNR)] and vocal filter/articulatory features (formant dispersion and speech rate). The choice of these features was motivated by the likelihood of their being preserved between forward and time-reversed speech, as well as their potential to be attested (or attended to) differently between Mandarin and English. We then examined both the divergence and convergence of listeners’ perceptual dissimilarity judgments across differences in talkers’ and listeners’ linguistic background. We further assessed the relationship between listeners’ perceptual dissimilarity judgments and the acoustic features of the voice samples, including how these were affected by talker language, listener language, and their interaction.

Ultimately, the results of this study reveal that listeners’ perceptual dissimilarity judgments of voices are highly similar regardless of the native language of the listeners, the native language of the talkers, or whether the voices are played time-forward or time-reversed. Furthermore, the primary acoustic feature associated with perceptual dissimilarity judgments is the highly salient, reversal-invariant, language-independent mean fundamental frequency of a talker’s voice. These results reveal that the perceptual dissimilarity space for voices tends to be conserved across listeners of different language backgrounds, even for highly disruptive manipulations of the voice stimuli, such as what language they speak or whether they are time-reversed. This outcome calls into question the potential contribution that studying perceptual dissimilarity judgments of voices can make toward revealing the cognitive foundations of the language-familiarity effect in talker identification. Thus, we conclude with a discussion of the strengths and weaknesses of perceptual dissimilarity judgments as a paradigm for studying voice cognition, and we propose a framework for future research in this domain.

II. METHODS

In this study, two groups of listeners (native speakers of American English or Mandarin Chinese) listened to pairs of recordings of speech in English and/or Mandarin and rated the perceptual dissimilarity of the voices in the two recordings. Listeners completed this task under either of two conditions: time-reversed speech (cf. Fleming et al., 2014), in which the recordings were played backwards and were incomprehensible, and forward speech, in which natural speech was heard. Acoustic properties of talkers’ speech were also measured. Listeners’ perceptual dissimilarity judgments were analyzed for effects of listener- and talker-language, and time-reversal, as well as their relationship to acoustic features.

A. Participants

Participants in this study included native speakers of American English (N = 40; 35 female, 5 male; ages 18–24 years, mean = 20.3 years) and native speakers of Mandarin Chinese (N = 40; 32 female, 8 male; ages 18–37 years, mean = 21.2 years). The native English speakers had no familiarity with Mandarin; the native Mandarin speakers, who were born and raised in China but currently living or studying in the United States, were bilingual in English. Mandarin participants reported exposure to English beginning on average at age 6 years (±2.5 years, range 1–12 years old) and having, on average, 13.3 ± 4.3 years of English-language study (range 3–30 years). Of the Mandarin participants, 33 reported currently using Mandarin more than English, despite living in the United States, and 3 reported using the 2 languages equally. All participants indicated a history free from speech, language, or hearing disorders. All participants provided informed, written consent prior to undertaking the experiment. This study was approved and overseen by the Institutional Review Board at Boston University. Of the 40 participants in each group, 20 completed the task in the time-reversed speech condition and 20 completed the task in the forward speech condition. The sizes of both the participant and item samples in each condition were identical to those of Fleming and colleagues (2014).

B. Stimuli

A total of 400 recordings from 40 female speakers were presented to listeners in this study. We recorded female native speakers of American English (N = 20; ages 18–29 years, mean = 23.1 years) reading list 2 of the phonetically balanced English “Harvard Sentences” (IEEE, 1969) and female native speakers of Mandarin (N = 20; ages 18–30 years, mean = 23.2 years) reading sentences 1–4 and 6–11 of the Mandarin Speech Perception Test (Fu et al., 2011). Speakers were recorded in a sound attenuated booth using a Shure MX153 earset microphone, a Behringer Ultragain Pro MIC2200 two-channel tube microphone preamplifier, and a Roland Quad Capture USB audio interface with a sampling rate of 44.1 kHz and 16-bit digitization.

The recording of each sentence was cut to 1250 ms from its onset (following Fleming et al., 2014) with a linear amplitude ramp applied to the final 125 ms to avoid the abrupt, unnatural sound of a cut recording. Cut recordings were root-mean-square (RMS) amplitude normalized to 68 dB sound pressure level (SPL). In the time-reversed speech condition, recordings of stimuli were presented backwards; in the forward speech condition, the natural speech recordings were presented. Stimulus editing was completed in Praat.
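The stimulus preparation described above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' Praat procedure: the function name `prepare_stimulus` and the `target_rms` value are hypothetical, and the digital RMS target merely stands in for the 68 dB SPL presentation level, which depends on playback calibration.

```python
import numpy as np

def prepare_stimulus(signal, sample_rate=44100, duration_s=1.25,
                     ramp_s=0.125, target_rms=0.05, reverse=False):
    """Cut, fade out, RMS-normalize, and optionally time-reverse a mono signal."""
    # Keep only the first 1250 ms from the onset of the recording.
    n_keep = int(round(duration_s * sample_rate))
    clip = np.asarray(signal, dtype=float)[:n_keep].copy()

    # Linear amplitude ramp over the final 125 ms to avoid an abrupt cut.
    n_ramp = min(int(round(ramp_s * sample_rate)), clip.size)
    if n_ramp > 0:
        clip[-n_ramp:] *= np.linspace(1.0, 0.0, n_ramp)

    # RMS normalization to a common level (assumed digital target).
    rms = np.sqrt(np.mean(clip ** 2))
    if rms > 0:
        clip *= target_rms / rms

    # Time-reversed condition: play the same samples backwards.
    return clip[::-1].copy() if reverse else clip

# Synthetic noise stands in for a real recording.
rng = np.random.default_rng(0)
recording = rng.standard_normal(2 * 44100)
forward = prepare_stimulus(recording)
backward = prepare_stimulus(recording, reverse=True)
```

Because reversal is applied last, the forward and time-reversed versions of a stimulus contain exactly the same samples and therefore the same RMS level.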

C. Procedure

The study took place in a sound attenuated booth. Stimulus presentation was controlled using PsychoPy v1.83.03 (Peirce, 2007). Recordings were presented via Sennheiser HD 380 Pro circumaural headphones. On each trial, the listener heard a pair of recordings and was asked to rate how similar or different the voices sounded on an analog sliding scale ranging from 0 (indicating absolute certainty that the voices were the same) to 1 (indicating absolute certainty that the voices were different). Listeners were encouraged to use the full extent of the scale and not just the endpoints; they were told that we were studying the voices themselves and not listeners’ accuracy at telling them apart. Participants were given the option to replay the pair of recordings on each trial as many times as they needed before submitting their response. Participants had no prior exposure to the stimuli or voices used in this study. The experimental procedure is schematized in Fig. 1.

FIG. 1.

Perceptual dissimilarity rating paradigm. On each trial, participants heard recordings of two different speech samples and indicated how dissimilar the voices in the two recordings sounded by sliding a selector along an ordinal scale from 0 (definitely the same voice) to 1 (definitely different voices). Participants were encouraged to use the entire scale. Recordings came from either the same talker in one language, different talkers in the same language, or different talkers in different languages. For half of the listeners, the recordings were time-reversed as in Fleming et al. (2014; indicated here by mirror-reversal of the text), and for the other half of the listeners, the recordings were presented naturally.

Listeners heard all possible combinations of talkers, resulting in a total of 820 trials: 40 same-identity trials, in which the 2 recordings were spoken by the same talker; 190 native-language trials, in which pairs of recordings from all combinations of the 20 talkers in the listener’s native language (English or Mandarin) were presented; 190 foreign-language trials, in which pairs of recordings from all possible combinations of the 20 talkers in the listener’s non-native language were presented; and 400 cross-language trials, in which pairs of recordings from all combinations of talkers from the two languages were presented. For all pairs of recordings, the sentences spoken by the two talkers were different. All conditions were presented during each session, and the order of trials was randomized. Each participant heard a unique set of talker-sentence pairs, as well as a unique order of trials. All audio recordings, experimental paradigms, behavioral data, and analysis scripts are available via this project’s online archive: https://open.bu.edu/handle/2144/16460.
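As a quick check on the trial counts above, the pairings can be enumerated directly. The talker labels (`E01`, `M01`, and so on) are hypothetical placeholders; only the counts come from the text.

```python
from itertools import combinations, product

# Hypothetical labels for the 20 English and 20 Mandarin talkers.
english = [f"E{i:02d}" for i in range(1, 21)]
mandarin = [f"M{i:02d}" for i in range(1, 21)]

same_talker = [(t, t) for t in english + mandarin]   # 40 pairs
within_english = list(combinations(english, 2))      # C(20, 2) = 190 pairs
within_mandarin = list(combinations(mandarin, 2))    # C(20, 2) = 190 pairs
cross_language = list(product(english, mandarin))    # 20 * 20 = 400 pairs

trials = same_talker + within_english + within_mandarin + cross_language
n_listeners = 80  # 40 English-speaking + 40 Mandarin-speaking listeners
total_judgments = len(trials) * n_listeners
```

With 820 trials for each of 80 listeners, the full dataset contains 65600 judgments, and splitting 820 trials into 4 sessions gives 205 trials per session.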

The study was self-paced and took approximately 2 h to complete. The program was broken into 4 sessions, consisting of 205 trials each. Listeners were allowed to take a break between each session. Participants were allowed to complete the study across two consecutive days, completing two sessions during each visit. The task was completed by 21 participants in 1 visit, and by 59 participants in 2 visits.

In conducting this study, we aimed to replicate the design of Fleming and colleagues (2014) exactly. To the best of our ability, we have done so with the following exceptions or additions: (i) Additional groups of Mandarin and English listeners also rated the dissimilarity of talkers from forward speech, whereas Fleming and colleagues used only time-reversed speech; (ii) listeners in our study had no prior exposure to the voices, whereas listeners in the 2014 study had been exposed to those voice stimuli during a prior experiment of unspecified design; (iii) we investigated listeners’ dissimilarity biases from trials in which the same talker was heard twice, not just trials in which different talkers were heard; (iv) we performed additional analyses looking into the similarities, not just differences, between listeners’ perceptual judgments across native language backgrounds; and (v) we performed representational similarity analyses to ascertain whether listeners’ perceptual dissimilarity judgments were related to a variety of acoustic features, and whether these differed with respect to language or time-reversal.

D. Acoustic measurements

To investigate the relationship between listeners’ perceptual dissimilarity judgments and the acoustic properties of talkers’ speech, we analyzed a number of acoustic features that can reflect differences in vocal source acoustics, vocal filter acoustics, and speech articulation. These features were selected based on their previous implication as potentially perceptually distinguishing acoustic features of voices (e.g., Baumann and Belin, 2010; Latinus and Belin, 2011b; Latinus et al., 2013; Remez et al., 2007; Schweinberger et al., 2014), because they may have differential attestation based on talker language (or differential attention based on listener language) between Mandarin and English, and because they are likely preserved between forward and time-reversed speech. A number of these features (mean f0, f0 range, formant dispersion, and HNR) were also reported for the speakers in the stimuli used by Fleming and colleagues (2014); however, the relationship between speech acoustics and perceptual dissimilarity ratings was not examined in that report. All acoustic measurements were made in Praat.

1. Fundamental frequency (f0) We measured the mean and standard deviation of talkers’ voice fundamental frequency from each stimulus recording. The standard pitch tracking parameters in Praat were used, unless these resulted in pitch tracking errors, such as pitch doubling or pitch halving, owing to misidentification of the relevant waveform peak by the autocorrelation function. The pitch contour of every recording was visually inspected overlaid on the spectrogram to identify any such errors, in which case the minimum and maximum pitch range for that particular recording was adjusted to eliminate the error.

2. Voice quality Two measures of voice quality were obtained for each recording: jitter and HNR. Jitter is an acoustic correlate of temporal perturbation in vocal fold vibration and perceptually related to the creakiness of a voice (Karnell et al., 2007). We used the five-point period perturbation quotient algorithm to estimate jitter in each recording, as this algorithm provides an estimate of vocal temporal perturbation that is robust to ongoing pitch dynamics in natural speech (Davis, 1981). Jitter is expressed as a mean percent difference in cycle-to-cycle periodicity. HNR is an acoustic correlate of voice quality that reflects relative energy of the periodic and aperiodic components of the voice, expressed in dB. Briefly, both jitter and HNR provide indices of the extent to which a talker’s voice quality differs from the modal voice. Voice quality measurements were made simultaneously with, and using the same pitch settings as, the fundamental frequency measurements.
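To make the jitter measure concrete, here is a minimal sketch of a five-point period perturbation quotient, assuming a list of consecutive glottal period durations has already been extracted (the study used Praat for this step). The function name `ppq5_jitter` is hypothetical, and the formula follows the commonly cited definition: the mean absolute difference between each period and the average of it and its four nearest neighbours, divided by the mean period.

```python
import numpy as np

def ppq5_jitter(periods_s):
    """Five-point period perturbation quotient, expressed in percent."""
    periods = np.asarray(periods_s, dtype=float)
    if periods.size < 5:
        raise ValueError("PPQ5 needs at least five consecutive periods")
    # Average of each period and its two neighbours on either side.
    local_mean = np.convolve(periods, np.ones(5) / 5.0, mode="valid")
    # Deviation of each interior period from its local five-point average.
    deviation = np.abs(periods[2:-2] - local_mean)
    return 100.0 * deviation.mean() / periods.mean()

# A smooth pitch glide: periods change steadily, with no cycle-to-cycle noise.
glide = np.linspace(0.004, 0.005, 50)
# A perturbed voice: periods alternate by +/- 0.1 ms around 5 ms.
perturbed = 0.005 + 0.0001 * (-1.0) ** np.arange(50)

jitter_glide = ppq5_jitter(glide)
jitter_perturbed = ppq5_jitter(perturbed)
```

The glide yields essentially zero jitter because each period equals its local five-point average, which illustrates why this estimate is robust to ongoing pitch dynamics, whereas the alternating perturbation yields a clearly nonzero value.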

3. Speech rate As each recording was truncated at 1.25 s, sometimes mid-syllable, we determined talkers’ speech rate as the number of full syllables listeners heard in each recording, divided by the time it took to produce those syllables. Counting up until the time of the end of the last full syllable in each recording, we calculated talkers’ speech rate for each utterance in syllables per second.
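The speech-rate calculation can be illustrated with a short sketch. It assumes hypothetical syllable end times (in seconds from the onset of the recording) are available from annotation, and that speech begins at time zero; neither detail is specified in the text.

```python
def speech_rate(syllable_end_times_s, clip_duration_s=1.25):
    """Syllables per second, counting only syllables completed within the clip."""
    complete = [t for t in syllable_end_times_s if t <= clip_duration_s]
    if not complete:
        return 0.0
    # Divide by the time of the end of the last full syllable,
    # not by the full clip duration.
    return len(complete) / max(complete)

# Hypothetical annotation: the sixth syllable ends after the 1.25 s cut.
end_times = [0.21, 0.45, 0.70, 0.98, 1.20, 1.41]
rate = speech_rate(end_times)  # 5 syllables / 1.20 s
```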

4. Formant dispersion Formant dispersion is a measure of vocal tract length, with shorter vocal tracts producing higher frequency resonances and thus greater frequency distance between formants (Fitch, 1997). Formants were measured using the standard settings in Praat, adjusted as needed on a recording-by-recording basis to obtain good formant tracking. Values for each of the first four formants (F1–F4) were extracted across the entire recording. Because Praat is prone to formant tracking errors at the transition between the open and closed vocal tract at syllable boundaries in natural, running speech, we weighted the contribution of each sample by the degree of vocal tract opening at that time point. We calculated the mean frequency of each formant across the utterance weighted by the intensity contour, such that formant values measured during the maximally open vocal tract (i.e., high-intensity speech) contributed more to the average than those measured proximal to closures (i.e., low-intensity speech). This method allowed us to make use of the entire utterance heard by listeners during the voice judgment task while maximizing the signal-to-noise ratio of formant measurements by reducing the amount of measurement error due to poor formant tracking during vocal tract closures. Formant dispersion was calculated using the mean of the differences of adjacent formants (Fitch, 1997).
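A minimal sketch of the intensity-weighted formant dispersion described above is shown below. It assumes per-frame formant tracks and a matching intensity contour have already been exported (for example from Praat); the function name and the use of non-negative linear intensity values as weights are assumptions, since the text does not specify the exact weighting.

```python
import numpy as np

def formant_dispersion(formant_tracks_hz, intensity):
    """Mean spacing of adjacent formants, from intensity-weighted formant means.

    formant_tracks_hz: array of shape (n_frames, n_formants), e.g. F1-F4 per frame.
    intensity: array of shape (n_frames,), non-negative weights (assumed linear).
    """
    tracks = np.asarray(formant_tracks_hz, dtype=float)
    weights = np.asarray(intensity, dtype=float)
    # High-intensity frames (open vocal tract) count more than
    # low-intensity frames near closures, where tracking is unreliable.
    mean_formants = np.average(tracks, axis=0, weights=weights)
    # Mean of the differences between adjacent formants (Fitch, 1997).
    return float(np.mean(np.diff(mean_formants)))

tracks = [
    [500.0, 1500.0, 2500.0, 3500.0],
    [600.0, 1600.0, 2600.0, 3600.0],
    [0.0, 9000.0, 9500.0, 9900.0],  # a mistracked frame during a closure
]
intensity = [1.0, 3.0, 0.0]         # the mistracked frame gets zero weight
dispersion = formant_dispersion(tracks, intensity)
```

With four formants, the mean of adjacent differences reduces to (F4 − F1) / 3, so the result depends only on the weighted means of the lowest and highest formants.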

E. Statistical analyses

Data were analyzed in R v3.5.1 using the packages ez for repeated-measures analyses of variance (ANOVAs) and lme4 and lmerTest for linear mixed-effects models. The fixed and random effects structure of each model is described below. The significance of factors in the models was determined using type-III ANOVAs incorporating Satterthwaite’s method for approximating denominator degrees of freedom. Post hoc comparisons to identify the direction and source of main and interaction effects from the ANOVA were conducted using difference of least-square means implemented via the function difflsmeans. We adopted a significance criterion of α = 0.05 for ANOVAs and other planned comparisons, and applied Bonferroni-corrected alpha criteria when assessing significance of post hoc tests on each model.

III. RESULTS

We first attempted to replicate the observation of a language-familiarity effect for talker dissimilarity ratings from time-reversed speech reported by Fleming and colleagues (2014). Using only the data from listeners who made judgments of time-reversed recordings, we performed the same statistical analyses described in the prior report. Second, we analyzed the full dataset we collected, including all 65600 dissimilarity judgments performed by English- and Mandarin-speaking listeners on all voice pairs from both time-reversed and forward speech, for differences related to talker- or listener-specific factors. Third, we examined whether listeners’ perceptual dissimilarity judgments across the entire dataset were related to acoustic differences between the pairs of recordings and, if so, whether these relationships differed across talker language, listener language, or time-reversal.

A. Attempted replication of a language-familiarity effect for dissimilarity judgments of time-reversed recordings

First, a repeated-measures ANOVA was conducted on the dependent measure of listeners’ dissimilarity ratings with the within-subject factor of pair type (same-talker pairs, foreign-language pairs, native-language pairs, and cross-language pairs). This analysis revealed a significant effect of pair type [F(3,117) = 346.95, p ≪ 0.0001, $\eta^2_G$ = 0.789], paralleling the prior report [Fig. 2(A)]. Same-talker pairs were rated the least dissimilar (larger numbers indicate greater mean dissimilarity; mean±across-participants standard error: 0.13±0.02), then foreign-language pairs (0.63±0.02), native-language pairs (0.65±0.02), and finally cross-language pairs were rated most dissimilar (0.78±0.02).

FIG. 2.

(Color online) Perceptual dissimilarity of time-reversed talker pairs. (A) shows the mean dissimilarity rating for each talker pair type across both listener groups for time-reversed recordings, following the conventions of Fleming et al. (2014). Listeners rated same talker pairs as least dissimilar and cross-language talker pairs as most dissimilar. Different-talker pairs speaking the same language were rated similarly, regardless of whether they were speaking listeners’ native or foreign language. (B) shows dissimilarity ratings of English (EV) and Mandarin (MV) talker pairs separately for native English- and Mandarin-speaking participants. Both groups found Mandarin talker pairs more dissimilar, and there was no talker language × listener language interaction, suggesting the language-familiarity effect does not influence talker dissimilarity ratings from time-reversed speech.

Post hoc tests revealed that cross-language pairs were rated as significantly more dissimilar than native-language pairs, foreign language pairs, and same-talker pairs [all paired t(39) > 8.95, all p ≪ 0.0001]. However, in contrast to the prior report, native-language pairs were not rated as significantly more dissimilar than foreign-language pairs [t(39) = 0.91, p = 0.37], although they were more dissimilar than same-talker pairs [t(39) = 18.68, p ≪ 0.0001]. Foreign-language pairs were also rated as significantly more dissimilar than same-talker pairs [t(39) = 18.68, p ≪ 0.0001].

Second, we analyzed the native- and foreign-language talker pairs separately for native speakers of English (native pairs, 0.62±0.03; foreign pairs, 0.68±0.02) and Mandarin [native pairs, 0.68±0.04; foreign pairs, 0.59±0.04; Fig. 2(B)]. These data were submitted to a 2×2 repeated-measures ANOVA, with listener language (English, Mandarin) as the between-subjects factor and talker language (English, Mandarin) as the within-subjects factor. Like Fleming and colleagues (2014), we found no main effect of listener language [F(1,38) = 0.073, p = 0.79, $\eta^2_G$ = 0.0018]. However, unlike the prior report, we did not find a listener language × talker language interaction [F(1,38) = 2.44, p = 0.13, $\eta^2_G$ = 0.0025], but we did find a main effect of talker language [F(1,38) = 75.91, p ≪ 0.0001, $\eta^2_G$ = 0.071]. Post hoc tests revealed that the Mandarin talkers were perceived as more dissimilar than the English talkers by both the Mandarin listeners [paired t(19) = 6.31, p ≪ 0.0001] and the English listeners [paired t(19) = 6.16, p ≪ 0.0001].

B. Dissimilarity judgments of forward and time-reversed talkers

We next analyzed the full dataset, including all 65600 dissimilarity judgments participants made (including for all time-reversed and forward recordings, all talker pairs, and by listeners of both native languages). The pattern of mean dissimilarity judgments across participants for each pair of talkers is shown in Figs. 3(A) (for time-reversed recordings) and 3(B) (for forward recordings). Each cell of a dissimilarity matrix corresponds to a unique pair of talkers, and the mean dissimilarity rating for that pair is indicated by the amount of shading, from 0 (most similar, darkest) to 1 (most dissimilar, lightest); the quadrants of the matrices correspond to English-speaking voice pairs (top left), Mandarin-speaking voice pairs (bottom right), and cross-language pairs (bottom left and top right). Same-talker pairs are depicted along the diagonal.

FIG. 3.

(Color online) Patterns of perceptual dissimilarity judgments for all talker pairs across time-reversed or forward speech and listeners’ native language. In the matrices at the top, each row and column correspond to an individual talker, such that each cell indicates the perceived dissimilarity between each pair of talkers from 0 (maximally similar) to 1 (maximally dissimilar). Same-identity talker pairs occur along the diagonal; the top left quadrant contains pairs of English-speaking talkers (EV), the bottom right quadrant contains pairs of Mandarin-speaking talkers (MV), and the top right and bottom left quadrants contain cross-language talker pairs. Panels in (A) show the pattern of dissimilarity ratings across all time-reversed talker pairs for native English (left) and Mandarin (right) listeners. Panels in (B) show the corresponding pattern of dissimilarity ratings across all time-forward talker pairs. The color scale indicates the mean dissimilarity across listeners for a talker pair with darker colors corresponding to less dissimilarity and lighter colors corresponding to greater dissimilarity. Panels in (C) show the probability density functions of participants’ dissimilarity ratings for each pair type corresponding to the matrix directly above. The area under each curve is 1. Panels in (D) show the mean dissimilarity rating for each condition corresponding to the matrix and density plot above. Error bars are ±standard error of the mean across participants. Note the low dissimilarity rating along the diagonal of each matrix for the same-talker pairs, reflected in the leftmost peak (grey lines) in the density plots and the low mean dissimilarity in the barplots. Note also the relatively higher dissimilarity rating for talker pairs in the cross-language quadrants (top right, bottom left), and the corresponding rightmost peak of the cross-language (green) distribution below. 
Reading across the matrices, note the strikingly similar pattern of cell-level similarity ratings between the two listener groups and between time-reversed and forward speech. These similarities are considered quantitatively in Fig. 5.

Perceptual dissimilarity judgments are inherently non-Gaussian, given the distribution is bounded by 0 and 1. Inspection of the distribution of responses further revealed substantial deviation from normality [Fig. 3(C)] with responses clustered near 0 and 1 (Anderson-Darling test of normality, A = 6395.4, p ≪ 0.0001). Correspondingly, we applied arcsine transformation (Studebaker, 1985) to the dissimilarity rating data prior to inferential statistics, after which the data were not as skewed toward the extrema of the scale, but were nonetheless still not normally distributed (A = 4094.5, p ≪ 0.0001). Listeners exhibited a strong preference for dissimilarity rankings at the extrema of the range: Across all the different-talker trials (foreign, native, and cross-language) in our data, fully 50% of responses had dissimilarity ratings of ≥0.93 (where 1 meant “definitely different”), and nearly 42% of responses were “1” exactly. For same-talker trials, 56% received a dissimilarity rating of “0” exactly (where 0 meant “definitely the same”), and only 23% of same-talker trials had a rating >0.1. Only 17% of all trials were rated within the middle half of the range (0.25–0.75).
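As a rough illustration of the transformation step, the sketch below applies a plain arcsine-square-root transform to ratings bounded on [0, 1]. The exact variant used is not spelled out here; Studebaker (1985) describes a rescaled ("rationalized") version, so the unscaled form, the function name `arcsine_sqrt`, and the sample ratings are all assumptions made for illustration.

```python
import math

def arcsine_sqrt(rating: float) -> float:
    """Map a rating in [0, 1] to arcsin(sqrt(rating)), expressed in radians."""
    if not 0.0 <= rating <= 1.0:
        raise ValueError("rating must lie in [0, 1]")
    return math.asin(math.sqrt(rating))

# Hypothetical ratings clustered near the ends of the scale (not study data).
ratings = [0.0, 0.02, 0.5, 0.93, 1.0]
transformed = [arcsine_sqrt(r) for r in ratings]
```

The transform stretches the ends of the scale relative to the middle, but responses of exactly 0 or exactly 1 all map to a single value each. Those point masses survive any monotonic transform, which is consistent with the data remaining non-normal afterward.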

Participants’ arcsine-transformed dissimilarity ratings were submitted to a linear mixed-effects model with fixed factors including pair type (same-talker pairs, foreign language different-talker pairs, native language different-talker pairs, and cross-language pairs), listener native language (English, Mandarin) and recording direction (time-reversed, forward). Random factors in the model included by-participant intercepts and by-participant slopes for the within-subject effect of pair type, as well as random item intercepts for each (unordered) pair of talkers.
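The four levels of the pair type factor can be derived from the talker identities, the talkers' languages, and the listener's native language. The following is a minimal sketch of that classification; the function name, the talker labels such as "EV01", and the string codes are hypothetical.

```python
def pair_type(talker_a: str, lang_a: str, talker_b: str, lang_b: str,
              listener_lang: str) -> str:
    """Classify a trial into one of the four pair types used as a fixed factor."""
    if talker_a == talker_b:
        return "same-talker"
    if lang_a != lang_b:
        return "cross-language"
    # Different talkers speaking the same language: native or foreign to this listener.
    return "native" if lang_a == listener_lang else "foreign"

# Hypothetical trials for a native English listener.
examples = [
    pair_type("EV01", "English", "EV01", "English", "English"),
    pair_type("EV01", "English", "EV02", "English", "English"),
    pair_type("MV03", "Mandarin", "MV07", "Mandarin", "English"),
    pair_type("EV01", "English", "MV03", "Mandarin", "English"),
]
```

Checking talker identity first is sufficient only because each talker in this sketch speaks a single language, so a same-talker pair can never also be a cross-language pair.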

An ANOVA of this model revealed a significant main effect of pair type, and significant pair type × listener native language and pair type × recording direction two-way interactions (Table I). The other main, two-, and three-way interaction effect terms were not significant. Post hoc pairwise tests revealed that same-talker pairs were rated as less dissimilar than foreign, native, or cross-language pairs (all t > 19.64, p ≪ 0.0001). Likewise, cross-language pairs were rated as more dissimilar than native or foreign pairs (both t > 6.96, p ≪ 0.0001). Native-language pairs were also rated as significantly more dissimilar than foreign-language pairs (t = −5.28, p ≪ 0.0001). However, given that two other factors had significant interactions with pair type, these simple effects require further elaboration.

TABLE I.

Linguistic factors affecting perceptual dissimilarity judgments of voices (all data) df(n,d) is degrees of freedom (numerator, denominator).

| Fixed factor | F | df (n,d) | p-value |
| --- | --- | --- | --- |
| Pair type | 172.41 | (3,121) | ≪0.0001 |
| Listener native language | 0.34 | (1,76) | 0.56 |
| Recording direction | 3.73 | (1,76) | 0.057 |
| Pair type × listener native language | 10.71 | (3,114) | ≪0.0001 |
| Pair type × recording direction | 6.93 | (3,76) | 0.00035 |
| Listener native language × recording direction | 1.26 | (1,76) | 0.27 |
| Pair type × listener native language × recording direction | 0.86 | (3,76) | 0.46 |


The pair type × recording direction interaction was driven by participants’ tendency to give higher dissimilarity ratings for native-language talkers when hearing forward vs time-reversed recordings (t = −3.29, p = 0.0015), but less so for cross-language pairs (t = −1.53, p = 0.13), foreign-language pairs (t = −1.83, p = 0.071), or same-talker pairs (t = 1.43, p = 0.16). The same-talker pairs showed the opposite tendency, with even lower dissimilarity ratings from forward speech [Fig. 3(D)].

The pair type × listener native language interaction was driven by a tendency for higher dissimilarity ratings on native-language pairs by native Mandarin listeners compared to native English listeners (t = −2.98, p = 0.0037), but not for same-talker pairs (t = 0.36, p = 0.72), foreign-language pairs (t = 0.63, p = 0.53), or cross-language pairs (t = 0.45, p = 0.65). Mandarin listeners tended to rate native-language pairs as more dissimilar than foreign-language pairs (t = −6.76, p ≪ 0.0001), whereas English listeners’ ratings tended to be in the opposite direction (t = 1.72, p = 0.086). The two listener groups did not differ overall in their perceived dissimilarity of English talker pairs (English native pairs vs Mandarin foreign pairs; t = −0.11, p = 0.91), but did differ in their perceived dissimilarity of Mandarin talker pairs (t = −2.38, p = 0.02), such that Mandarin listeners judged these talker pairs as more dissimilar than English listeners did.

1. The language-familiarity effect

Organizing the data by pair type may obscure some potentially interesting relationships between talker language, listener language, and talker identity. In particular, treating languages as “native” or “foreign” may miss main effects due to language (English vs Mandarin) or listener by talker language interactions. Similarly, by treating all same-talker pairs as a single category, we forgo the ability to detect effects of talkers’ or listeners’ language on how listeners “tell voices together” (Lavan et al., 2019a; Lavan et al., 2019b) compared to telling them apart. We therefore performed two additional planned analyses on listeners’ dissimilarity ratings, the first from only the native- and foreign-language pairs, now organized by talker language, and the second for only the same-talker pairs, likewise organized by language.

a. Different-talker pairs. Arcsine transformed dissimilarity ratings for pairs of different English- or Mandarin-speaking talkers were submitted to a linear mixed effects model with fixed factors including talker language (English, Mandarin), listener native language (English, Mandarin), and recording direction (time-reversed, forward). Random factors included by-participant intercepts and by-participant slopes for the within-subject effect of talker language, as well as random item intercepts for each pair of talkers.

An ANOVA of this model revealed significant main effects of talker language and recording direction and a significant talker language × listener native language interaction (Table II). There was also a significant three-way talker language × listener native language × recording direction interaction. The other main effects and interactions were not significant.

TABLE II.

Linguistic factors affecting perceptual dissimilarity judgments of different-talker pairs.

| Fixed factor | F | df (n,d) | p-value |
| --- | --- | --- | --- |
| Talker language | 15.61 | (1,414) | ≪0.0001 |
| Listener native language | 1.63 | (1,76) | 0.21 |
| Recording direction | 6.89 | (1,76) | 0.010 |
| Talker language × listener native language | 27.88 | (1,76) | ≪0.0001 |
| Talker language × recording direction | 1.68 | (1,76) | 0.20 |
| Listener native language × recording direction | 1.41 | (1,76) | 0.24 |
| Talker language × listener native language × recording direction | 11.97 | (1,76) | 0.0009 |


The main effect of talker language was driven by overall higher dissimilarity ratings for Mandarin talker pairs (t = 3.95, p ≪ 0.0001). The main effect of recording direction was driven by overall higher dissimilarity ratings for forward vs time-reversed recordings (t = 2.63, p = 0.010). The talker language × listener native language interaction represents the same differences giving rise to the pair type × listener native language interaction in the previous model: namely, the listener groups differed in their judgments of Mandarin-speaking voices, but not English-speaking ones.

Exploring the three-way interaction reveals the lack of consistent attestation of a language-familiarity effect in talker dissimilarity ratings [Fig. 4(A)]. If language familiarity affects listeners’ perceptual dissimilarity space for talkers, then native-language talker pairs should reliably be rated as more dissimilar than foreign-language talker pairs. However, post hoc pairwise tests revealed that this was not always the case: Contrary to the language-familiarity hypothesis, English listeners actually rated Mandarin-speaking talkers as more dissimilar than English-talker pairs when listening to time-reversed speech (t = 2.94, p = 0.0035), and they did not differ in their ratings of talkers of the two languages from forward speech (t = −0.41, p = 0.68). Mandarin listeners, however, did rate Mandarin-talker pairs as more dissimilar than English-talker pairs from both time-reversed (t = 4.23, p ≪ 0.0001) and forward recordings (t = 5.75, p ≪ 0.0001).

FIG. 4.

(Color online) Patterns of between-language dissimilarity rating differences are infrequently consistent with the predictions of the language-familiarity effect. (A) Predicted (left) and measured (right) dissimilarity ratings for different-talker pairs in each language, recording direction, and listener group. (B) Predicted (left) and measured (right) dissimilarity ratings for same-talker pairs in each language, recording direction, and listener group. Legend: Points represent mean dissimilarity ratings of individual participants in each condition; lines connect points from the same participant to show the direction of the effect. Red points indicate Mandarin talkers, blue points indicate English talkers; points are partially transparent to reveal overlap. Larger white points show the mean dissimilarity rating across participants in each condition. Symbols & abbreviations: LFE, language-familiarity effect; n.s. p > 0.0125; *** p < 0.005 in the direction predicted by the language-familiarity effect; ††† p < 0.005 in the opposite direction of the predictions of the language-familiarity effect.

b. Same-talker pairs. Participants’ arcsine-transformed dissimilarity ratings for pairs of recordings from the same English or Mandarin talkers were submitted to a linear mixed-effects model with the same structure as that for different-talker pairs. An ANOVA of this model revealed only a significant three-way talker language × listener native language × recording direction interaction (Table III). All the other main and interaction effects were not significant.

TABLE III.

Linguistic factors affecting perceptual dissimilarity judgments of same-talker pairs.

| Fixed factor | F | df (n,d) | p-value |
| --- | --- | --- | --- |
| Talker language | 0.59 | (1,40) | 0.45 |
| Listener native language | 0.13 | (1,76) | 0.72 |
| Recording direction | 2.05 | (1,76) | 0.16 |
| Talker language × listener native language | 1.18 | (1,76) | 0.28 |
| Talker language × recording direction | 0.35 | (1,76) | 0.55 |
| Listener native language × recording direction | 0.09 | (1,76) | 0.76 |
| Talker language × listener native language × recording direction | 12.14 | (1,76) | 0.0008 |


A language-familiarity effect on listeners’ perceptual dissimilarity judgments for same-talker pairs makes the opposite prediction from the one it makes for different-talker pairs: listeners should be more sensitive to the fact that two recordings come from the same talker in their native language, and should therefore give lower dissimilarity ratings to same-talker pairs in their native language than in a foreign language. Exploring the three-way interaction provides little evidence for an effect of language familiarity on judging the same talker to sound more similar to herself [Fig. 4(B)]. Post hoc pairwise tests revealed that, contrary to the language-familiarity hypothesis, Mandarin listeners actually tended to rate pairs of recordings from a single Mandarin-speaking talker as more dissimilar than pairs of recordings from a single English-speaking talker when listening to time-reversed speech (t = 1.81, p = 0.074), but did tend to rate same-talker pairs in the expected direction for time-forward speech (t = −1.76, p = 0.082), although not significantly so in either case. English listeners also did not exhibit a pattern of dissimilarity ratings predicted by language familiarity: they did not rate English and Mandarin same-talker pairs differently either for time-reversed recordings (t = −0.29, p = 0.78) or forward speech [t = 2.24, p = 0.028 (αBonf. = 0.0125)].
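The corrected threshold quoted above equals 0.05 divided by four, which matches the four post hoc contrasts reported for same-talker pairs. The sketch below works through that arithmetic; the family-wise level of 0.05 and the count of four comparisons are inferred from the reported threshold rather than stated explicitly, and the variable names are illustrative.

```python
ALPHA = 0.05  # assumed family-wise error rate

# p-values of the four same-talker contrasts reported in the text above.
p_values = {
    ("Mandarin listeners", "time-reversed"): 0.074,
    ("Mandarin listeners", "forward"): 0.082,
    ("English listeners", "time-reversed"): 0.78,
    ("English listeners", "forward"): 0.028,
}

# Bonferroni correction: divide the family-wise level by the number of tests.
alpha_bonf = ALPHA / len(p_values)

# A contrast counts as significant only if its p-value falls below the corrected level.
significant = {contrast: p < alpha_bonf for contrast, p in p_values.items()}
```

Under these assumptions none of the four contrasts passes the corrected threshold, including the English listeners' forward-speech contrast, which would have passed an uncorrected 0.05 criterion.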

C. Consistent perceptual dissimilarity judgments across listener groups and time-reversal

Despite the subtle differences in magnitude noted above, the overall patterns of perceptual dissimilarity judgments for native- and foreign-language talkers tended to be highly similar across listener groups and time-reversal. Bivariate correlations of the mean perceptual dissimilarity across listeners of each different-talker pair revealed that the pattern of perceptual dissimilarity for time-reversed English voices [i.e., the lower triangle of the English-English quadrants in Fig. 3(A); shown in Fig. 5(A)] was significantly correlated between English and Mandarin listeners (r(188) = 0.85, p ≪ 0.0001). Likewise, the corresponding pattern of perceptual judgments for the time-reversed Mandarin voices was also highly correlated across listener groups (r(188) = 0.78, p ≪ 0.0001). For the forward voices, the two listener groups likewise exhibited extremely similar patterns of dissimilarity judgment for pairs of English (r(188) = 0.80, p ≪ 0.0001) and Mandarin (r(188) = 0.65, p ≪ 0.0001) talkers.
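The degrees of freedom in these correlations follow from the design: 20 talkers per language give 20 × 19 / 2 = 190 unordered different-talker pairs, and a Pearson correlation over 190 points has 188 degrees of freedom. A minimal sketch of the computation is given below, using small made-up 4 × 4 matrices in place of the real 20 × 20 quadrants; the helper names and all matrix values are hypothetical.

```python
import math
from itertools import combinations

def lower_triangle(matrix):
    """Return the below-diagonal cells of a square matrix, one per unordered pair."""
    n = len(matrix)
    return [matrix[j][i] for i, j in combinations(range(n), 2)]

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Pair count and degrees of freedom for 20 talkers in one language.
n_talkers = 20
n_pairs = n_talkers * (n_talkers - 1) // 2
degrees_of_freedom = n_pairs - 2

# Made-up mean dissimilarity matrices for two listener groups (4 talkers only).
english_listeners = [
    [0.0, 0.6, 0.9, 0.7],
    [0.6, 0.0, 0.8, 0.5],
    [0.9, 0.8, 0.0, 0.4],
    [0.7, 0.5, 0.4, 0.0],
]
mandarin_listeners = [
    [0.0, 0.5, 0.9, 0.8],
    [0.5, 0.0, 0.7, 0.6],
    [0.9, 0.7, 0.0, 0.3],
    [0.8, 0.6, 0.3, 0.0],
]
r = pearson_r(lower_triangle(english_listeners), lower_triangle(mandarin_listeners))
```

Only the lower triangle is used because the matrices are symmetric and the diagonal holds same-talker pairs, which are analyzed separately.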

FIG. 5.

(Color online) Correlation between dissimilarity judgments of voice pairs across listener groups and forward/time-reversed speech conditions. (A) Perceptual dissimilarity of voices was highly consistent across English and Mandarin listeners. Points indicate the mean perceived dissimilarity for each pair of voices for English (ordinate) and Mandarin (abscissa) listener groups. Both English and Mandarin listener groups tended to find the same voices more similar/dissimilar, as indicated by the high degree of correlation in these points. (B) Perceptual dissimilarity of voices was also highly consistent across forward and time-reversed speech, regardless of talker or listener language. Points indicate the mean perceived dissimilarity for each pair of voices for listeners who heard time-reversed (ordinate) or forward (abscissa) speech. The high degree of correlation for these points indicates that listeners found the same voices more similar/dissimilar regardless of whether the signal had been time-reversed or not.

The pattern of listeners’ dissimilarity judgments was also significantly related across the time-reversal manipulation. Native English-speaking listeners tended to find the same English-speaking talkers to be more or less dissimilar regardless of whether their speech was comprehensible or not [i.e., the lower triangles of the English-English quadrants in Figs. 3(A) and 3(B); shown in Fig. 5(B); r(188) = 0.79, p ≪ 0.0001]. The pattern of dissimilarity judgments elicited from English listeners was also highly correlated for Mandarin voices, regardless of time-reversal (r(188) = 0.75, p ≪ 0.0001). For native Mandarin-speaking listeners, as well, the same English-speaking voices were more or less dissimilar regardless of time-reversal (r(188) = 0.81, p ≪ 0.0001), and so too for the Mandarin voices (r(188) = 0.69, p ≪ 0.0001).

D. Representational similarity analyses of speech acoustics and perceptual dissimilarity judgments

We next sought to identify which acoustic factors were related to perceptual dissimilarity judgments as a function of talker language, listener language, and time-reversal.

1. Acoustics of our Mandarin- and English-speaking talkers

Our two talker groups differed on a number of measures (Table IV, Fig. 6). For instance, the English-speaking talkers, on average, had lower vocal pitch and smaller formant dispersion, perhaps suggesting smaller overall body size in our Mandarin-speaking sample (cf. Pisanski et al., 2014). The English-speaking talkers also tended to have more nonmodal voice quality (higher jitter and lower HNR) compared to the Mandarin talkers. Finally, English talkers tended to speak faster than Mandarin talkers, perhaps due to more unstressed syllables and function words in the English recordings (IEEE, 1969) compared to the content-word heavy Mandarin sentences (Fu et al., 2011).

TABLE IV.

Mean ± standard deviation and two-sample difference of acoustic measures from English- and Mandarin-speaking talkers.

| Acoustic feature | English talkers (mean ± standard deviation)ᵃ | Mandarin talkers (mean ± standard deviation)ᵃ | Group difference: t(38) = | p < | Cohen’s d = |
| --- | --- | --- | --- | --- | --- |
| f0 mean (Hz) | 212.35 ± 19.25 | 233.53 ± 29.02 | −2.73 | 0.01 | 0.88 |
| f0 variation (σ/μ) | 0.152 ± 0.042 | 0.160 ± 0.025 | −0.75 | 0.46 | 0.24 |
| Jitter (%) | 1.034 ± 0.152 | 0.760 ± 0.132 | 6.06 | 0.0001 | 1.97 |
| HNR (dB) | 13.20 ± 1.362 | 15.02 ± 1.708 | −3.72 | 0.001 | 1.21 |
| Speech rate (syllable/s) | 5.305 ± 0.366 | 4.824 ± 0.449 | 3.71 | 0.001 | 1.20 |
| Formant dispersion (Hz) | 1101.4 ± 53.4 | 1198.2 ± 85.0 | −7.99 | 0.0001 | 2.59 |


ᵃ n = 20 in each talker group.
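For readers who want to relate the summary columns of Table IV to one another, the sketch below computes Cohen's d and the implied two-sample t statistic from the rounded means and standard deviations of the f0 mean row. It assumes the equal-n pooled-standard-deviation form of d, which may not be the exact formula used for the table; the function names are hypothetical.

```python
import math

def cohens_d(mean_1: float, sd_1: float, mean_2: float, sd_2: float) -> float:
    """Cohen's d for two equal-sized groups, using the pooled standard deviation."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd

def t_from_d(d: float, n_per_group: int) -> float:
    """Two-sample t statistic implied by d when both groups have n_per_group members."""
    return d * math.sqrt(n_per_group / 2)

# f0 mean (Hz) row of Table IV: English talkers vs Mandarin talkers, n = 20 each.
d_f0 = cohens_d(212.35, 19.25, 233.53, 29.02)
t_f0 = t_from_d(d_f0, 20)
```

With these rounded inputs the sketch gives roughly d = −0.86 and t = −2.72, close to but not identical with the tabled 0.88 and −2.73. The gap may reflect rounding of the summary values or a different computational form, so the sketch is an illustration of the relationship between the columns rather than a reproduction of the table.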

FIG. 6.

(Color online) Acoustic features of English and Mandarin talkers. Histograms display the number of talkers falling within a particular range for each of the acoustic features measured. Means, standard deviations, and differences between groups are reported in Table IV.

2. Data analysis

Many of the acoustic features we measured are subject to nonlinear mapping between physical and perceptual space. Consequently, when applicable we scaled these values into the corresponding perceptual space for comparison to listeners’ behavior: Talkers’ f0 and formant frequencies were converted to mels (Stevens et al., 1937). Variation in f0 was scaled with respect to mean f0 using the coefficient of variation (σ/μ). Perception of jitter and HNR is roughly linear over the range of values measured in our speakers (Hillenbrand, 1988), and so these values were not transformed. Dissimilarity judgment data were arcsine-transformed prior to inclusion in the representational similarity analysis models (Studebaker, 1985).
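The two scalings described above can be sketched as follows. The mel conversion uses the common 2595 · log10(1 + f/700) approximation, which is an assumption: the text cites Stevens et al. (1937) without giving a formula, and other mel formulas exist. The use of the sample (n − 1) standard deviation in the coefficient of variation, the function names, and the f0 track are likewise illustrative choices.

```python
import math
import statistics

def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mel using the 2595 * log10(1 + f / 700) approximation (assumed)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def coefficient_of_variation(values) -> float:
    """Sample standard deviation divided by the mean (sigma / mu)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical f0 samples (Hz) from one talker; not taken from the study.
f0_track_hz = [200.0, 220.0, 240.0]

f0_mean_mel = hz_to_mel(statistics.mean(f0_track_hz))
f0_variation = coefficient_of_variation(f0_track_hz)
```

Dividing by the mean makes f0 variation unitless, so a talker with a high average pitch is not counted as more variable simply because the raw excursions in Hz are larger.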

Using a representational similarity analysis technique (Kriegeskorte et al., 2008), we performed a backward stepwise regression on a linear mixed-effects model that estimated the effect of the six acoustic measures (f0 mean, f0 variation, jitter, HNR, speech rate, and formant dispersion) on each listener’s perceptual dissimilarity judgment on each within-language, different-talker trial (i.e., all the native and foreign talker pairs, but not cross-language or same-talker pairs). The model additionally contained all two-, three-, and four-way interaction terms between these continuous measures and the categorical factors of listener native language (English, Mandarin), talker language (English, Mandarin), and recording direction (time-reversed, forward). The random effects structure included by-participant intercepts and slopes for the within-subject categorical fixed effect talker language, as well as random by-item intercepts for each talker pair. A deviation contrast coding scheme was applied to all categorical factors. Backward stepwise regression was performed using the function step in the package lmerTest for both fixed and random effects, with the criterion for factor inclusion of α = 0.05.
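To make the predictor construction concrete, the sketch below derives one row of acoustic differences per unordered talker pair and shows a sum-to-zero coding for a two-level factor. Several details are assumptions not given in the text: that the differences are absolute values, that deviation codes are scaled to ±0.5, and every name and number in the example.

```python
from itertools import combinations

# Hypothetical per-talker measurements; values are illustrative, not from the study.
talkers = {
    "EV01": {"f0_mean_mel": 230.0, "hnr_db": 13.1},
    "EV02": {"f0_mean_mel": 251.5, "hnr_db": 14.0},
    "EV03": {"f0_mean_mel": 244.0, "hnr_db": 12.2},
}

def pair_differences(measurements):
    """Absolute between-talker difference on every feature, for each unordered pair."""
    rows = {}
    for a, b in combinations(sorted(measurements), 2):
        rows[(a, b)] = {
            f"delta_{name}": abs(measurements[a][name] - measurements[b][name])
            for name in measurements[a]
        }
    return rows

# Deviation (sum-to-zero) coding for a two-level factor; the +/-0.5 scale is assumed.
DEVIATION_CODES = {"English": -0.5, "Mandarin": 0.5}

differences = pair_differences(talkers)
```

Each row of `differences` would then be paired with every listener's rating of that talker pair, so the model relates the size of an acoustic difference to the size of the perceived difference.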

3. Acoustic factors affecting perceptual dissimilarity judgments of voices

Following the backward stepwise regression analysis, a number of acoustic factors and their interactions with talker- and listener-specific linguistic factors had significant effects on listeners’ perceptual dissimilarity judgments (Table V). With respect to the continuous acoustic factors, there were significant overall effects of the difference in mean f0, HNR, and formant dispersion between talkers in a pair on listeners’ dissimilarity ratings, such that larger differences in these features tended to lead to higher dissimilarity ratings. Speech rate also played a role in the model, but only in the context of its interactions with categorical factors (see below). With respect to the categorical factors there were, like before, significant effects of talker language (with higher dissimilarity ratings overall for Mandarin than English talker pairs) and recording direction (with higher dissimilarity ratings overall for forward than time-reversed recordings), but again no overall effect of listener native language.

TABLE V.

Acoustic and linguistic factors affecting perceptual dissimilarity judgments of voices.

| Model term | β | s.e. | t | df | p-value |
| --- | --- | --- | --- | --- | --- |
| Δ f0 mean (mel) | 0.0024 | 0.0002 | 9.62 | 25370 | ≪0.0001 |
| Δ HNR (dB) | 0.0087 | 0.0029 | 3.01 | 30206 | 0.0027 |
| Δ Speech rate (syllable/s) | −0.0111 | 0.0113 | −0.99 | 29896 | 0.32 |
| Δ Formant dispersion (mel) | 0.0007 | 0.0003 | 2.64 | 30103 | 0.0083 |
| Talker language | 0.0654 | 0.0244 | 2.68 | 514 | 0.0075 |
| Listener native language | −0.0609 | 0.0482 | −1.26 | 79 | 0.21 |
| Recording direction | −0.1422 | 0.0482 | −2.95 | 79 | 0.0042 |
| Δ Speech rate × talker language | 0.0075 | 0.0113 | 0.67 | 29879 | 0.51 |
| Δ Speech rate × listener native language | 0.0029 | 0.0096 | 0.31 | 29909 | 0.76 |
| Talker language × listener native language | −0.0703 | 0.0125 | −5.62 | 154 | ≪0.0001 |
| Δ HNR × recording direction | 0.0061 | 0.0025 | 2.48 | 29880 | 0.013 |
| Talker language × recording direction | 0.0117 | 0.0105 | 1.11 | 76 | 0.27 |
| Listener native language × recording direction | 0.0565 | 0.0477 | 1.18 | 76 | 0.24 |
| Δ Speech rate × talker language × listener native language | 0.0210 | 0.0096 | 2.18 | 29904 | 0.029 |
| Talker language × listener native language × recording direction | 0.0359 | 0.0105 | 3.43 | 76 | 0.0010 |


The final representational similarity analysis model included no significant two-way interactions between acoustic differences and talker or listener language. There was a significant two-way HNR × recording direction interaction, indicating that differences between talkers’ HNR had a larger influence on dissimilarity ratings of time-reversed speech than forward speech. The significant two-way interaction between the categorical variables of talker language and listener language recapitulates the effect seen in Table II and Fig. 4(A). No other two-way interactions were significant.

The reduced model included only one significant three-way interaction involving differences in speech acoustics: speech rate × talker language × listener language, such that both Mandarin and English listeners tended to rate talker pairs in their native language as more dissimilar when the difference in speech rate was smaller, but English listeners more strongly exhibited the opposite pattern in their foreign language, rating Mandarin talker pairs as more dissimilar with larger differences in speech rate. The three-way talker language × listener language × recording direction interaction parallels that in Table II.


IV. DISCUSSION

A. Perceptual dissimilarity ratings

Listeners’ perceptual dissimilarity judgments evinced a number of consistent patterns that held across both listeners’ native language background and manipulation via time-reversal. On average, listeners judged cross-language pairs to be most dissimilar, suggesting that listeners may generally expect speech in different languages to come from different talkers, and that, even when time-reversed, the acoustic qualities of speech in English and Mandarin are sufficiently different to give the consistent impression that these recordings come from different talkers. This is further supported by the numerous significant differences in low-level acoustic features between the two talker groups, including mean vocal pitch, voice quality measures, formant dispersion, and speech rate (Table IV). Differences in the low-level acoustic measurements between the two talker groups appear to be reflected in listeners’ heightened perceptual dissimilarity judgments across languages [Fig. 3(D)].

It is also important to note that, while differing significantly in mean dissimilarity, the distributions of dissimilarity ratings for cross-language voice pairs and same-language voice pairs are nonetheless highly overlapping [Fig. 3(C)]. This indicates that mere differences in the language being spoken do not uniquely determine the extent to which pairs of voices will be heard as sounding similar (even for natural speech stimuli) and that the features giving rise to perceptual dissimilarity of voices must transcend the linguistic typology or content of speech. While these observations expand on those reported by Fleming and colleagues (2014), who also used English and Mandarin voices, future work is needed to explore how these observations generalize to pairings of other languages, as well as how listeners might judge dissimilarity of same-talker/cross-language pairs, which were not included in the design of Fleming and colleagues or this replication (cf. Winters et al., 2008).

Relatedly, listeners gave the lowest perceptual dissimilarity ratings to pairs of recordings coming from the same talker, regardless of talker language, listener language, or whether the recordings had been time-reversed. This strongly suggests that, for any given speaker, the individuating acoustic features of their voice are largely robust to disruption by time-reversal and speech content, and are largely preserved across linguistic differences in listeners’ experiences with voices. Moreover, the distributions of mean dissimilarity ratings for same-voice pairs and cross-voice pairs are essentially non-overlapping across listeners and languages [Fig. 3(C)], further suggesting that listeners’ perception of the distinct acoustic features associated with a particular talker is, in fact, highly discriminative, even when asked to judge subjective similarity. That is, across recordings and even distortion via time-reversal, any given talker is, on average, much more likely to sound like herself than like any other talker, even talkers of the same language, and moreover even when that language is completely unfamiliar to listeners. This reveals that listeners are highly perceptually sensitive to the individuating acoustic features of voices, even while they are simultaneously challenged in their ability to learn to associate those features with a particular talker’s identity (Perrachione, 2018; Perrachione et al., 2015). This observation adds compelling further evidence that the discrimination and identification of voices are, by and large, two fundamentally separate abilities (Van Lancker and Kreiman, 1987; Perrachione et al., 2014; Fecher and Johnson, 2018), which should raise caution in ascribing causal mechanisms to phenomena in talker identification from data derived from different tasks. For instance, the language-familiarity effect is widely attested in talker identification (e.g., Goggin et al., 1991; Perrachione and Wong, 2007; Bregman and Creel, 2014; Xie and Myers, 2015; inter alia), but it is not clear that we can use results from other tasks, such as talker discrimination (Johnson et al., 2011; Wester, 2012) or perceptual dissimilarity judgments (Fleming et al., 2014), to identify the causal mechanisms behind superior native-language talker identification abilities. Effects from these other tasks are unlikely to yield dispositive evidence about the underlying cognitive or perceptual mechanisms at play in talker identification (Levi, 2018). This view is further endorsed by a recent study of the development of the language-familiarity effect, showing that even when listeners exhibit a robust native-language bias in talker identification, they may show no effect of language on talker discrimination (Fecher and Johnson, 2018).

Finally, unlike the prior report of Fleming and colleagues (2014), we did not observe a consistent difference in listeners’ perceptual dissimilarity judgments for native- vs foreign-language talker pairs. While this overall effect was newly found in natural speech recordings (albeit inconsistently between the two groups; see below), we failed to replicate the prior observation of higher dissimilarity ratings for recordings of time-reversed native-language talkers compared to time-reversed foreign-language ones. This result is inconsistent with the view that the phonological features giving rise to the language-familiarity effect are reliably present even in time-reversed speech. Instead, this result parallels the observation that talker identification from time-reversed voices is also less susceptible to the language-familiarity effect (Perrachione et al., 2015), suggesting instead that listeners’ perceptual and mnemonic processing of time-reversed voices may be largely independent of any linguistic (i.e., phonological or lexical) features in the speech of talkers or language-specific representations in the minds of listeners. Because we failed to replicate the core finding of Fleming and colleagues (2014), in Sec. IVB, we explore the patterns of listener- and talker-language effects in perceptual dissimilarity judgments of voices in greater detail.

B. A language-familiarity effect in perceptual judgments of voice dissimilarity?

Although prior reports have suggested that listeners’ perceptual dissimilarity judgments of voices reveal a phonological basis for the language-familiarity effect, there is no direct evidence for preservation of language-specific phonological features in time-reversed speech. Our observation that time-reversed native-language voices are not consistently judged to be more dissimilar than foreign-language ones also calls this interpretation into question, particularly since the language-familiarity effect has otherwise been so widely replicated in talker identification tasks (as reviewed in Perrachione, 2018; Levi, 2018). How, then, might linguistic factors affect perceptual dissimilarity judgments of voices?

Native speakers of both Mandarin and English judged time-reversed Mandarin voices to be more dissimilar than time-reversed English voices [Fig. 4(A)]. If language familiarity affects perceptual dissimilarity judgments and talker identification abilities in the same way, then this pattern of perceptual dissimilarity judgments by English-speaking listeners is unexpected and inconsistent with that hypothesis. For the time-forward voices, too, English-speaking listeners found English-speaking voices no more dissimilar than Mandarin-speaking ones, notwithstanding their widely reported difficulty learning to identify Mandarin talkers (Perrachione and Wong, 2007; Perrachione et al., 2009; Perrachione et al., 2011; Perrachione et al., 2015; McLaughlin et al., 2015; Xie and Myers, 2015; Zarate et al., 2015; McLaughlin et al., 2019). Instead, this result suggests that perceptual dissimilarity judgments of voices may ultimately have more to do with acoustic differences in the voices of talkers than with linguistic differences in the minds of listeners.

Turning to our novel analysis of perceptual dissimilarity judgments of same-talker voice pairs, our results again do not suggest a language-familiarity effect in perceptual dissimilarity judgments. Mandarin- and English-speaking listeners do not reliably judge foreign-language, same-talker voice pairs as sounding more dissimilar than native-language, same-talker voice pairs, which is the pattern we would expect if listeners were more sensitive to the distinguishing features of voices in their native language. This pattern of results is observed in both time-reversed and natural recordings, where talker identity should have been more easily ascertained (Sheffert et al., 2002; Remez et al., 2007; Perrachione et al., 2014; Perrachione et al., 2015): neither English- nor Mandarin-speaking listeners appear to favor their native language, contrary to the predictions of a language-familiarity effect for voice dissimilarity judgments. This suggests that listeners are actually quite good at “telling together” foreign-language voices (e.g., Lavan et al., 2019a; Lavan et al., 2019b), even when they otherwise struggle to learn to identify those same voices (e.g., McLaughlin et al., 2019).

Taken together, these results provide little evidence for a language-familiarity effect on perceptual dissimilarity judgments of voices. The expected pattern of results—that different-talker voice pairs will be more dissimilar, and same-talker voice pairs will be more similar, in a listener’s native language—is infrequently attested and the exact opposite pattern of results is sometimes found instead. Even where the pattern seems consistent with the hypothesis—for instance, in the Mandarin listeners’ judgments of greater perceptual dissimilarity for time-reversed Mandarin voices—the hypothesis is rejected by other data, namely the identical-but-unexpected pattern of judgment by English listeners. Instead, an alternative hypothesis—that perceptual dissimilarity judgments of voices depend more on the acoustic features of the voices themselves than on perceptual biases arising from listeners’ long-term experiences—appears more tenable.

C. Consistency of perceived dissimilarity across listeners and comprehensibility

In the language-familiarity effect, listeners are not only less accurate at learning to identify voices speaking a foreign language, they also exhibit different patterns of talker identification errors than native-language listeners (Perrachione and Wong, 2007). If listeners identify voices based on the same features they use when judging their perceptual dissimilarity, we would expect to find divergence between the dissimilarity judgments of listeners of different backgrounds. Instead, we found these judgments to be remarkably consistent across listeners of different language backgrounds (Fig. 5). Pairs of talkers thought to sound more similar by Mandarin-speaking listeners were also thought to sound more similar by English-speaking listeners, regardless of whether those voices were speaking English or Mandarin, and regardless of whether those voices were forward or time-reversed. Indeed, perception of talker dissimilarity was also robust to time-reversal, with pairs of voices judged to sound more similar in time-forward speech also judged to sound more similar in time-reversed speech. Together, these results suggest that perceptual dissimilarity judgments of voices are likely to be made on the basis of acoustic features of speech and voice that are independent of speech comprehensibility (or even naturalness), and which are largely unaffected by the linguistic structure of speech. Correspondingly, we next explored the possibility that language-independent acoustic features form the basis for perceptual dissimilarity judgments of voices.
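The consistency between listener groups described above can be illustrated with a small representational-similarity-style comparison: take each group's talker-by-talker dissimilarity matrix, keep the off-diagonal upper triangle, and rank-correlate the two. The following is only a minimal sketch under stated assumptions. The function names, the choice of Spearman correlation, and the toy rating matrices are all hypothetical and are not taken from the study.

```python
from itertools import combinations


def upper_triangle(matrix):
    """Flatten the off-diagonal upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rdm_similarity(rdm_a, rdm_b):
    """Spearman correlation between two dissimilarity matrices."""
    a = average_ranks(upper_triangle(rdm_a))
    b = average_ranks(upper_triangle(rdm_b))
    return pearson(a, b)


# Invented mean dissimilarity ratings for four talkers (illustration only).
english_listeners = [
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
mandarin_listeners = [
    [0.0, 0.3, 0.8, 1.0],
    [0.3, 0.0, 0.7, 0.9],
    [0.8, 0.7, 0.0, 0.4],
    [1.0, 0.9, 0.4, 0.0],
]
rho = rdm_similarity(english_listeners, mandarin_listeners)
print(f"between-group rank correlation: {rho:.3f}")
```

A rank correlation is used here because it only asks whether the two groups order the talker pairs the same way, not whether they use the rating scale identically.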

D. Acoustic features in perceptual dissimilarity judgments of voices

Across both listener groups, both talker languages, and both forward and time-reversed speech, differences in talkers’ mean fundamental frequency were most strongly related to listeners’ dissimilarity judgments. For a pair of recordings in which talkers exhibited greater differences in mean f0, listeners were more likely to rate that pair of voices as sounding dissimilar, all other factors notwithstanding. Other acoustic factors were also related to listeners’ judgments of talker dissimilarity, including HNR and formant dispersion. As acoustic measures, HNR may capture perceptual correlates of voice quality related to periodicity in the glottal cycle, while formant dispersion indexes vocal tract length—both individuating acoustic features of the voice that should be preserved across time-reversal and readily salient in both languages.
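To make the acoustic predictors concrete, the sketch below computes formant dispersion as the mean spacing between adjacent formants (the definition associated with Fitch, 1997) and then forms the between-talker differences in mean f0 and formant dispersion for one pair. The talker values are invented for illustration, and the dictionary layout and function names are assumptions; this section does not specify how the study's features were extracted.

```python
def formant_dispersion(formants_hz):
    """Mean spacing in Hz between adjacent formant frequencies."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    ordered = sorted(formants_hz)
    gaps = [hi - lo for lo, hi in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def pair_differences(talker_a, talker_b):
    """Absolute between-talker differences in two acoustic features."""
    return {
        "mean_f0": abs(talker_a["mean_f0"] - talker_b["mean_f0"]),
        "formant_dispersion": abs(
            formant_dispersion(talker_a["formants"])
            - formant_dispersion(talker_b["formants"])
        ),
    }


# Hypothetical talkers: mean f0 in Hz and the first four formants in Hz.
talker_a = {"mean_f0": 120.0, "formants": [500.0, 1500.0, 2500.0, 3500.0]}
talker_b = {"mean_f0": 210.0, "formants": [560.0, 1680.0, 2800.0, 3920.0]}

diffs = pair_differences(talker_a, talker_b)
print(diffs)
```

Wider formant spacing corresponds to a shorter vocal tract, which is why the measure can serve as a talker-individuating cue that does not depend on the language spoken.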

If language familiarity affects listeners’ dissimilarity judgments of voices through differential familiarity with the pattern of low-level acoustic features in their native language vs a foreign language, we would expect to see significant interactions between listeners’ native language and the various acoustic measures affecting perceptual dissimilarity judgments. In fact, we observed only one such interaction, relating not to a low-level acoustic property of talkers’ voices, but rather to speech rate: both Mandarin and English listeners tended to actively discount differences in speech rate as an index of dissimilarity in their native languages; however, English listeners appeared to change their strategy and rely on this as a measure of dissimilarity when listening to Mandarin. Interestingly, this difference did not appear to be affected by time-reversal, suggesting it is not related to comprehensibility—that is, Mandarin listeners could understand the natural speech in both languages, whereas English listeners would find only natural English speech comprehensible.

The only other acoustic factor that interacted with our linguistic manipulations was HNR, which played a larger role in dissimilarity judgments for time-reversed voices than time-forward ones. The HNR measurement should capture perceptual qualities related to the periodicity of the vocal source, which should be preserved under time-reversal, even while other phonetic and phonological relationships are obfuscated by that manipulation. That HNR played a greater role in perceptual dissimilarity judgments under time-reversal in both languages may suggest that listeners increased their reliance on this cue in the absence of other phonetic features on which they would usually base their judgments. This may be related to the observation that judgments of dissimilarity were more extreme for natural recordings compared to time-reversed ones, where listeners would have found fewer familiar phonetic features from which to make their judgment.

E. What can we learn about perception of voices from subjective judgments?

If perceptual dissimilarity judgments of voices are unlikely to reveal the cognitive bases of the language-familiarity effect in talker identification, then what else can we learn about human voice processing from this method? Our experience conducting this study suggests that this kind of paradigm has several limitations that constrain its utility in answering theoretical questions about voice processing.

First, listeners appear predisposed to make judgments at the extreme ends of the response range [Fig. 3(B)]. Listeners are, overall, very good at discriminating whether two voices are the same or different (e.g., Wester, 2012), and, consequently, their judgments of dissimilarity often take a binary form: Listeners’ responses primarily encode whether they believe two stimuli are the same voice or not, even when asked to judge voice dissimilarity on a continuous scale. Thus, this paradigm has limited sensitivity for identifying cognitive or acoustic factors in voice perception.

Second, listeners’ dissimilarity judgments may not be based on the same features for every trial. For example, listeners may largely rely on mean f0 in dissimilarity judgments, but then lean on differences in other features like speech rate or formant dispersion only when differences in mean f0 are particularly small. Listeners may also exhibit nonlinear correspondences between acoustic features and their dissimilarity judgments, such that a linear model—even accounting for perceptual warping of acoustic space as we have done—will fail to adequately model how listeners map perceptual space to subjective dissimilarity for a complex acoustic stimulus like a voice (e.g., Kreiman and Sidtis, 2011).
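As an illustration of what perceptual warping of an acoustic dimension means, the sketch below converts frequency in Hz to mels with a commonly used analytic approximation, 2595 · log10(1 + f/700). This particular formula is an assumption chosen for illustration; the passage above does not say which transform the study applied. The point is only that the same 20 Hz difference in mean f0 corresponds to a larger perceptual step at low frequencies than at high ones.

```python
import math


def hz_to_mel(freq_hz):
    """Approximate mel value for a frequency in Hz (assumed formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def warped_difference(f0_a_hz, f0_b_hz):
    """Absolute difference between two f0 values after mel warping."""
    return abs(hz_to_mel(f0_a_hz) - hz_to_mel(f0_b_hz))


# The same 20 Hz gap at a low and at a higher pitch range.
low_pair = warped_difference(100.0, 120.0)
high_pair = warped_difference(200.0, 220.0)
print(f"100 vs 120 Hz: {low_pair:.1f} mel")
print(f"200 vs 220 Hz: {high_pair:.1f} mel")
```

Even after such warping, the relation between a feature difference and a rating is modeled as linear, which is the limitation the paragraph above describes.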

Third, the interpretability of experimental paradigms involving time-reversal of speech is limited by the extent to which this manipulation is unnatural, both in terms of listeners’ experience with such stimuli and the ecological validity of the resulting acoustic features themselves. Ecologically invalid acoustic features of time-reversed speech include the physiologically impossible reversed shape of the glottal waveform, the unnatural trajectory of formant transitions, the unnatural trajectory of the amplitude contour, the presence of uncommon or phonotactically impermissible phoneme sequences, and so on. In the present work, we found that listeners appeared to rely primarily on salient, non-articulatory acoustic cues (e.g., mean pitch) when judging perceived dissimilarity for speech. Reliance on such cues, which convey only a fraction of talker identity (e.g., Remez et al., 1997; Perrachione et al., 2014), may suggest why listeners tend to perform so poorly when learning to identify talkers from time-reversed speech (Sheffert et al., 2002) and why they may fail to exhibit a language-familiarity effect from such stimuli (Perrachione et al., 2015).

Fourth, just because listeners believe two voices sound similar—or dissimilar—does not necessarily mean they will be more or less likely to confuse those voices during discrimination, recognition, or identification tasks. It is possible that the acoustic cues that listeners prioritize when rating voice dissimilarity differ from those that underlie the holistic auditory gestalt that gives rise to their perception of voice identity (e.g., Kreiman and Sidtis, 2011). It remains to be seen whether subjective judgments of voice dissimilarity correspond to objective measurements of listeners’ talker discrimination or talker identification skills.

Finally, it is worth noting that the relatively large number of cross-language pairs may have introduced bias in listeners’ dissimilarity judgments. Because cross-language pairs were consistently more different than within-language pairs—both in terms of their acoustics and listeners’ judgments of them—their relative overrepresentation in this design may have “anchored” listeners’ expectations about the magnitude of differences they could expect, leading them to report greater within-language similarity than they otherwise might have. Likewise, following Fleming and colleagues (2014), we did not include cross-language/same-talker pairs in our design. How listeners would rate the dissimilarity of recordings of the same talker speaking different languages remains to be seen. (A change in language between talker familiarization and later recognition does appear to reduce recognition accuracy, revealing that some cues to talker identity are likely to be language-specific; Winters et al., 2008.)

F. Potential alternatives for more effective measurements of talker dissimilarity judgments

If subjective perceptual dissimilarity ratings are encumbered by limitations on their generalizability and ecological validity, and if other laboratory tasks, such as talker identification or discrimination, are also limited in their ability to tell us about ecological voice perception due to their disproportionate demands on long-term memory or low-level acoustic analysis, respectively, then what other options may be available to develop a more ecological understanding of voice dissimilarity judgments in future studies? A key approach will be to treat voices in the laboratory the same way we do psychologically: holistically. Rather than force listeners into paradigms where they seem to make decisions based on salient, low-level acoustic features, future paradigms can require listeners to make judgments based on the vocal gestalt. For instance, a recently developed task asks listeners to indicate when a talker changes (not just whether), allowing the sensitive measure of response time to reveal subtle differences in how listeners detect differences between voices (Sharma et al., 2019). Combining ratings of perceived voice dissimilarity with a change detection task can reveal whether and how dissimilarity measures are related to ecological voice perception behaviors, such as switching attention to a new talker in a mixed-talker setting.

In applying a more holistic approach to studying perceived dissimilarity, listeners could be presented with two pairs of voices and be required to indicate which pair is more dissimilar, thus reducing their tendency to provide an extreme dissimilarity rating for any pair of voices separately. Responses can be analyzed using approaches from psychological models of comparative judgment (Thurstone, 1927) and then related to stimulus acoustics. Beyond a winner-takes-all approach, the relative dissimilarity of two pairs of voices can be analyzed using magnitude estimation methods borrowed from psychophysics (e.g., Poulton, 1968), which have seen similar utility in other fields of linguistics (e.g., Bard et al., 1996) where listeners are also otherwise biased toward the extreme end of a bounded rating scale. Finally, researchers can undertake targeted examinations of acoustic features, in isolation or combination, which they hypothesize underlie dissimilarity judgments of voices through sophisticated acoustic resynthesis as available in software packages like STRAIGHT (Kawahara et al., 2008). Hypothesis-driven approaches to acoustic manipulation have already done much to inform the psychological foundations of voice perception (e.g., Latinus and Belin, 2011a; Latinus et al., 2013); future work will be able to use similar approaches to examine how acoustic and linguistic factors (including differences between the languages spoken by talkers and listeners) interact in listeners’ perception of voices.
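One way the comparative-judgment idea could be analyzed is Thurstone's Case V scaling: convert the proportion of trials on which each voice pair was chosen as "more dissimilar" into z-scores and average them to place every pair on a common interval scale. The sketch below is a minimal illustration under stated assumptions. The Case V simplification (equal, uncorrelated discriminal dispersions), the clipping of extreme proportions, and the choice counts are all hypothetical and are not drawn from the study.

```python
from statistics import NormalDist


def thurstone_case_v(wins, clip=0.01):
    """Scale values from a paired-comparison count matrix.

    wins[i][j] is how often item i was chosen as more dissimilar than item j.
    Proportions are clipped to [clip, 1 - clip] so the z-scores stay finite.
    """
    n = len(wins)
    inverse_cdf = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i == j or total == 0:
                continue  # no information for this comparison
            proportion = wins[i][j] / total
            proportion = min(max(proportion, clip), 1.0 - clip)
            z_sum += inverse_cdf(proportion)
        scale.append(z_sum / n)
    return scale


# Hypothetical counts for three voice pairs (A, B, C), 10 trials per comparison.
wins = [
    [0, 8, 9],  # pair A chosen over B 8 times, over C 9 times
    [2, 0, 7],  # pair B chosen over A 2 times, over C 7 times
    [1, 3, 0],  # pair C chosen over A once, over B 3 times
]
scale = thurstone_case_v(wins)
for label, value in zip("ABC", scale):
    print(f"voice pair {label}: {value:+.3f}")
```

The resulting scale values could then be related to acoustic differences between the voices, in place of the bounded ratings that listeners tend to push toward the extremes.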

V. CONCLUSIONS

(i) Listeners’ perceptual dissimilarity judgments of voices provide weak and inconsistent evidence of a language-familiarity effect in voice processing, especially compared to the effect sizes reported in prior literature using talker identification or talker discrimination tasks. (ii) Overall, listeners of different language backgrounds tend to make perceptual judgments of voice dissimilarity that are more alike than different, regardless of whether they are listening to voices in their native language or whether the voices had been rendered incomprehensible by time-reversal. (iii) Of the acoustic features analyzed here, mean f0 tended to have the greatest effect on listeners’ judgments of voice dissimilarity regardless of the language spoken by talkers or listeners; however, listeners’ judgments of voice dissimilarity do not appear to map neatly onto isolated acoustic features. (iv) These results ultimately suggest that the language-familiarity effect in voice processing is more likely to be due to linguistic or mnemonic bases than perceptual ones.

ACKNOWLEDGMENTS

We thank Deirdre McLaughlin, Gabriel Cler, Yaminah Carter, Sara Dougherty, Jennifer Golditch, Andrea Chang, Cecilia Cheng, and Sung-Joo Lim. Research reported in this article was supported by the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health under Grant No. R03DC014045 and a Young Investigator Award from the Brain and Behavior Research Foundation to T.P.

Footnotes

1. As noted during peer review, a counterbalancing error for one participant led to three too many same-identity trials, one too few cross-language trials, and two too few native-language trials. Unique items were nonetheless heard on all trials, and the correct total number of trials was heard. This error affected 0.006% of the data collected.

References

1. Bard, E. G., Robertson, D., and Sorace, A. (1996). “Magnitude estimation of linguistic acceptability,” Language 72, 32–68. doi:10.2307/416793

2. Baumann, O., and Belin, P. (2010). “Perceptual scaling of voice identity: Common dimensions for different vowels and speakers,” Psychol. Res. 74, 110–120. doi:10.1007/s00426-008-0185-z

3. Bregman, M. R., and Creel, S. C. (2014). “Gradient language dominance affects talker learning,” Cognition 130, 85–95. doi:10.1016/j.cognition.2013.09.010

4. Davis, S. B. (1981). “Acoustical characteristics of normal and pathological voices,” ASHA Rep. 11, 97–115.

5. Davison, D. S. (1991). “An acoustic study of so-called creaky voice in Tianjin Mandarin,” UCLA Work. Pap. Phonetics 78, 50–57.

6. Fecher, N., and Johnson, E. K. (2018). “Effects of language experience and task demands on talker recognition by children and adults,” J. Acoust. Soc. Am. 143, 2409–2418. doi:10.1121/1.5032199

7. Fitch, W. T. (1997). “Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques,” J. Acoust. Soc. Am. 102, 1213–1222. doi:10.1121/1.421048

8. Fleming, D., Giordano, B. L., Caldara, R., and Belin, P. (2014). “A language-familiarity effect for speaker discrimination without comprehension,” Proc. Natl. Acad. Sci. 111, 13795–13798. doi:10.1073/pnas.1401383111

9. Fu, Q.-J., Zhu, M., and Wang, X. (2011). “Development and validation of the Mandarin speech perception test,” J. Acoust. Soc. Am. 129(6), EL267–EL273. doi:10.1121/1.3590739

10. Ganugapati, D., and Theodore, R. M. (2019). “Structured phonetic variation facilitates talker identification,” J. Acoust. Soc. Am. 145, EL469–EL475. doi:10.1121/1.5100166

11. Goggin, J. P., Thompson, C. P., Strube, G., and Simental, L. R. (1991). “The role of language familiarity in voice identification,” Mem. Cognit. 19, 448–458. doi:10.3758/BF03199567

12. Hillenbrand, J. (1988). “Perception of aperiodicities in synthetically generated voices,” J. Acoust. Soc. Am. 83, 2361–2371. doi:10.1121/1.396367

13. IEEE (1969). “IEEE recommended practice for speech quality measurements,” IEEE Trans. Audio Electroacoust. 17(3), 225–246. doi:10.1109/TAU.1969.1162058

14. Johnson, E. K., Westrek, E., Nazzi, T., and Cutler, A. (2011). “Infant ability to tell voices apart rests on language experience,” Dev. Sci. 14, 1002–1011. doi:10.1111/j.1467-7687.2011.01052.x

15. Karnell, M. P., Melton, S. D., Childes, J. M., Coleman, T. C., Dailey, S. A., and Hoffman, H. T. (2007). “Reliability of clinician-based (GRBAS and CAPE-V) and patient-based (V-RQOL and IPVI) documentation of voice disorders,” J. Voice 21, 576–590. doi:10.1016/j.jvoice.2006.05.001

16. Kawahara, H., Morise, M., Takahashi, T., Nisimura, R., Irino, T., and Banno, H. (2008). “Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation,” in Proc. ICASSP 2008, Las Vegas, pp. 3933–3936.

17. Keating, P., and Esposito, C. (2007). “Linguistic voice quality,” UCLA Work. Pap. Phonetics 105, 85–91.

18. Kempster, G. B., Gerratt, B. R., Verdolini Abbott, K., Barkmeier-Kramer, J., and Hillman, R. E. (2009). “Consensus auditory-perceptual evaluation of voice: Development of a standardized clinical protocol,” Am. J. Speech-Lang. Pathol. 18, 124–132. doi:10.1044/1058-0360(2008/08-0017)

19. Kreiman, J., Gerratt, B. R., and Khan, S. U. (2010). “Effects of native language on perception of voice quality,” J. Phonetics 38, 588–593. doi:10.1016/j.wocn.2010.08.004

20. Kreiman, J., Gerratt, B. R., and Precoda, K. (1990). “Listener experience and perception of voice quality,” J. Speech Hear. Res. 33, 103–115. doi:10.1044/jshr.3301.103

21. Kreiman, J., and Sidtis, D. (2011). Foundations of Voice Studies: An Interdisciplinary Approach to Voice Production and Perception (Wiley-Blackwell, Malden, MA).

22. Kreiman, J., Vanlancker-Sidtis, D., and Gerratt, B. (2005). “Perception of voice quality,” in The Handbook of Speech Perception, edited by D. B. Pisoni and R. E. Remez (Blackwell, Malden, MA).

23. Kriegeskorte, N., Mur, M., and Bandettini, P. (2008). “Representational similarity analysis - Connecting the branches of systems neuroscience,” Front. Syst. Neurosci. 2, 4. doi:10.3389/neuro.01.016.2008

24. Latinus, M., and Belin, P. (2011a). “Anti-voice adaptation suggests prototype-based coding of voice identity,” Front. Psychol. 2, 175. doi:10.3389/fpsyg.2011.00175

25. Latinus, M., and Belin, P. (2011b). “Human voice perception,” Curr. Biol. 21(4), R143–R145. doi:10.1016/j.cub.2010.12.033

26. Latinus, M., McAleer, P., Bestelmeyer, P. E. G., and Belin, P. (2013). “Norm-based coding of voice identity in human auditory cortex,” Curr. Biol. 23, 1075–1080. doi:10.1016/j.cub.2013.04.055

27. Lavan, N., Burston, L. F. K., and Garrido, L. (2019a). “How many voices did you hear? Natural variability disrupts identity perception from unfamiliar voices,” Br. J. Psychol. 110, 576–593. doi:10.1111/bjop.12348

28. Lavan, N., Burton, A. M., Scott, S. K., and McGettigan, C. (2019b). “Flexible voices: Identity perception from variable vocal signals,” Psychonom. Bull. Rev. 26, 90–102. doi:10.3758/s13423-018-1497-7

29. Lavan, N., Merriman, S. E., Ladwa, P., Burston, L. F. K., Knight, S., and McGettigan, C. (2018). “‘Please sort these sounds into 2 identities’: Effects of task instructions on performance in voice sorting studies,” Br. J. Psychol. (published online). doi:10.1111/bjop.12416

30. Levi, S. V. (2018). “Methodological considerations for interpreting the language familiarity effect in talker processing,” WIREs Cogn. Sci. 10, e1483. doi:10.1002/wcs

31. McLaughlin, D. E., Carter, Y. D., Cheng, C. C., and Perrachione, T. K. (2019). “Hierarchical contributions of linguistic knowledge to talker identification: Phonological versus lexical familiarity,” Atten. Percept. Psychophys. 81, 1088–1107. doi:10.3758/s13414-019-01778-5

32. McLaughlin, D. E., Dougherty, S. C., Lember, R. A., and Perrachione, T. K. (2015). “Episodic memory for words enhances the language familiarity effect in talker identification,” in 18th International Congress of Phonetic Sciences, August, Glasgow.

33. Meissner, C. A., and Brigham, J. C. (2001). “Thirty years of investigating the own-race bias in memory for faces: A meta-analytic review,” Psychol. Public Policy Law 7, 3–35. doi:10.1037/1076-8971.7.1.3

34. Mok, P. (2008). “On the syllable-timing of Cantonese and Beijing Mandarin,” in Proceedings of the 8th Phonetics Conference of China (PCC 2008) and the International Symposium on Phonetic Frontiers (ISPF 2008), Beijing.

35. Peirce, J. W. (2007). “PsychoPy - Psychophysics software in Python,” J. Neurosci. Methods 162(1), 8–13. doi:10.1016/j.jneumeth.2006.11.017

36. Perrachione, T. K. (2018). “Recognizing speakers across languages,” in The Oxford Handbook of Voice Perception, edited by S. Frühholz and P. Belin (Oxford University Press, Oxford).

37. Perrachione, T. K., Del Tufo, S. N., and Gabrieli, J. D. E. (2011). “Human voice recognition depends on language ability,” Science 333, 595. doi:10.1126/science.1207327

38. Perrachione, T. K., Dougherty, S. C., McLaughlin, D. E., and Lember, R. A. (2015). “The effects of speech perception and speech comprehension on talker identification,” in 18th International Congress of Phonetic Sciences, August, Glasgow.

39. Perrachione, T. K., Pierrehumbert, J. B., and Wong, P. C. M. (2009). “Differential neural contributions to native- and foreign-language talker identification,” J. Exp. Psychol. Hum. Percept. Perform. 35, 1950–1960. doi:10.1037/a0015869

40. Perrachione, T. K., Stepp, C. E., Hillman, R. E., and Wong, P. C. M. (2014). “Talker identification across source mechanisms: Experiments with laryngeal and electrolarynx speech,” J. Speech Lang. Hear. Res. 57, 1651–1665. doi:10.1044/2014_JSLHR-S-13-0161

41. Perrachione, T. K., and Wong, P. C. M. (2007). “Learning to recognize speakers of a non-native language: Implications for the functional organization of human auditory cortex,” Neuropsychologia 45, 1899–1910. doi:10.1016/j.neuropsychologia.2006.11.015

42. Pisanski, K., Fraccaro, P. J., Tigue, C. C., O’Connor, J. J. M., Roder, S., Andrews, P. W., Fink, B., DeBruine, L. M., Jones, B. C., and Feinberg, D. R. (2014). “Vocal indicators of body size in men and women: A meta-analysis,” Anim. Behav. 95, 89–99. doi:10.1016/j.anbehav.2014.06.011

43. Poulton, E. C. (1968). “The new psychophysics: Six models for magnitude estimation,” Psychol. Bull. 69, 1–19. doi:10.1037/h0025267

44. Remez, R. E., Fellowes, J. M., and Nagel, D. S. (2007). “On the perception of similarity among talkers,” J. Acoust. Soc. Am. 122, 3688–3696. doi:10.1121/1.2799903

45. Remez, R. E., Fellowes, J. M., and Rubin, P. E. (1997). “Talker identification based on phonetic information,” J. Exp. Psychol. Hum. Percept. Perform. 23, 651–666. doi:10.1037/0096-1523.23.3.651

46. Schweinberger, S. R., Kawahara, H., Simpson, A. P., Skuk, V. G., and Zäske, R. (2014). “Speaker perception,” WIREs Cogn. Sci. 5, 15–25. doi:10.1002/wcs.1261

47. Schweinberger, S. R., and Zäske, R. (2018). “Perceiving speaker identity from the voice,” in The Oxford Handbook of Voice Perception, edited by S. Frühholz and P. Belin (Oxford University Press, Oxford).

48. Sharma, N. K., Ganesh, S., Ganapathy, S., and Holt, L. L. (2019). “Talker change detection: A comparison of human and machine performance,” J. Acoust. Soc. Am. 145, 131–142. doi:10.1121/1.5084044

49. Sheffert, S. M., Pisoni, D. B., Fellowes, J. M., and Remez, R. E. (2002). “Learning to recognize talkers from natural, sinewave, and reversed speech samples,” J. Exp. Psychol. Hum. Percept. Perform. 28, 1447–1469. doi:10.1037/0096-1523.28.6.1447

50. Shih, C. L. (1988). “Tone and intonation in Mandarin,” Work. Pap. Cornell Phonetic Lab. 3, 83–109.

51. Slifka, J. (2007). “Irregular phonation and its preferred role as a cue to silence in phonological systems,” in 16th International Congress of Phonetic Sciences, August, Saarbrücken.

52. Stevens, S. S., Volkmann, J., and Newman, E. B. (1937). “A scale for the measurement of the psychological magnitude pitch,” J. Acoust. Soc. Am. 8, 185–190. doi:10.1121/1.1915893

53. Studebaker, G. A. (1985). “A ‘rationalized’ arcsine transform,” J. Speech Hear. Res. 28, 455–462. doi:10.1044/jshr.2803.455

54. Thurstone, L. L. (1927). “A law of comparative judgment,” Psychol. Rev. 34, 273–286. doi:10.1037/h0070288

55. Van Lancker, D., and Kreiman, J. (1987). “Voice discrimination and recognition are separate abilities,” Neuropsychologia 25, 829–834. doi:10.1016/0028-3932(87)90120-5

56. Werker, J. F., and Tees, R. C. (1984). “Cross-language speech perception: Evidence for perceptual reorganization during the first year of life,” Infant Behav. Dev. 7, 49–63. doi:10.1016/S0163-6383(84)80022-3

57. Wester, M. (2012). “Talker discrimination across languages,” Speech Commun. 54, 781–790. doi:10.1016/j.specom.2012.01.006

58. Winters, S. J., Levi, S. V., and Pisoni, D. B. (2008). “Identification and discrimination of talkers across languages,” J. Acoust. Soc. Am. 123, 4524–4538. doi:10.1121/1.2913046

59. Xie, X., and Myers, E. (2015). “The impact of musical training and tone language experience on talker identification,” J. Acoust. Soc. Am. 137, 419–432. doi:10.1121/1.4904699

60. Zarate, J. M., Tian, X., Woods, K. J. P., and Poeppel, D. (2015). “Multiple levels of linguistic and paralinguistic features contribute to voice recognition,” Sci. Rep. 5, 11475. doi:10.1038/srep11475

61. Zraick, R. I., Kempster, G. B., Connor, N. P., Thibeault, S., Klaben, B. K., Bursac, Z., Thrush, C. R., and Glaze, L. E. (2011). “Establishing validity of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V),” Am. J. Speech-Lang. Pathol. 20, 14–22. doi:10.1044/1058-0360(2010/09-0105)

17 Jan

Comitivas of Cattle Drovers in the Pantanal

COMITIVAS OF CATTLE DROVERS IN THE PANTANAL OF MATO GROSSO DO SUL:
WAY OF LIFE AND READING OF THE LANDSCAPE

To the drovers of the Pantanal, who inspired me so much along the path of this research with their beauty, wisdom, and courage.

To be great, be whole: nothing of yours exaggerate or exclude. Be all in each thing. Put all that you are into the least that you do. So in every lake the whole moon shines, because it lives on high.

Fig. 1 – Sr. Zé Preto taking the herd across the Cerradinho River. Abobral. Accompanying the second Comitiva.

ABSTRACT

This dissertation addresses the way of life and the reading of the landscape of the drovers (boiadeiros) of the Pantanal complex in Mato Grosso do Sul. Drovers are part of the cattle-ranching workforce, an important economic activity in the region. Mounted on burros, they cross varied landscapes on journeys that can last months, driving large herds of cattle that belong to ranchers. Because little material is available in the literature, accounts were collected mainly through interviews with local interlocutors, their life histories, and in-person accompaniment of drovers' Comitivas. To understand the subject, the study adopted the concept of landscape as place in the context of traditional populations, considering the meaning given by lived experience and symbolic representations. Geertz's (1989) contextualized description provided methodological support for grounding the fieldwork and helping to interpret the data. The aim was thus to sketch the drover's cultural universe, describing the structure and daily routine of this activity, which follows the rhythm of the Pantanal's waters through the phases of rising water (enchente), flood (cheia), receding water (vazante), and dry season (estiagem). In addition, drovers' accounts were used to draw maps of some of the routes of these journeys, identifying the reference landmarks of the cultural landscape and a range of languages used as orientation strategies. Interpreting the data prompted a discussion of the contradictions and adaptations in the drovers' way of life in the face of economic and social change, recognizing its persistence, singularity, and complexity as knowledge closely integrated with the Pantanal landscapes. The reflections in this research aim to offer a different perspective, one in keeping with the cultural value of the Pantanal drovers.

LIST OF FIGURES [1]

Fig. 1 – Sr. Zé Preto taking the herd across the Cerradinho River. Abobral. Accompanying the second Comitiva.
Fig. 2 – Grandma Olívia, me, and my sister Denise (right)
Fig. 3 – Fazenda Sanharão (maternal grandparents)
Fig. 4 – Grandpa Basílio, my sister Denise, and cousin Telma (right)
Fig. 5 – Refúgio Ecológico Caiman. Miranda-MS. (Source: Refúgio Ecológico Caiman)
Fig. 6 – Working as a guide (back to the camera, explaining the Acuri palm): Cordilheira do X trail.
Fig. 7 – Working as a guide (standing, near the baía), giving information about the canoe trip.
Fig. 8 – Departure of the Comitiva at Fazenda Caiman. First in-person accompaniment of a drovers' Comitiva (to my right is the Condutor, Sr. Ramon Miranda; just behind him is his father, Sr. Alfredo; in the background are the Meeiros, Fiadores, and an escort from Retiro Santa Vóia, Fazenda Caiman).
Fig. 9 – Water cycle and drovers in the Pantanal-MS. (From the left, following the direction of the arrow: 1. Rising water: Bridge over the Miranda River. Second Comitiva. 2. Flood: Crossing the Cerradinho River. Second Comitiva. 3. Receding water: Ponteiro Morcego. First Comitiva. 4. Dry season: Departure of a Comitiva from Fazenda Fátima). Photo montage: Juliana Moreno.
Fig. 10 – Participant observation (first Comitiva). To my left, the drovers Vô Alfredo, Ramon, Morcego, and Zumba
Fig. 11 – To my left, Zumba, and to the right, Morcego, with a berrante (cattle horn). First Comitiva
Fig. 12 – Sapo, my mount. Third Comitiva.
Fig. 13 – Sr. Alfredo Miranda, Ramon's father
Fig. 14 – Anonymous Cook continuing the journey. Fazenda Nossa Senhora do Carmo
Fig. 15 – Sr. Zé Preto working in the flood season. Source: Pousada Xaraés
Fig. 16 – Juarez Rodrigues da Silva.
Fig. 17 – Sebastião Rolon
Fig. 18 – Luis Martins (Biguá)
Fig. 19 – José Aparecido F. da Silva (Barriga). Source: Pousada Xaraés.
Fig. 20 – Chart of collaborators.
Fig. 21 – Comitiva of Fazenda Redenção at the overnight stop of Fazenda Nossa Senhora do Carmo.
Fig. 22 – Radio at a stopping point, in the Comitiva of Fazenda Redenção
Fig. 23 – Juarez. Source: Mari Baldissera
Fig. 24 – Seu Zé Preto drinking tereré
Fig. 25 – Bomba
Fig. 26 – Guampa and bomba tied to the gear (tralha).
Fig. 27 – Sr. Jair (Beto Carreiro), Wilson, and Barba drinking tereré on the march
Fig. 28 – Isopor (nickname). Detail of the hat decorated with aluminum-can pull tabs
Fig. 29 – Sr. Zé Preto working cowhide for use in his own gear (tralha). Source: Pousada Xaraés
Fig. 30 – Ramon. Detail of accessories. Source: Thiago Rocha
Fig. 31 – Anonymous drover. Overnight stop, Fazenda Nossa Senhora do Carmo
Fig. 32 – Ponteiro Luís with the arreiador (whip), “thrashing” the animal (third Comitiva)
Fig. 33 – Ramon Miranda using the reio (whip). Source: Thiago Rocha
Fig. 34 – Departure of the third Comitiva. Cook and pack train passing ahead of the herd.
Fig. 35 – Sr. Geraldo driving a tractor to the departure point of the first Comitiva accompanied. Zumba (drover) on the right
Fig. 36 – Simulation of the drovers' roles in a Comitiva
Fig. 37 – Ponteiro Luís blowing the berrante.
Fig. 38 – Ponteiro Morcego in the Fazenda Caiman Comitiva (2005). Source: Thiago Rocha
Fig. 39 – The Condutor counting the cattle. Third Comitiva
Fig. 40 – Ranch escort and the Cook, Dourado
Fig. 41 – The Cook, Dourado, saddling a pack burro (bruacas underneath, dobros laid on top of them, and a tarp to cover them).
Fig. 42 – Saddled pack burro. Comitiva Caiman. Source: Thiago Rocha
Fig. 43 – Saddled pack mule. Comitiva Caiman. Source: Thiago Rocha
Fig. 44 – Overnight stop at Fazenda Buriti. Third Comitiva.
Fig. 45 – Overnight stop. Hammocks hung. Source: Csaba Gődény
Fig. 46 – Troop “formed up” (in an orderly line)
Fig. 47 – Traces left by drovers at a stopping point (ashes and posts for hammocks)
Fig. 48 – Cook and his kitchen. Source: Csaba Gődény
Fig. 49 – Kitchen setup. Researcher and Ramon Miranda.
Fig. 50 – The Cook, Gilberto, preparing arroz carreteiro. Comitiva Caiman. Source: Thiago Rocha (2005)
Fig. 51 – The Cook, Gilberto, preparing lunch. Comitiva Caiman. Source: Thiago Rocha (2005)
Fig. 52 – Kitchen setup. Pots of food on the trempe over the fire. Other utensils on a small wooden table.
Fig. 53 – Coffee pot and strainer. Pan of boiled water, ladle, and coffee mugs.
Fig. 54 – Water cans hanging from a fig tree (Ficus sp.), ladles, a larger mug for drawing water, and smaller ones for drinking it
Fig. 55 – Dust on the estradão: third Comitiva.
Fig. 56 – Stampede during the crossing of the Abobral River. Comitiva of Nossa Senhora de Fátima.
Fig. 57 – Dawn at the overnight stop of Fazenda Nossa Senhora do Carmo. Unknown Comitiva
Fig. 58 – Fence corner. Fazenda São Bento.
Fig. 59 – Pole gate. Fazenda Nossa Senhora do Carmo.
Fig. 60 – Simbra. Fazenda Nossa Senhora do Carmo.
Fig. 61 – Gate. Fazenda Nossa Senhora do Carmo.
Fig. 62 – Cattle guard (mata-burro). Fazenda Nossa Senhora do Carmo.
Fig. 63 – Trough (cocho). Fazenda Nossa Senhora do Carmo.
Fig. 64 – Bridge over the Abobral River. Second Comitiva. Pousada Xaraés.
Fig. 65 – Comitiva Caiman. Source: Thiago Rocha.
Fig. 66 – Well in the Antena paddock (invernada). Fazenda Nossa Senhora do Carmo. Third Comitiva.
Fig. 67 – Corridor (corredor), Fazenda São Bento. Abobral region
Fig. 68 – Embankment (aterro). Fazenda Nossa Senhora do Carmo
Fig. 69 – Boiadeira Central. Fazenda São Carlos (white arrow indicates the road)
Fig. 70 – Water road (estrada d'água). Fazenda Nossa Senhora do Carmo.
Fig. 71 – Herd trail (batida de boiada). Abobral region
Fig. 72 – Gravel road. Nabileque region.
Fig. 73 – Magro (nickname) in the Fazenda Caiman Comitiva. Source: Thiago Rocha.
Fig. 74 – Asphalt. BR164. Nabileque region
Fig. 75 – Drover's mark on a tree.
Fig. 76 – Drover's writing at an overnight stop.
Fig. 77 – Drover's writing at an overnight stop
Fig. 78 – Ash remains at an overnight stop
Fig. 79 – Litter at overnight stops (montage)
Fig. 80 – Rabo de burro (A. bicornis). Abobral region.
Fig. 81 – Pasture planted with humidícola grass. Abobral region.
Fig. 82 – Carandazal (Copernicia alba)
Fig. 83 – Road with dense brush. First Comitiva. Aquidauana region/
Fig. 84 – Campina. Fazenda Nossa Senhora do Carmo
Fig. 85 – Campina
Fig. 86 – Cordilheira. Fazenda Nossa Senhora do Carmo.
Fig. 87 – Capão. Refúgio Ecológico Caiman
Fig. 88 – Acuri rachis and petiole used as a barbecue skewer
Fig. 89 – Fedegoso (Cassia occidentalis L.)
Fig. 90 – Erva de Santa Luzia (Euphorbia hirta L.)
Fig. 91 – Cânfora (Bacopa monnierioides)
Fig. 92 – Aruá snail
Fig. 93 – Tachã
Fig. 94 – Saracura Três-
Fig. 95 – Bugio.
Fig. 96 – Troop of burros (Equus asinus)
Fig. 97 – Termites
Fig. 98 – Areião. Retiro Santo Onofre. Fazenda Santa Filomena
Fig. 99 – Morro do Azeite. Source: Eric de Vito (2009).
Fig. 100 – Open field. Estrada Parque
Fig. 101 – Bola pé. Herd crossing the Cerradinho River. Second Comitiva. Fazenda Fátima.
Fig. 102 – Vazante Cerradinho. Fazenda Nossa Senhora do Carmo
Fig. 103 – Paraguay River. Porto da Manga. Cattle loading dock.
Fig. 104 – Corixo do Inferno. Fazenda Nossa Senhora do Carmo.
Fig. 105 – Marcos Antonio Vaca (Babuíno). Second Comitiva. Carandazal
Fig. 106 – Sapo's ears. Fazenda Santa Filomena. Second Comitiva.

MAPS

Map 1 – Sub-regions or “pantanais” of the Pantanal: Upper Paraguay Basin in Brazil. Source: Silva; Abdon (1998).
Map 2 – Illustrative map: ranches of the Pantanal-MS and routes of the three Comitivas accompanied. Source: EMBRAPA (modified).
Map 3 – Map narrated by Biguá (2009) of the Comitiva route from Aquidauana to Fazenda Central.

TABLES

Table 1 – Accompaniment of Comitivas
Table 2 – Interviews
Table 3 – Cost simulation for a cattle buyer hiring the services of a Comitiva lasting 11 marches
Table 4 – Cost simulation for the Condutor providing the services of an 11-march Comitiva.
Table 5 – Reference landmarks in the landscape: ranch landscapes
Table 6 – Reference landmarks in reading the landscape: drovers' marks and writings
Table 7 – Reference landmarks in reading the landscape: vegetation
Table 8 – Examples of medicinal plants and ways of using them cited by the drovers.
Table 9 – Reference landmarks in reading the landscape: examples of animals
Table 10 – Reference landmarks in reading the landscape: soils and relief
Table 11 – Reference landmarks in reading the landscape: aquatic landscapes
Table 12 – Differences between the phases of the water cycle (flood and dry season) and their meanings for drovers

CONTENTS

INTRODUCTION
Lifeworld: A tale that I tell
A woman researcher in a male working environment
Structure of the chapters
CHAPTER 1 – THE PATH TRACED IN THE RESEARCH
1.1 Contextualization of the subject of study
1.1.1 The Pantanal
1.1.2 The Pantaneiro and cattle ranching
1.2 Conceptual framework: Interpreting landscape as place in the context of traditional populations
1.2.1 Traditional populations
1.3 Methodological trajectory
1.3.1 The collaborators
1.3.2 Construction of the results
CHAPTER 2 – DROVERS' COMITIVA: WAY OF LIFE
2.1 Travelers of the estradão
2.2 On the trail of the drovers' Comitivas
2.3 Leading the herd
CHAPTER 3 – PANTANAL COMITIVA: READINGS OF THE LANDSCAPES
3.1 On the trail of the estradão – reference landmarks in the landscape
3.2 To the rhythm of the waters
CHAPTER 4 – APPROACHES TOWARD A CONCLUSION
FINAL CONSIDERATIONS
APPENDIX
REFERENCES

INTRODUCTION

Lifeworld: A tale that I tell

As part of the methodological trajectory [2] chosen for this research, it is necessary to discuss the personal reasons that motivated this work, and to set out a little of my life history through memories, imagination, perceptions, and anticipations.

Perhaps the inspiration for this research began in childhood, living with my mother's family on a farm in the Vale do Ribeira region, in the Atlantic Forest, in the municipality of Barra do Turvo, São Paulo (Figs. 2, 3, and 4). My grandparents were rural producers; my grandfather, though illiterate, traded and traveled transporting cattle and driving pigs on foot. Recounted along the way during the research, these experiences drew hearty laughter from some drovers, because in the Pantanal they are used only to driving cattle on horseback. Herding pigs on foot sounds very odd! Those were formative years of my life, of which I keep memories and still hear stories told and retold in the family, stories that to this day sharpen my curiosity about the ways of living, feeling, and working in ranching and farming.

Fig. 2 – Grandma Olívia, me, and my sister Denise (right).
Fig. 3 – Fazenda Sanharão (maternal grandparents).
Fig. 4 – Grandpa Basílio, my sister Denise, and cousin Telma (right).

From these experiences, I believe, came my interest in the rural way of life and in researching the lives of country people. It has been somewhat difficult, however, to reconcile emotion and reason, or heart and scientific rigor. Pursuing the master's degree was almost visceral for me, and although this path has so many formal rules, I still believe one need not lose the passion. In any case, I understand that writing about other ways of life and other worldviews, different from my own experience, carries great responsibility; scientific care therefore provided a necessary security as this academic path took shape.

This study continues a research experience I carried out for my undergraduate thesis in Ecology at the Universidade Estadual Paulista (UNESP, Rio Claro) in 2002 [3]. At that time I sought to understand the relationship between people and environment through the lived space of residents bordering protected natural areas in the Vale do Ribeira region, in the same municipality where my maternal grandparents lived. My interest was in understanding how populations closely dependent on the rhythms of nature lived, what knowledge emerged from that relationship, and how it has endured in the face of present-day reality.

After this experience with academic research, I worked briefly in São Paulo, until an opportunity arose to work as an ecotourism guide at a lodge in the Pantanal (Refúgio Ecológico Caiman, Figs. 5, 6, and 7). The interview took place in São Paulo, and I think I spent the whole time gazing and musing, somewhat enchanted, at a framed photograph of the lodge on the shore of an immense baía. I went to meet the landscape in the picture… That is how I came to fall in love with the Pantanal and, little by little, to draw close to the rhythm of the region, its seasons, and Pantaneiro culture.

It was from this experience that the chance arose, in 2005, to accompany a drovers' Comitiva (Fig. 8) whose purpose was to move about 500 cows from Fazenda Estância Caiman to another ranch belonging to the same owner [4].

Fig. 5 – Refúgio Ecológico Caiman. Miranda-MS. (Source: Refúgio Ecológico Caiman).
Fig. 6 – Working as a guide (back to the camera, explaining the Acuri palm): Cordilheira do X trail.
Fig. 7 – Working as a guide (standing, near the baía), giving information about the canoe trip.

I accompanied this journey for four days, and when I returned I ended up writing a little about my experience [5], more as a first reflection that I wanted to share.

Fig. 8 – Departure of the Comitiva at Fazenda Caiman. First in-person accompaniment of a drovers' Comitiva (to my right is the Condutor, Sr. Ramon Miranda; just behind him is his father, Sr. Alfredo; in the background are the Meeiros, Fiadores, and an escort from Retiro Santa Vóia, Fazenda Caiman).

At that time I had no conceptual intentions of academic research. Shortly afterward, however, while I was talking with some friends about my enthusiasm for the work of the Comitivas, they brought me a cover story from the magazine Terra. The headline read: “Pantaneiro, um ser em extinção” (“The Pantaneiro, a being facing extinction”) (FRUET, 2004). The man on the cover was the father of the person showing it to me. What caught my attention was that, around the same time, in another magazine, I read a comment by a researcher from the Agribusiness Studies Group at UFMS (Universidade Federal de Mato Grosso do Sul) stating that “No data are available, but the drovers' Comitivas are dwindling and, in the future, will cease to exist” (BRUM, 1998).

From then on came more and more inquiries, which repeatedly confirmed the lack of data on drovers, especially in scientific publications. Although there is research on the ways of life of Pantanal ranch hands (peões), which resemble the drovers' way of life, ranch hands do different work and have different customs [6].

Because drovers usually work informally (without an employment contract or formal registration) and the Comitivas are itinerant, it is difficult to obtain statistical data on their occurrence; moreover, they are not usually the focus of the issues debated. They appear caught up in an economic context centered on the discussion of the development of cattle ranching.
A historiographical study that analyzed drovers' Comitivas in the Pantanal stated that, although drovers played, and still play, a prominent role in the introduction and expansion of ranching, their presence in history is poorly treated, and the information is sparse and of little weight. The author argues, rhetorically, that although the subject recurs in poetry and music, most of the bibliography approaches it indirectly: it is common to find herds, not drovers (LEITE, 2003).

These findings stand out because they reveal how scarce the available data are; the topic is also pressing because changes are under way that may lead to the loss of the knowledge of this culturally distinct segment of Brazil's traditional populations. The subject is believed to have significant value as a form of management [7] practiced through traditional knowledge applied for hundreds of years, one that in the Pantanal, because of its flooding regime, is often the only way to move cattle from one region to another.

On the importance of research on traditional populations, and the reasons we should be attentive to this knowledge, we can cite Marques (1999, p. 141), who concludes from his studies of traditional populations:

[…] the focus of my concerns, at this moment, lies in the fact that this knowledge (call it native, traditional, indigenous, or whatever we like!) exists, resists, and is threatened. This knowledge, besides being extremely useful, proves compatible with our ecology, and where it is not compatible, it is often merely a matter of incommensurability. Well then, this knowledge may disappear. (…). It is, in reality, a set of knowledge systems highly threatened with extinction, and that is what worries me most.

In March 2007, mainly, I believe, because of the focus of this research, I won a scholarship for a one-month course at a college in England, Schumacher College [8], on the theme “Indigenous peoples & the natural world: Is ancient wisdom important to the modern world?”. Participants came from many countries: India, Norway, Australia, the USA, Germany, Belgium, and the Philippines, among others. The very existence of this course, and the representation of so many countries, points to the relevance of the discussion.

One of the speakers, Jerry Mander, founder of the World Social Forum, argued that although globalization exerts strong pressure toward the homogenization of knowledge, so that indigenous/traditional knowledge [9] comes to signify a backward outlook from the standpoint of capitalism and even an obstacle to “progress”, diversity is the key to the vitality, resilience, and innovative capacity of any living system. This also holds for human societies (verbal communication) [10]. Further, according to Cavanagh and Mander (2004, p. 89):

The rich variety of human experience and potential is reflected in cultural diversity (author's emphasis), which provides a sort of cultural gene pool to spur innovation toward ever higher levels of social, intellectual, and spiritual accomplishment and creates a sense of identity, community, and meaning. [11]

In this case, Pantaneiro culture, and the drovers' Comitivas in particular, represent an activity that transports an exotic species, cattle, within particular landscapes [12]. They are exposed to influences from the outside world: changes in their surroundings that may alter their values and attitudes, and at the same time changes that may come from people themselves, from what they create, for it is a coming and going that makes the subject's existence, being in the world and with the world.

These relationships between people and environment are often contradictory and express practices that can either contribute to conservation or degrade the environment in which we live. It is acknowledged, then, that ranching generates environmental impacts, as does the movement of these herds, even in the Pantanal, where there are extensive areas of native pasture. This research does not intend to go deeper into that topic, however, but to set out some of the complexity of the drovers' knowledge, which arises from living with the Pantanal landscapes.

Given the differing views of humankind, the study sought to enter into this phenomenon and to perceive a traditional form of management as a practice directly connected to the Pantanal's water cycle. It sought to describe the drovers' way of life and the structure of this activity, tied to a way of reading the Pantanal's different landscapes, taking into account the temporality of events and the dynamics of society.

The acceptance of this project by the Graduate Program in Environmental Science (PROCAM) helped me precisely with the interdisciplinary view of research that understanding this cultural type, the Pantanal drover, could call for. Given my training in ecology and growing interest in the human sciences, the dialogue between these fields suited the subject.

This work was to have been grounded in the in-person accompaniment of Comitivas, but in the second half of 2007 I suffered a serious riding accident and had to interrupt my studies for a year and a half. At the beginning of 2009 I renewed my enrollment, but because of my health it was unfortunately not possible to accompany further Comitivas, which led to some changes in the initial objectives of the research.

A woman researcher in a male working environment

When the PROCAM committee suggested that I write about the challenge facing a woman researcher in a typically male research setting, I felt a little embarrassed, even though I knew the question was relevant. Perhaps because of the respect with which the drovers always treated me, or perhaps because of my own underlying curiosity and the flow of the work, I had not stopped to think about it. Yet this question came up repeatedly when I presented the research in different academic settings; after all, in research using qualitative and dialogical methods the question may be well founded, since intersubjectivity is considered essential.

The core of the question was pertinent, especially regarding the practicalities of accompanying the Comitivas and the interaction and tension between researcher and researched during my time living with and interviewing the drovers. What would it be like for them to recount what they live or feel to a woman, and what would it be like if it were to a man?

I believe that because of this condition I lost some stories and accounts, but I know I also gained others. The respect I had for them was always returned, and if at first they were more reserved, over the course of the Comitiva or the interview they grew increasingly at ease with me and with my commitment to valuing the knowledge they shared, speaking more about their families and the hardships in their lives.

Always very careful, they gave me the gentlest burro in the troop to ride, and although they customarily rotate their burros to rest them, at no point did they want to change my mount. Although I was used to saddling horses, on the journeys I only helped, because they wanted to saddle the animals themselves to be sure everything was secure. On the first Comitiva this care went so far that, worried I would be sore from so long in the saddle and wanting to make my saddle more comfortable, instead of placing just one pelego [13] on it (as is customary) they insisted on two, and unfortunately the effect was the opposite. So at the lunch stop I politely asked them to remove one of the pelegos. Although I was not used to riding all day, I did ride often, so I ended up tired but with no physical complaint.

Because I wanted to learn a little about each role in the Comitiva, I tried not to focus my attention on any one person, unless it was someone older and more experienced, usually the leader of the group. Only on the first Comitiva was I not the only woman traveling: a friend, Elizabeth Leite (Bete), who also worked at the Caiman lodge, wanted to come with us, and so we were able to share some situations.

I ended up taking part in few Comitivas, for reasons beyond my control. Perhaps many of these moments unfolded fairly naturally because my interest in this research grew out of my maternal grandfather's experience and because I was already somewhat familiar with the culture of Pantanal ranch hands. Perhaps for the same reason, in terms of class relations I did not feel any distance or differentiation on account of being a researcher. On the first Comitiva I was not in fact in that role, but even on the others what I observed was a cultural differentiation, because I was from another state or “from the city”, and at times I noticed that they tried to explain themselves more fully so that I could understand them.

It is worth noting, however, that my relationship with the drovers was marked above all by gender. The work they do is performed predominantly by men [14], and, perhaps because they were unaccustomed to a female presence in this environment, there was at all times an excess of care and a view of women as fragile. They were accordingly also surprised that I managed to keep up with them.

For those who have no picture of drovers' lives, some matters are harder to grasp, and so I would like to share a little of this asymmetrical and heterogeneous relationship between researcher and researched.

Sleeping in a Comitiva posed no problem or awkwardness, since everyone sleeps together in individual hammocks. For physiological needs, which had to be met in the open, I simply waited for the Comitiva to move on, stayed behind, looked for a thicket, and took good care that my burro did not run off! Bathing was perhaps the most delicate moment. I had come prepared, bringing a modest bathing suit, to bathe with them in a pond, a river, or wherever water was available. But I realized they did not want me to go with them; they always asked me to go first, saying it would be better that way. Often, too, when we were nearing the overnight stop and happened to be close to a ranch headquarters, they would ask the praieiro [15] whether a bathroom was available for bathing, and before anyone had even spoken to me, everything was already arranged.

I tried to accept what they suggested, since they would be more at ease and I would not inconvenience them. And so, with care, respect, and tact, these matters were resolved. The chapters that follow say a little more about the profile of these men.

Structure of the chapters

To organize this research, it was divided into chapters. The first chapter presents a brief contextualization of the Pantanal and the formation of the Pantaneiro through a review of the literature on the study region. To familiarize the reader with the subject, it introduces these landscapes in relation to the water cycle, which directly influences how the Comitivas' routes are defined. It then briefly portrays the process of settlement and the consolidation of cattle ranching in the Pantanal.

Also in this first chapter, the conceptual framework and the path traced in this study are set out. The conceptual framework was built on an approach to the cultural interpretation of landscape as place in the context of traditional populations. The methodological trajectory began with questions [16] directed at the subjects who live the phenomenon [17], that is, the drovers of the Pantanal of Mato Grosso do Sul. Subsequently, through interviews, life histories, and in-person accompaniment of Comitivas, these data were constructed, analyzed, and organized by theme (chapters II, III, and IV), providing the elements with which to sketch the drover's cultural universe within the scope the research set for itself, namely their way of life and their readings of the Pantanal landscapes.

The second chapter, Drovers' Comitiva: way of life, is divided into three sub-themes. The first, Travelers of the estradão, describes what it is to be a drover. The second, On the trail of the drovers' Comitivas, deals with how these Comitivas take place, and the third, Leading the herd, looks at the division of roles within the Comitivas.

The third chapter, Pantanal Comitiva: readings of the landscapes, describes the reading of the landscape. Under the theme On the trail of the estradão: reference landmarks in the landscape, it addresses the meanings attributed to the Pantanal landscapes. Under the theme To the rhythm of the waters, it addresses the meanings given to the seasons in relation to how the Comitivas' routes are defined.
The fourth chapter offers Approaches toward a conclusion, including some reflections on the data gathered and on the importance and appreciation of the drovers' knowledge. Because the matter was identified as recurrent, it also considers what has driven the recent transformations in this human labor, or even its decline, along with their consequences and contradictions. The last chapter presents the final considerations, which point out the contributions and limits of this work and suggest new lines of research on the subject.

All these themes and chapters interpenetrate, but each focuses on a broad area, and together they seek to enter the drovers' world little by little. Fully unveiling that world comes to seem utopian the more one learns of the skills this work demands and of its difficulties, yet it remains possible to grasp elements that demonstrate a relationship of interdependence between people and environment.

CHAPTER 1 – THE PATH TRACED IN THE RESEARCH

In the Pantanal no one can lay down a ruler. Above all when it rains. The ruler is the existing of limit. And the Pantanal has no limits. (…).
The world was renewed, during the night, by the rains. The boy goes out through the paddock with an eye for discovering. It rained so much that there are streets of water. With no signs, no names, no corners. (…).
The cattle's coat is clean. The rancher's soul is clean.
Manoel de Barros (1990: 237).

Fig. 9 – Water cycle and drovers in the Pantanal-MS. (From the left, following the direction of the arrow: 1. Rising water: Bridge over the Miranda River. Second Comitiva. 2. Flood: Crossing the Cerradinho River. Second Comitiva. 3. Receding water: Ponteiro Morcego. First Comitiva. 4. Dry season: Departure of a Comitiva from Fazenda Fátima). Photo montage: Juliana Moreno.

1.1 Contextualizing the subject of study

1.1.1 The Pantanal

It is essential to explain the complex dynamics of the Pantanal's landscapes in order to reveal the boiadeiros' way of life and their reading of the landscape, since the two subjects are treated here as interdependent. As Proença (1997, p. 72) puts it:

In the Pantanal everything depends on the waters. They condition the various kinds of ranch work, force people to move during the great floods, alter the soils, oblige certain birds to migrate to other parts of the planet, push the cattle up onto the cordilheiras, break the monotony of the plain, and leave many ranches islanded.

The Pantanal is the largest floodplain in the world. Its total area is 210,000 km², spanning Brazil, Bolivia, and Paraguay. Of this total, 138,183 km² lie in Brazil, that is, roughly 70%, distributed between the states of Mato Grosso and Mato Grosso do Sul (ALHO; LACHER JUNIOR; GONCALVES, 1988). In the latter state, the present study area, the Pantanal covers 89,318 km², equivalent to 64.64% of the Pantanal's total area in Brazil (ABDON e SILVA, 1998).
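As a quick arithmetic check on the areas quoted above, the shares can be recomputed directly. This is only a minimal sketch: the variable and function names are illustrative and do not come from the source, and the three area figures are taken as given. On those figures the Mato Grosso do Sul share reproduces the 64.64% cited, while the Brazilian share of the whole Pantanal works out somewhat below the rounded "roughly 70%".

```python
# Areas in km², exactly as quoted in the text (names are illustrative).
TOTAL_PANTANAL_KM2 = 210_000        # Brazil + Bolivia + Paraguay
BRAZIL_KM2 = 138_183                # Brazilian portion
MATO_GROSSO_DO_SUL_KM2 = 89_318     # portion in Mato Grosso do Sul


def share(part_km2: float, whole_km2: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part_km2 / whole_km2


brazil_share = share(BRAZIL_KM2, TOTAL_PANTANAL_KM2)
ms_share = share(MATO_GROSSO_DO_SUL_KM2, BRAZIL_KM2)

print(f"Brazil / whole Pantanal: {brazil_share:.1f}%")
print(f"Mato Grosso do Sul / Brazilian Pantanal: {ms_share:.2f}%")
```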

Ab'Saber (1988) discusses the origin of the Pantanal Matogrossense, proposing that what is now a depression was once a vast shield dome, formed up to the Cretaceous, which supplied detrital material to the sedimentary basins of the Bauru Group (Upper Paraná) and Parecis. During the post-Cretaceous uplift, tectonic destabilization caused by structural faulting would have facilitated its planation, so that it came to behave as a hollowed-out anticline. Today the Pantanal Matogrossense is characterized by extensive plains of accumulated fluvial sediment.

The Pantanal plain is part of the Upper Paraguay Basin, which has an area of 496,000 km² and is in turn part of the La Plata Basin. It is subject to a strongly seasonal water regime, with mean precipitation of 1,396 mm, varying between 800 and 1,600 mm. River slopes range from 0.1 to 0.3 m/km, with a topographic gradient of 0.3-0.5 m/km in the east-west direction and 0.03-0.15 m/km in the north-south direction. Elevations on the plain range from 80 to 150 meters (AGÊNCIA NACIONAL DAS ÁGUAS, 2003).

According to the Köppen classification, the region's climate type is Aw, with two distinct periods: a rainy season (October to March), which accounts for about 80% of total annual rainfall, and a dry season (April to September). Mean annual air temperature is 25.5 °C, with mean minimum and maximum temperatures of 20 °C and 32 °C, respectively (SORIANO, 2002).

There is a lag of about four months between the flood peak in the north and in the south of the Pantanal, so that the dry season prevails in the northern portion while water levels reach their peak in the southern portion. Water levels in the north are extremely variable, rising and falling in direct response to rainfall volume. Water levels in the south, by contrast, rise and fall more smoothly over the years, because natural retention of the floodwater dampens the fluctuations caused by intense rains (Heckman [18], 1999, as cited in HARRIS et al., 2005).

The coldest periods and the length of the dry spell differ unpredictably from year to year, placing strong pressure on animal and plant populations. Even so, the hydromorphic soil and the strong annual flood, which extends well into the dry season, soften the effects of these variations, at least for part of those populations (BROWN JUNIOR, 1984). In other words, while some species adapt to constant change and survive the extreme conditions, others set their life cycles according to the seasons.

Map 1 – Sub-regions or “pantanais” of the Pantanal: Upper Paraguay Basin in Brazil. Source: Silva; Abdon (1998).

The vegetation is heterogeneous and influenced by four biomes: Amazon Forest, Cerrado (predominant), Chaco, and Atlantic Forest (Adamoli [19], 1981, as cited in HARRIS et al., 2005). According to Silva et al. [20] (2000, as cited in HARRIS et al., 2005), an aerial survey of the Brazilian Pantanal identified 16 vegetation classes based on phytophysiognomies. Grasslands (campos) were the most representative (31%), followed by cerradão (22%), cerrado (14%), floodable grasslands (7%), semideciduous forest (4%), gallery forest (2.4%), and floating vegetation mats or 'baceiros' (2.4%).

This mosaic of vegetation types is why the region is regarded as the Pantanal Complex and has been declared a Natural World Heritage Site and Biosphere Reserve (ORGANIZAÇÃO DAS NAÇÕES UNIDAS PARA A EDUCAÇÃO, A CIÊNCIA E A CULTURA, 2009). Its importance is also established in the Brazilian Constitution, Article 225, § 4, which recognizes it as National Heritage.

The main reasons the Pantanal merits this international recognition can be listed as follows: it is a complex of ecosystems unique in the world; it is the habitat of diverse animal and plant species, many considered rare and some in the process of extinction; it is protected nationally; it belongs to and influences more than one country; and in many respects it reveals a distinctive sociodiversity arising from the historical process of socio-spatial formation. That formation is popularly known as Pantaneiro culture, expressed in work, cuisine, dress, customs, festivals, and artistic and religious manifestations (WERTHEIN, 2000).

1.1.2 The Pantaneiro and cattle ranching

 

