Academic journal article Psychomusicology

Perceptual Evaluation of Musicological Cues for Automatic Song Segmentation

Article excerpt

The present study evaluated how well nine rules, each relying on one particular musical cue, predict perceptual boundaries. The rules were taken from two models: the local boundary detection model (LBDM) by Cambouropoulos (2001) and the generative theory of tonal music (GTTM) by Lerdahl and Jackendoff (1983), in the form quantified by Frankland and Cohen (2004). Furthermore, we added the cue timbre change. The predicted boundary profiles from each rule were correlated with perceptual boundary profiles of six songs obtained in a previous study (Bruderer, McKinney, & Kohlrausch, 2009). The individual rule with the highest correlation with the perceptual boundaries was the LBDM-onset. The optimal combination of three rules was LBDM-onset, GTTM-rest, and timbre change, yielding correlations of 0.80 to 0.89 between perceptual and model boundary profiles. Analysis of the perceptual cues given for salient boundaries not predicted by the model suggests that incorporating tempo change and harmonic progression could improve the model predictions. The optimal rule combination for segmentation profiles of polyphonic versions of the same songs, as obtained in Bruderer, McKinney, and Kohlrausch (2010), was the combination of the LBDM-onset, timbre change, and the start of a rest.
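The evaluation described above correlates a model's predicted boundary profile with a perceptual boundary profile over the course of a song. A minimal sketch of that comparison is given below; the one-second binning, the Gaussian smoothing width, and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def boundary_profile(boundary_times, song_duration, bin_s=1.0, smooth_sigma=2.0):
    """Turn a list of boundary times (seconds) into a smoothed per-bin profile.

    Smoothing lets boundaries count as matching even when the predicted and
    perceived times differ by a few seconds (an assumed tolerance, not the
    study's exact procedure).
    """
    n_bins = int(np.ceil(song_duration / bin_s))
    profile = np.zeros(n_bins)
    for t in boundary_times:
        profile[min(int(t / bin_s), n_bins - 1)] += 1.0
    # Normalized Gaussian kernel for the smoothing
    idx = np.arange(n_bins)
    kernel = np.exp(-0.5 * ((idx - n_bins // 2) / smooth_sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(profile, kernel, mode="same")

def profile_correlation(predicted, perceptual):
    """Pearson correlation between two boundary profiles of equal length."""
    return float(np.corrcoef(predicted, perceptual)[0, 1])
```

With this setup, a rule that places boundaries close to where listeners indicated them produces a correlation near 1, while misplaced boundaries lower it, which is how rule combinations can be ranked against the perceptual data.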

Keywords: segmentation, cues, popular music

Listening to music comprises not only the aesthetic experience but also the interpretation of the musical form. Based on the structure of this form, the listener is able to segment a musical piece into smaller parts. The segmentation process has been studied perceptually (e.g., Deliege, 1987; Clarke & Krumhansl, 1990; Krumhansl, 1996; Lartillot & Ayari, 2008; Nooijer, Wiering, Volk, & Tabachneck-Schijf, 2008; Pearce, Mullensiefen, & Wiggins, 2008; Lartillot & Ayari, 2009) as well as musicologically (Lerdahl & Jackendoff, 1983), and different types of cues have been identified that convey the structure of music, including change in note duration, breaks, and change in pitch. Some of these cues have been incorporated into musicological models for music segmentation (e.g., Tenney & Polansky, 1980; Cambouropoulos, 2001; Temperley, 2001; Bod, 2002; Ferrand, 2004; Frankland & Cohen, 2004). However, it is not yet clear how well these models are able to predict boundaries in pieces with a duration of several minutes or in pieces taken from musical genres other than classical music.

Several studies tested automatic segmentation algorithms for their correlation with perceptually annotated data. For example, Cambouropoulos (2001) tested his model on a set of 52 melodies, for which musicians had manually marked preferred punctuation positions on a musical score. Thom, Spevak, and Höthker (2002) tested the performance of two algorithms on the Essen folk song collection (Schaffrath, 1995), which contains segmentation boundaries added by musicologists. They also asked 19 musicians to indicate "salient melodic chunks" in 10 musical excerpts by placing a mark in the musical score above the first note of each chunk. The indicated boundaries thus represent where experts think a boundary should occur rather than where boundaries are actually perceived.

The goal of the present study was to compare the boundaries predicted by cues taken from musicological models with perceptual boundary profiles obtained in our previous perceptual studies (Bruderer, McKinney, & Kohlrausch, 2009, 2010). In these studies, we conducted several perceptual experiments to better understand how participants segment popular music pieces. In a first experiment, about 20 participants were asked to indicate segment boundaries while listening to the playback of each song. From the boundary indications of all participants, about 10 boundaries were selected for each song (see Bruderer et al. …
