Corpus example

The C-ORAL-BRASIL corpus is a multimedia resource distributed in three file formats:

  1. wav files (audio);

  2. rtf files (text);

  3. xml files (text-audio alignment).

To view an example from the corpus, click here and download a zipped sample.

Each transcription has a header containing information about the recording situation and sociolinguistic metadata about the participants. Click here to view a header example.
To save a header (txt format), right-click the link and choose Save link as or Save file as..., depending on your browser.


Extract the files into a folder on your computer.
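
The extraction step can also be scripted. The following is a minimal Python sketch, not part of the corpus distribution: the function name extract_sample and the file and folder names in the usage comment are hypothetical, and it assumes only that the sample is a standard zip archive containing wav, rtf and xml files as listed above.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def extract_sample(zip_path, dest_dir):
    """Extract a zip archive into dest_dir and group the files by extension.

    Returns a dict mapping a lowercase extension (e.g. ".wav") to the list
    of extracted file paths that carry it.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Unpack every member of the archive into the destination folder.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    # Walk the destination folder and bucket the files by extension.
    groups = defaultdict(list)
    for path in sorted(dest.rglob("*")):
        if path.is_file():
            groups[path.suffix.lower()].append(path)
    return dict(groups)


# Hypothetical usage (file and folder names are assumptions):
# groups = extract_sample("sample.zip", "c-oral-brasil-sample")
# for ext in (".wav", ".rtf", ".xml"):
#     print(ext, [p.name for p in groups.get(ext, [])])
```

Grouping by extension makes it easy to check that each recording comes with its audio, transcription and alignment file before opening anything in WinPitch.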

To open the text-audio alignment file, you need the WinPitch Pro software installed on your computer.

Run WinPitch and open the xml file from the Alignment file... menu. The audio file will be loaded automatically.