A dataset containing 1,150 conversations from 440 speakers of American
English. More information on the metadata in this dataset can be found at:
https://catalog.ldc.upenn.edu/docs/LDC97S62/swb1_manual.txt
sdac
A data frame with 223,606 rows and 20 variables:
ID for each conversation document
DAMSL dialog act annotation labels
Label for each speaker in the conversation
Number of contiguous utterance turns for a given speaker
The cumulative number of utterances in the conversation
The actual dialog utterance
Unique speaker identification code
Sex of the speaker
Year that the speaker was born
Region of the US where the speaker spent the first 10 years of life
Highest educational level attained
...
Form of payment for participation
Payment amount for participation
Miscellaneous comments
...
...