Switchboard Dialog Act Corpus Files — sdac_files • analyzr

A dataset of the corpus files containing the 1,155 conversations among 440 speakers of American English.

sdac_files

Format

A data frame with 223,606 rows and 7 variables:

doc_id: ID for each conversation document
damsl_tag: DAMSL dialog act annotation labels
speaker: Label for each speaker in the conversation
turn_num: Number of contiguous utterance turns for a given speaker
utterance_num: The cumulative number of utterances in the conversation
utterance_text: The actual dialog utterance
speaker_id: Unique speaker identification code
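
As a rough illustration of how these seven variables fit together, the sketch below builds a small stand-in data frame with the same column names and runs a few base R summaries on it. The object name toy, every value in it, and the column types are assumptions made for the example; they are not taken from sdac_files itself.

```r
# Toy stand-in for sdac_files: same seven column names, invented values.
# Column types here are assumptions, not the documented types.
toy <- data.frame(
  doc_id         = c("1", "1", "1", "2"),
  damsl_tag      = c("sd", "b", "sd", "qy"),
  speaker        = c("A", "B", "A", "A"),
  turn_num       = c(1L, 1L, 2L, 1L),
  utterance_num  = c(1L, 2L, 3L, 1L),
  utterance_text = c("I think so.", "Uh-huh.", "It was fine.", "Do you cook?"),
  speaker_id     = c("1001", "1002", "1001", "1003"),
  stringsAsFactors = FALSE
)

# Number of utterances in each conversation document
utterances_per_doc <- table(toy$doc_id)

# Frequency of each dialog act label, most common first
tag_counts <- sort(table(toy$damsl_tag), decreasing = TRUE)

# Number of distinct speakers across all conversations
n_speakers <- length(unique(toy$speaker_id))
```

With the package loaded, the same calls should work on the real data by replacing toy with sdac_files.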

Source

https://catalog.ldc.upenn.edu/docs/LDC97S62/