A dataset of corpus files containing the 1,155 conversations of 440 speakers of American English.

sdac_files_partial

Format

A data frame with 223,606 rows and 2 variables:

doc_id

ID for each conversation document

text

The actual dialog utterance
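
Because each row holds one utterance and doc_id identifies the conversation it belongs to, per-conversation summaries can be computed by grouping on doc_id. The following is a minimal sketch in base R. It uses a small hypothetical stand-in data frame (`sdac_example`, with invented ids and utterances) rather than the real data, and only assumes the two columns described above.

```r
# Hypothetical stand-in with the same two columns as sdac_files_partial.
# The ids and utterances are invented for illustration only.
sdac_example <- data.frame(
  doc_id = c("doc_a", "doc_a", "doc_a", "doc_b", "doc_b"),
  text = c("Okay.", "So what do you think?", "Uh-huh.", "Hello.", "Hi there."),
  stringsAsFactors = FALSE
)

# Count the utterances (rows) in each conversation document.
utterances_per_doc <- table(sdac_example$doc_id)

# Count the distinct conversation documents.
n_docs <- length(unique(sdac_example$doc_id))

# Collect all utterances of a single conversation, in row order.
doc_a_text <- sdac_example$text[sdac_example$doc_id == "doc_a"]
```

The same calls should apply to the full data frame by substituting it for `sdac_example`, assuming it has been loaded into the session.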

Source

https://catalog.ldc.upenn.edu/docs/LDC97S62/