Write a letter. Each row of the PDF is the real acoustic waveform of that line, read aloud by Kokoro.
The first generation downloads the voice model (~80 MB). This only happens once.