Music 356

HW1: Poets of Sound and Time

Alex Han - Winter 2023


Poem 1: "Shadows of Tomorrow"

Click here to watch Poem 1

This poem is a sort of ode to MF DOOM and Madvillainy, his project with Madlib. It continuously generates a stream of vaguely philosophical statements seeded by words taken from the Madvillain track "Today is the Shadow of Tomorrow."

Each word is spit out rhythmically according to an underlying metrical grid, accompanied by a conga drum hit at varying intensities and pitches. As the poem proceeds, new layers of sound creep in: a sitar with an insistent drone, a classic '90s boom-bap style beat, and a haunting synth flute melody that appears in the original Madvillain beat.
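To give a rough idea of what "an underlying metrical grid" can mean in code, here is a minimal Python sketch. It is not the poem's actual source: the function name schedule_words, the tempo, the step pattern, and the velocity and pitch ranges are all assumptions made up for illustration. Each word lands on the next active step of a repeating pattern and gets a random drum intensity and pitch.

```python
import random

def schedule_words(words, bpm=90.0, steps_per_beat=4,
                   pattern=(1, 0, 1, 1, 0, 1, 0, 1), seed=0):
    """Place each word on the next active step of a repeating grid pattern.

    Returns a list of (time_in_seconds, word, velocity, pitch_ratio) tuples.
    All default values are hypothetical, chosen only for this example.
    """
    if not any(pattern):
        raise ValueError("pattern needs at least one active step")
    rng = random.Random(seed)
    step_dur = 60.0 / bpm / steps_per_beat  # duration of one grid step
    events = []
    step = 0
    for word in words:
        # Skip rests until we reach an active step in the pattern.
        while not pattern[step % len(pattern)]:
            step += 1
        velocity = rng.uniform(0.3, 1.0)             # how hard the drum is hit
        pitch_ratio = rng.choice((0.8, 1.0, 1.25))   # playback-rate style pitch shift
        events.append((step * step_dur, word, velocity, pitch_ratio))
        step += 1
    return events

if __name__ == "__main__":
    for event in schedule_words(["today", "is", "the", "shadow"]):
        print(event)
```

Because the words only ever land on grid steps, the text keeps a steady pulse even though the intensity and pitch of each hit vary.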

I wanted to emulate the repetitive, cryptic, and philosophical language of the chants, mantras, and religious texts found throughout the world's cultures and religions. The text goes by very quickly, and the reader is not meant to stop and sit with any individual statement. Unlike most human-written poetry, and paradoxically given the religious-chant vibe I just referred to, this piece discourages slow contemplation. Rather, it invites the viewer to grasp at small chunks or phrases as they pass by, drawing meaning from the aggregate rather than from any one line. If you stop to look closely at a single stanza and analyze it, you may find some truly profound statements, but also a lot of nonsensical ones. This is an inherent feature of the medium: the logic behind the poetry generation is not that complex. It only finds words that approximate the meanings of individual keywords and imposes an arbitrary syntactic structure on them. However, if you appreciate this poem as greater than the sum of its individual sentences, it may surprise you with its wisdom.
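The generation logic described above (look up words close in meaning to a keyword, then drop them into a fixed sentence shape) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the three-dimensional vectors in TOY_VECTORS are invented, the two TEMPLATES are made up, and the helper names are hypothetical. The real piece draws on a trained word-embedding model rather than a hand-written table.

```python
import math
import random

# Toy 3-dimensional "embeddings", invented purely for this example.
TOY_VECTORS = {
    "today": (0.9, 0.1, 0.2),
    "tomorrow": (0.8, 0.2, 0.3),
    "yesterday": (0.85, 0.05, 0.25),
    "shadow": (0.1, 0.9, 0.3),
    "darkness": (0.15, 0.85, 0.4),
    "light": (0.2, 0.7, 0.9),
    "dream": (0.3, 0.4, 0.9),
}

# Hypothetical sentence shapes; the syntax is arbitrary by design.
TEMPLATES = ("{a} is the {b} of {c}", "the {b} of {a} is {c}")

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def nearest(word, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    target = TOY_VECTORS[word]
    others = [w for w in TOY_VECTORS if w != word]
    others.sort(key=lambda w: cosine(target, TOY_VECTORS[w]), reverse=True)
    return others[:k]

def make_line(seed_a, seed_b, rng):
    """Fill a random template with neighbors of the two seed words."""
    a = rng.choice(nearest(seed_a))
    b = rng.choice(nearest(seed_b))
    c = rng.choice(nearest(a))
    return rng.choice(TEMPLATES).format(a=a, b=b, c=c)

if __name__ == "__main__":
    rng = random.Random(356)
    for _ in range(4):
        print(make_line("today", "shadow", rng))
```

Nothing here checks whether a line makes sense, which is why the output mixes statements that sound profound with ones that are plainly nonsense.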


Poem 2: "A Stream of Echoes"

Click here to watch Poem 2

This is an interactive, real-time poem that requires the viewer to make some noise. It receives and analyzes mic input to spew out words drawn from the vector space of the language model. It takes values from the FFT, spectral centroid, and RMS of the incoming audio and applies a few operations, which I tuned by hand, so that they map nicely onto the dimensions of the model's vector space. The output is pretty incoherent most of the time, mostly because the interpreted audio data tends to sit at one extreme or the other of each vector dimension. With more processing, the audio could provide more varied input. I spent considerably more time on my first poem than on this one, so this is more of a concept than a fully fleshed-out piece.
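As a hedged illustration of the audio-to-vector mapping, the Python sketch below computes RMS and spectral centroid for one frame of samples and rescales each into a coordinate between -1 and 1. It is not the piece's actual code: the function names, the 8000 Hz sample rate, and the input ranges passed to scale are assumptions chosen for the example, and a naive DFT stands in for a real FFT so the snippet needs only the standard library.

```python
import cmath
import math

def rms(frame):
    """Root-mean-square level of a frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (fine for short frames only)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags

def spectral_centroid(mags, sample_rate, n):
    """Magnitude-weighted mean frequency in Hz (0.0 for a silent frame)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

def scale(value, lo, hi):
    """Map [lo, hi] linearly onto [-1, 1], clamping anything outside."""
    t = (value - lo) / (hi - lo)
    return max(-1.0, min(1.0, 2.0 * t - 1.0))

def features_to_vector(frame, sample_rate=8000):
    """Turn one audio frame into a 2-D coordinate (loudness, brightness)."""
    mags = magnitude_spectrum(frame)
    loudness = rms(frame)
    brightness = spectral_centroid(mags, sample_rate, len(frame))
    # Hypothetical hand-tuned input ranges; real values depend on the mic.
    return (scale(loudness, 0.0, 0.5), scale(brightness, 100.0, 3000.0))

if __name__ == "__main__":
    tone = [0.5 * math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
    print(features_to_vector(tone))
    print(features_to_vector([0.0] * 64))  # silence pins both dimensions
```

The clamping in scale shows one way the input can get stuck at the extremes: any frame quieter or louder than the assumed range collapses to exactly -1 or 1, so many different sounds end up at the same point in the vector space.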


Click here to download source code