Difference between revisions of "356-winter-2023/etude1"

From CCRMA Wiki

Revision as of 18:24, 12 January 2023

Programming Etude #1: "Poets of Sound and Time"

Music and AI (Music356/CS470) | Winter 2023 | by Ge Wang

Space-kitten3.gif

In this programming etude, you are to write two chuck programs to help you create some experimental poetry involving text, sound, and time.

Tools to Play With

  • first, download the latest official ChucK release -- this will install both the command line chuck and the graphical IDE miniAudicle (on macOS and Windows).
  • next, get the latest bleeding edge secret chuck build for this course -- Some important things to note:
    • this is the command line chuck, which you will need for this etude (you can use miniAudicle as a text editor only, as it does not have the new functionalities we need); you can either put the bleeding-edge chuck in your homework folder or (if you are feeling brave) you can overwrite your installed command line chuck. Depending on where you put this bleeding-edge build, you will need to explicitly run it (for example, > ./chuck program.ck will explicitly run chuck from the current directory, rather than the system-installed chuck).
    • NOTE: using command line chuck makes it possible to get console input, which is not available from the miniAudicle IDE.
    • NOTE: Windows users can either use the default cmd command prompt, or might consider downloading a terminal emulator.

The Word Embedding Model

  • lastly, you'll need to download three sets of pre-trained word vectors:
    • glove-wiki-gigaword-50.txt -- pre-trained word vectors from Stanford GloVe (400,000 words x 50 dimensions)
    • glove-wiki-gigaword-50-pca-3.txt -- dimensionally reduced model using PCA (400,000 words x 3 dimensions)
    • glove-wiki-gigaword-50-tsne-2.txt -- dimensionally reduced model using t-SNE (400,000 words x 2 dimensions)
  • each of these has its tradeoffs:
    • the 50D model preserves the word similarity of the training data most faithfully (i.e., it tends to give better search results) -- but searching is more computationally intensive (and may introduce glitches in real-time audio when querying for similarity)
    • the 3D PCA version is a drastically reduced (from 50 to 3 dimensions) version of the above, which makes similarity retrieval much faster (the lower dimensionality also means we can use space partitioning like KD-trees to significantly speed up retrieval); furthermore, having three dimensions is convenient for mapping these values to audio parameters (frequency, volume, rate, timbre, etc.) -- however, the quality of the similarity retrieval is noticeably weaker, as more seemingly unrelated words will pop up as nearest neighbors
    • the 2D t-SNE version, despite having one fewer dimension than the 3D PCA version, may actually perform as well or better in terms of similarity (due to the non-linear optimization of t-SNE, compared with PCA's linear mapping of higher dimensions to lower dimensions); like the 3D version, the 2D version is friendly for mapping to audio parameters and for visualization
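
To make the cost tradeoff concrete, here is a minimal sketch of brute-force similarity retrieval, written in Python purely for illustration (the etude itself is done in ChucK). It assumes each line of a model file has the form `word v1 v2 ... vN` separated by spaces, and it ranks neighbors by Euclidean distance; both are assumptions, and ChucK's Word2Vec object may measure similarity differently. The function names and the toy vocabulary are hypothetical.

```python
import math


def parse_vectors(lines):
    """Parse lines of the (assumed) form 'word v1 v2 ... vN' into a dict."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def nearest(vectors, query, k=3):
    """Return the k words closest to `query` by Euclidean distance.

    Brute force: one distance per vocabulary word, and each distance
    costs time proportional to the number of dimensions.
    """
    target = vectors[query]
    scored = []
    for word, vec in vectors.items():
        if word == query:
            continue  # do not return the query word itself
        scored.append((math.dist(target, vec), word))
    scored.sort()
    return [word for _, word in scored[:k]]


# Tiny made-up 3D vocabulary (illustrative values, not real GloVe data).
toy_lines = [
    "cat 0.9 0.1 0.0",
    "kitten 0.8 0.2 0.1",
    "dog 0.7 0.3 0.0",
    "piano 0.0 0.9 0.8",
    "violin 0.1 0.8 0.9",
]
vectors = parse_vectors(toy_lines)
print(nearest(vectors, "cat", k=2))
```

Each query here computes one distance per vocabulary word, and each distance touches every dimension, so over the same 400,000 words a 50D query does roughly 50/3 times the arithmetic of a 3D one. A KD-tree can skip most of the vocabulary per query, which tends to pay off mainly at low dimensionality.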

Things to Think With

Express Yourself!

Using the ChucK/ChAI starter code for Word2Vec...

  • write code to help you create some experimental poetry involving text, sound, and time.
    • text: use the Word2Vec object in ChucK and one of the datasets to help you generate some poetry
    • sound: use sound synthesis and map the words (e.g., using their vectors to control parameters such as pitch, volume, timbre, etc.) to sound
    • time: don't forget about time! make words appear when you want them to; synchronize words with sound; visually and sonically "rap" the words in time!
  • create two poetic programs / works / performances:
    • make them as different as possible
    • for example, one poem can be fully generated (you only need to run the chuck code, and the poem starts) and the other one interactive (incorporating input from the user, either through a text prompt or another means of input such as mouse / keypresses)
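
As one possible reading of the "sound" item above, the sketch below maps a 3D word vector to three synthesis parameters. It is written in Python for illustration only (your etude code will be in ChucK), and the parameter ranges, the choice of which dimension drives which parameter, and all function names are hypothetical.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi], clamping at the ends."""
    if hi == lo:
        return out_lo  # degenerate input range: avoid dividing by zero
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)


def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def vector_to_params(vec, mins, maxs):
    """Map each of three vector dimensions to one audio parameter.

    mins/maxs are the per-dimension extremes of the model, so that every
    word lands inside the chosen parameter ranges.
    """
    pitch = scale(vec[0], mins[0], maxs[0], 48, 84)    # MIDI notes 48..84
    gain = scale(vec[1], mins[1], maxs[1], 0.1, 0.8)   # quiet..loud, never silent
    rate = scale(vec[2], mins[2], maxs[2], 0.5, 2.0)   # words per second
    return {"freq_hz": midi_to_hz(round(pitch)), "gain": gain, "rate": rate}


# Made-up vector and bounds, only to show the mapping.
params = vector_to_params([0.0, 1.0, -1.0], mins=[-1, -1, -1], maxs=[1, 1, 1])
print(params)
```

Rounding the pitch to a whole MIDI note keeps every word on a semitone grid; dropping the `round` gives a continuous glide instead, which may suit some poems better.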

Some Prompts and Bad Ideas

  • a poem can be about anything; hint: try starting with how you feel about a particular thing, person, or event
  • start with an existing poem and use word2vec to morph it over time
  • an experimental love poem
  • stream of consciousness
  • remember "Jabberwocky" by Lewis Carroll? maybe your poem doesn't need to make sense to everyone
  • HINT: try to take advantage of the medium: in addition to printing out text to a terminal (both a limitation and a creative constraint/opportunity) you have control over sound and time at your disposal
  • HINT: experiment with the medium to embody your message -- for example, a poem about chaos where the words gradually become disjointed and nonsensical
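
The "morph it over time" prompt could be approached in many ways. One hedged sketch, in Python for illustration only: walk in a straight line from one word's vector to another's and print the nearest word at each step. The toy 2D vectors, the function names, and the use of Euclidean distance are all assumptions made for this example.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between vectors a and b, with t in [0, 1]."""
    return [x + t * (y - x) for x, y in zip(a, b)]


def nearest_word(vectors, point):
    """Return the vocabulary word whose vector is closest to `point`."""
    return min(vectors, key=lambda word: math.dist(vectors[word], point))


def morph(vectors, start, end, steps):
    """Walk from start to end in vector space, collecting each new nearest word."""
    path = []
    for i in range(steps + 1):
        point = lerp(vectors[start], vectors[end], i / steps)
        word = nearest_word(vectors, point)
        if not path or path[-1] != word:
            path.append(word)  # record a word only when it changes
    return path


# Made-up 2D vectors (not real model data).
vectors = {
    "calm": [0.0, 0.0],
    "breeze": [0.3, 0.1],
    "wind": [0.6, 0.2],
    "storm": [1.0, 0.3],
    "piano": [0.2, 0.9],
}
print(morph(vectors, "calm", "storm", steps=10))
```

With a real model the intermediate words are far less predictable than in this toy vocabulary, which may be part of the poetic interest. Applying the same walk to each word of an existing poem, one step per line or per beat, is one way to let the poem drift.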

Starter Code and Examples

  • here is some starter code:
  • and some examples
  • feel free to incorporate these or use them as starting points

Reflections

  • write ~250 words of reflection on your etude. It can be about your process, or the product, or the medium, or anything else. For example, how did your poems make you feel? Did you show them to a friend? What was their reaction? What makes something "poetry" (versus, say, "prose")?

Deliverables

  • create a CCRMA webpage for this etude
  • your Etude #1 webpage is to include
    • a title and short description of the exercise (feel free to link to this wiki page)
    • your poems in some form (this will depend on what you chose to do; since sound and time are involved, you could include a screen capture with sound)
    • your ChucK code
    • your 250-word reflection
    • any acknowledgements (people, code, or other things that helped you through this)
  • submit to Canvas only your webpage URL