Programming Etude #1: "Poets of Sound and Time"
Music and AI (Music356/CS470) | Winter 2023 | by Ge Wang
In this programming etude, you are to write two programs using chuck with Word2Vec to help you create some experimental poetry involving text, sound, and time.
Due Date
- Etude #1: due Wednesday (1/18, 11:59pm; extended 48 hours to Friday) -- see "Deliverables" below
- Presentation: be prepared to share your work in class on Tuesday (1/24)
Put the Disco in Discord
- direct any questions, ruminations, and outputs/interesting mistakes to our class Discord
Things to Think With
- watch for ideas and inspiration: Allison Parrish's "Experimental Creative Writing with the Vectorized Word" (video)
Tools to Play With
- get the latest bleeding edge secret chuck build
  - macOS: this will install both command line chuck and the graphical IDE miniAudicle, and replace any previous ChucK installation.
  - Windows: you will need to download and use the bleeding-edge command line chuck (for now, there is no bleeding-edge miniAudicle for Windows); you can either use the default cmd command prompt, or consider downloading a terminal emulator.
  - Linux: you will need to build from source, provided in the linux directory.
  - all platforms: in order to use console input (e.g., to run the sample code word2vec-prompt.ck), you will need to use command line chuck (console input is not available from within the miniAudicle IDE).
- NOTE: to return your chuck back to a pre-bleeding-edge state, you can always install the latest official ChucK release
- also, a Zoom recording of the ChucK tutorial from 2023.01.13 is available
The Word Embedding Model
- next, you'll need to download three sets of pre-trained word vectors
- glove-wiki-gigaword-50.txt -- pre-trained word vectors from Stanford GloVe (400,000 words x 50 dimensions)
- glove-wiki-gigaword-50-pca-3.txt -- dimensionally reduced model using PCA (400,000 words x 3 dimensions)
- glove-wiki-gigaword-50-tsne-2.txt -- dimensionally reduced model using t-SNE (400,000 words x 2 dimensions)
- each of these has its tradeoffs:
- the 50D has the best similarity (i.e., tends to get better search results) -- but searching is more computationally intensive (and may introduce glitches in real-time audio)
- the PCA-3D version is a dimensionally-reduced (from 50 to 3) version of the above, making it much faster to do similarity retrieval (moreover the lower dimensionality in this case means it's feasible to use space partitioning techniques like KD-trees to significantly speed up the search); in addition to being real-time-audio friendly, the three dimensions can be readily mapped to control audio parameters (frequency, volume, rate, timbre, etc.) -- however, the quality of the similarity retrieval is noticeably weaker, as seemingly unrelated words will show up in the results.
- the t-SNE-2D version, despite having one less dimension than the PCA-3D version, actually performs as well or better in terms of similarity compared to PCA-3D (partially due to the non-linear optimization of t-SNE, compared with PCA's linear mapping of higher dimensions to lower dimensions); like 3D, 2D is friendly for mapping to audio parameters (and for visualization).
- we recommend exploring word embeddings first using the 50D model, before moving on to the 3D and 2D versions for mapping and for real-time audio. It is also possible to first generate text, store the generated text, and then render it and the audio in real-time. This approach effectively removes the real-time constraints by separating the processing into a generative stage and an efficient rendering stage; however, this also means that real-time interactions (e.g., to influence the generative processes) will be limited. (A minimal model-exploration sketch follows.)
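Before mapping anything to sound, it helps to sanity-check whichever model you downloaded. Below is a minimal exploration sketch -- not official course code -- assuming the ChAI Word2Vec interface exercised by the sample code (load(), size(), dim(), getSimilar()) and a model file sitting next to the .ck file; the filename and probe word are placeholders to swap out.

```
// model-check.ck -- a minimal sanity-check sketch (hypothetical filename)
// assumes the ChAI Word2Vec object from the bleeding-edge chuck build

// instantiate the word embedding object
Word2Vec model;

// model file, assumed to be in the same directory as this .ck file;
// substitute whichever of the three downloaded models you want to probe
me.dir() + "glove-wiki-gigaword-50.txt" => string filepath;

// load the pre-trained word vectors; bail if not found
if( !model.load( filepath ) )
{
    <<< "cannot load model:", filepath >>>;
    me.exit();
}

// print basic info about the model
<<< "words in model:", model.size() >>>;
<<< "dimensions per word:", model.dim() >>>;

// retrieve the 10 most similar words to a probe word
string similar[10];
model.getSimilar( "poetry", similar.size(), similar );
for( 0 => int i; i < similar.size(); i++ )
    <<< i, similar[i] >>>;
```

Running this once per model is a quick way to feel the 50D vs. PCA-3D vs. t-SNE-2D retrieval-quality tradeoff described above.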
Sample Code
- you can find sample code here
- start playing with these; feel free to incorporate them or use them as starting points for your experimental poetry tool
  - word2vec-basic.ck -- basic example that...
    - loads a word vector model
    - prints # of words and # of dimensions in the model
    - shows how to get the vector associated with a word (using getVector())
    - shows how to retrieve the K most similar words to a given word (using getSimilar())
    - shows how to retrieve the K most similar words to a given vector (using getSimilar())
    - uses the W2V helper class to evaluate an expression like "puppy - dog + cat" (using W2V.eval())
    - uses the W2V helper class to evaluate a logical analogy like dog:puppy::cat:?? (using W2V.analog())
  - word2vec-prompt.ck -- interactive prompt word2vec explorer
    - this keeps a model loaded while allowing you to play with it
    - type help to get started
  - starter-prompt.ck -- minimal starter code for those wishing to include an interactive prompt in chuck, with sound
- example poems
  - "i feel" -- a stream of unconsciousness poem (dependency: glove-wiki-gigaword-50-tsne-2 or any other model)
    - usage: chuck poem-i-feel.ck
  - "Random Walk" -- another stream of unconsciousness poem (a minimal sketch of this walk-and-sound pattern appears after this list)
    - usage: chuck poem-randomwalk.ck or chuck poem-randomwalk.ck:START_WORD (to provide the starting word)
  - "Spew" -- yet another stream of unconsciousness poem
    - usage: chuck poem-spew.ck or chuck poem-spew.ck:START_WORD (to provide the starting word)
  - "Degenerate" -- a prompt-based example (run in command line chuck)
    - usage: chuck poem-ungenerate.ck
Express Yourself!
- play with the ChucK/ChAI starter/example code for Word2Vec (feel free to use any part of these as starter code)
- write code to help you create some experimental poetry involving text, sound, and time.
- text: use the Word2Vec object in ChucK and one of the datasets to help you generate some poetry
- sound: use sound synthesis and map the words (e.g., using their vectors to control parameters such as pitch, volume, timbre, etc.) to sound
- time: don't forget about time! make words appear when you want them to; synchronize words with sound; visually and sonically "rap" the words in time! make a song!
- create two poetic programs / works / readings:
- make them as different as possible
- for example, one poem can be fully generated (you only need to run the chuck code, and the poem starts) and the other one interactive (incorporating input from the user, either through a text prompt or another means of input such as mouse / keypresses); see the minimal prompt sketch after this list
- In class, we will have a poetry reading where we will run your code
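For the interactive option, below is a minimal prompt-plus-sound sketch. It assumes ChucK's ConsoleInput object (which, per "Tools to Play With" above, requires command line chuck), single-word input, and the same Word2Vec interface as the sample code; the model filename and the blip mapping are placeholders for your own poetics.

```
// prompt-sketch.ck -- minimal interactive prompt with sound (hypothetical filename)
// run with command line chuck; ConsoleInput is unavailable inside miniAudicle

// sound: a soft enveloped sine blip per response word
SinOsc osc => ADSR env => dac;
env.set( 10::ms, 100::ms, .3, 200::ms );
.4 => osc.gain;

// model (any of the three; filename is an assumption)
Word2Vec model;
if( !model.load( me.dir() + "glove-wiki-gigaword-50.txt" ) )
{
    <<< "cannot load model" >>>;
    me.exit();
}

// console input and response buffer
ConsoleInput in;
string similar[6];

while( true )
{
    // wait for a line of input (assumed here to be a single word)
    in.prompt( "give me a word =>" ) => now;
    while( in.more() )
    {
        in.getLine() => string word;
        // respond with nearby words, one blip each
        model.getSimilar( word, similar.size(), similar );
        for( 0 => int i; i < similar.size(); i++ )
        {
            chout <= similar[i] <= " "; chout.flush();
            // pitch steps upward with each word (arbitrary mapping)
            Std.mtof( 60 + 2*i ) => osc.freq;
            env.keyOn(); 150::ms => now;
            env.keyOff(); 100::ms => now;
        }
        chout <= IO.newline();
    }
}
```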
Some Prompts and Bad Ideas
- a poem can be about anything; hint: try starting with how you feel about a particular thing, person, or event
- start with an existing poem and use word2vec to morph it over time
- an experimental love poem
- stream of consciousness
- remember "Jabberwocky" by Lewis Carroll? maybe your poem doesn't need to make sense to everyone
- HINT: try to take advantage of the medium: in addition to printing out text to a terminal (both a limitation and a creative constraint/opportunity) you have control over sound and time at your disposal
- HINT: experiment with the medium to embody your message -- for example, a poem about chaos where the words gradually become disjointed and nonsensical
- HINT: you can use a mixture of lines from existing poems and machine-generated/processed output
Reflections
- write ~250 words of reflection on your etude. It can be about your process, or the product, or the medium, or anything else. For example, how did your poems make you feel? Did you show them to a friend? What was their reaction? What makes something "poetry" (versus, say, "prose")?
Deliverables
- create a CCRMA webpage for this etude
- the URL should live at https://ccrma.stanford.edu/~YOURUSERID/356/etude1 or https://ccrma.stanford.edu/~YOURUSERID/470/etude1
- alternately, you may use Medium or another publishing platform (but please still link to that page from your CCRMA webpage)
- your Etude #1 webpage is to include
- a title and short description of the exercise (feel free to link to this wiki page)
- all relevant ChucK code for your experimental poetry tool
- your two poems in some form (e.g., a video recording with audio); this will depend on what you choose to do; since sound and time are involved, you might include a screen capture with high-quality audio capture
- your 250-word reflection
- any acknowledgements (people, code, or other things that helped you through this)
- submit to Canvas only your webpage URL