
Programming Project #2: "Featured Artist"

Music and AI (Music356/CS470) | Winter 2023 | by Ge Wang

[image: Mosaiconastick.jpg]

In this programming project, we will learn to work with audio features for both supervised and unsupervised tasks. These include a real-time genre-classifier and a feature-based audio mosaic tool using similarity retrieval. Create a feature-driven musical statement or performance!

Due Dates

  • Milestone: webpage due Wednesday (2/1, 11:59pm) | in-class critique Thursday (2/2)
  • Final Deliverable: webpage due Wednesday (2/8, 11:59pm)
  • In-class Presentation: Thursday (2/9)

Discord Is Our Friend

  • direct any questions, ruminations, outputs, and interesting mistakes to our class Discord

Things to Think With

Tools to Play With

  • get the latest bleeding edge secret chuck build (2023.01.23 or later!)
    • macOS: this will install both command line chuck and the graphical IDE miniAudicle, and replace any previous ChucK installation.
    • Windows: you will need to download and use the bleeding-edge command line chuck (for now, there is no bleeding-edge miniAudicle for Windows); you can either use the default cmd command prompt, or consider downloading a terminal emulator.
    • Linux: you will need to build from source, provided in the linux directory.
    • all platforms: for this project, you will be using the command line version of chuck (see the version check after this list).
  • NOTE: to return your chuck to a pre-bleeding-edge state, you can always install the latest official ChucK release
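
Once installed, a quick way to confirm that the bleeding-edge build is the one on your path is to ask chuck for its version (the exact output format varies by build):

    chuck --version

The reported version should correspond to a build from 2023.01.23 or later.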

GTZAN Dataset

  • next, you'll need to download the GTZAN dataset: https://chuck.stanford.edu/chai/data/gtzan/ (see the quick check after this list)
    • 1000 30-second music clips, labeled by humans into ten genre categories
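
As a quick sanity check once the dataset is unpacked, a minimal ChucK sketch can load and audition a single clip. The directory layout and filename below are assumptions; point the path at wherever you put the files:

    // load and play one GTZAN clip (hypothetical path and filename)
    SndBuf buf => dac;
    me.dir() + "gtzan/blues/blues.00000.wav" => buf.read;
    // advance time for the duration of the file (plays the whole clip)
    buf.length() => now;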

Phase One: Feature Extract, Classify, Validate

  • understanding audio, FFT, feature extraction (see the sketch after this list)
  • extract different sets of audio features from the GTZAN dataset
  • run real-time classifier using different feature sets
  • run cross-validation to evaluate the quality of the classifier based on different features
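
To get a feel for real-time feature extraction before digging into the sample code, here is a minimal sketch in the spirit of example-centroid.ck (a sketch, not its actual contents), using only built-in UAnae; it tracks the spectral centroid of the mic input:

    // mic => FFT => spectral centroid ("=^" is the upchuck operator)
    adc => FFT fft =^ Centroid centroid => blackhole;
    // analysis window size
    1024 => fft.size;
    // apply a Hann window
    Windowing.hann(1024) => fft.window;

    while( true )
    {
        // trigger the analysis chain
        centroid.upchuck();
        // print result (normalized; multiply by sample rate / 2 for Hz)
        <<< "centroid:", centroid.fval(0) >>>;
        // advance time by one hop (half a window)
        (fft.size()/2)::samp => now;
    }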

HW2 Sample Code

  • you can find sample code here: https://ccrma.stanford.edu/courses/356/code/featured-artist/
  • start playing with these, and read through them to get a sense of what the code is doing (a small MFCC sketch follows below)
    • example-centroid.ck
    • example-mfcc.ck
    • feature-extract.ck
    • genre-classify.ck
    • x-validate.ck
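
In the same spirit as example-mfcc.ck (again, a sketch, not its actual contents), MFCC extraction swaps a different UAna into the chain. The parameter names numCoeffs and numFilters are my best recollection of the chuck MFCC UAna and should be treated as assumptions; check the UAna reference if this does not compile:

    // mic => FFT => MFCC analysis chain
    adc => FFT fft =^ MFCC mfcc => blackhole;
    1024 => fft.size;
    Windowing.hann(1024) => fft.window;
    // number of coefficients to compute (assumed parameter name)
    20 => mfcc.numCoeffs;
    // number of mel filters (assumed parameter name)
    10 => mfcc.numFilters;

    while( true )
    {
        // trigger analysis
        mfcc.upchuck();
        // read back one coefficient of the resulting feature vector
        <<< "mfcc[0]:", mfcc.fval(0) >>>;
        (fft.size()/2)::samp => now;
    }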

Phase Two: Curate Feature Database, Design Audio Mosaic Tool

  • build a database mapping sound frames (100::ms to 1::second) <=> feature vectors
    • curate your own set of audio files; it can be a mixture of
      • short sound effects (~1 second)
      • music (we will perform feature extraction on each short-time window)
  • prototype a feature-based sound explorer to query your database and perform similarity retrieval (see the sketch after this list)
  • using your database and retrieval tool, design an interactive audio mosaic generator
    • feature-based
    • real-time
    • takes any audio input (mic or any unit generator)
    • can be used for performance
  • (optional) do this in the audiovisual domain
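
At its core, similarity retrieval can be as simple as a brute-force nearest-neighbor search over your stored feature vectors. Below is a minimal, self-contained sketch; names like dist2, nearest, and db are placeholders, and your real database will come from your own extraction pass:

    // squared Euclidean distance between two feature vectors
    fun float dist2( float a[], float b[] )
    {
        0.0 => float sum;
        for( 0 => int i; i < a.size(); i++ )
        {
            a[i] - b[i] => float d;
            d * d +=> sum;
        }
        return sum;
    }

    // index of the database frame whose features are closest to the query
    fun int nearest( float query[], float db[][] )
    {
        0 => int best;
        dist2( query, db[0] ) => float bestDist;
        for( 1 => int i; i < db.size(); i++ )
        {
            dist2( query, db[i] ) => float d;
            if( d < bestDist ) { d => bestDist; i => best; }
        }
        return best;
    }

A mosaic loop then becomes: every hop, extract a feature vector from the live input, call nearest() on it, and splice in the stored sound frame at the returned index.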

Phase Three

  • use your prototype from Phase Two to create a musical statement
  • (optional) do this in the audiovisual domain

Reflections

  • write ~300 words of reflection on your project. It can be about your process or the product. What were the limitations, and how did you try to get around them?

Deliverables

  • create a CCRMA webpage for this etude
  • your webpage is to include
    • a title and description of your project (feel free to link to this wiki page)
    • all relevant chuck code from all three phases
      • phase 1: all code used (extraction, classification, validation)
      • phase 2: your mosaic generator, and database query/retrieval tool
      • phase 3: code used for your musical statement
    • video recording of your musical statement (please start early!)
    • your 300-word reflection
    • any acknowledgements (people, code, or other things that helped you through this)
  • submit only your webpage URL to Canvas