Programming Project #2: "Featured Artist"

Music and AI (Music356/CS470) | Winter 2023 | by Ge Wang

[image: Mosaiconastick.jpg]

In this programming project, we will learn to work with audio features for both supervised and unsupervised tasks: building a real-time genre classifier and a feature-based audio mosaic tool that uses similarity retrieval. Use these to create a feature-driven musical statement or performance!

Due Dates

  • Milestone: webpage due Wednesday (2/1, 11:59pm) | in-class critique Thursday (2/2)
  • Final Deliverable: webpage due Wednesday (2/8, 11:59pm)
  • In-class Presentation: Thursday (2/9)

Discord Is Our Friend

  • direct any questions, ruminations, outputs, and interesting mistakes to our class Discord

Things to Think With

Tools to Play With

  • get the latest bleeding-edge secret chuck build (2023.01.23 or later!)
    • macOS: this will install both the command-line chuck and the graphical IDE miniAudicle, and will replace any previous ChucK installation.
    • Windows: you will need to download and use the bleeding-edge command-line chuck (for now, there is no bleeding-edge miniAudicle for Windows); you can either use the default cmd command prompt or consider downloading a terminal emulator.
    • Linux: you will need to build from source, which is provided in the linux directory.
    • all platforms: for this project, you will be using the command-line version of chuck (see the terminal examples after this list).
  • NOTE: to return your chuck to a pre-bleeding-edge state, you can always install the latest official ChucK release
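
For reference, here is how one might invoke chuck from a terminal once the bleeding-edge build is installed. The script names below are placeholders taken from the sample code, and the colon syntax is the same one the poem examples use.

    # check that the bleeding-edge build is the one on your PATH
    # (the printed version should be 2023.01.23 or later)
    chuck --version

    # run a program from the terminal; Ctrl-C stops the virtual machine
    chuck word2vec-basic.ck

    # pass an argument to a script using chuck's colon syntax
    chuck poem-spew.ck:ocean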

GTZAN Dataset

  • next, you'll need to download the GTZAN dataset (a quick load check in chuck follows below)
    • 1000 30-second music clips, labeled by humans into ten genre categories
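
As a quick sanity check, the short sketch below reads one clip into chuck and plays it. The path and filename are placeholders, and depending on where you obtained the dataset the clips may be .au files that need converting to .wav first.

    // quick check: read one GTZAN clip and play it
    // (ASSUMPTION: the clip has been copied next to this script, converted to .wav if needed)
    SndBuf clip => dac;
    me.dir() + "blues.00000.wav" => clip.read;
    <<< "loaded", clip.samples(), "samples" >>>;
    // let the 30-second excerpt play through
    clip.samples()::samp => now;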

HW2 Sample Code

  • you can find sample code here
  • start playing with these and reading through them to get a sense of what the code is doing (a minimal Word2Vec sketch follows this list)
    • word2vec-basic.ck -- basic example that...
      • loads a word vector
      • prints # of words and # of dimensions in the model
      • shows how to get a vector associated with a word (using getVector())
      • shows how to retrieve K most similar words using a word (using getSimilar())
      • shows how to retrieve K most similar words using a vector (using getSimilar())
      • uses the W2V helper class to evaluate an expression like "puppy - dog + cat" (using W2V.eval())
      • uses the W2V helper class to evaluate a logical analog like dog:puppy::cat:?? (using W2V.analog())
    • word2vec-prompt.ck -- interactive prompt word2vec explorer
      • this keeps a model loaded while allowing you to play with it
      • type help to get started
    • starter-prompt.ck -- minimal starter code for those wishing to include an interactive prompt in chuck, with sound
  • example poems
    • "i feel" -- a stream of unconsciousness poem (dependency: glove-wiki-gigaword-50-tsne-2 or any other model)
      • usage: chuck poem-i-feel.ck
    • "Random Walk" -- another stream of unconsciousness poem
      • usage: chuck poem-randomwalk.ck or chuck poem-randomwalk.ck:START_WORD (to provide the starting word)
    • "Spew" -- yet another stream of unconsciousness poem
      • usage: chuck poem-spew.ck or chuck poem-spew.ck:START_WORD (to provide the starting word)
    • "Degenerate" -- a prompt-based example (run in command line chuck)
      • usage: chuck poem-ungenerate.ck
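
As mentioned above, here is a minimal sketch of the Word2Vec calls that word2vec-basic.ck walks through. The model filename is a placeholder, and the method names below follow the sample code; check them against word2vec-basic.ck if anything differs in your build.

    // minimal Word2Vec sketch (see word2vec-basic.ck for the full version)
    Word2Vec model;

    // load a pre-trained model (ASSUMPTION: the model file sits next to this script)
    if( !model.load( me.dir() + "glove-wiki-gigaword-50.txt" ) )
    {
        <<< "cannot load model" >>>;
        me.exit();
    }

    // number of words and number of dimensions in the model
    <<< "words:", model.size(), "dimensions:", model.dim() >>>;

    // get the vector associated with a word
    float vec[ model.dim() ];
    model.getVector( "music", vec );

    // retrieve the K most similar words to a word
    string similar[8];
    model.getSimilar( "music", similar.size(), similar );
    for( 0 => int i; i < similar.size(); i++ )
        <<< similar[i] >>>;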

Phase One: Feature Extract, Classify, Validate

  • understanding audio, FFT, feature extraction
  • extract different sets of audio features from the GTZAN dataset
  • run a real-time classifier using different feature sets
  • run cross-validation to evaluate the quality of the classifier based on different features (a minimal extraction sketch follows this list)
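
As noted above, here is a minimal extraction sketch under some assumptions: the filename is a placeholder for any clip in .wav form, and only two features (spectral centroid and RMS) are averaged. In practice you would collect a richer vector per clip, pair it with the clip's genre label, and hand the whole set to your classifier.

    // sketch: average a small feature vector (centroid, RMS) over one clip
    // (ASSUMPTION: the filename is a placeholder for a GTZAN clip in .wav form)
    SndBuf buf => FFT fft =^ Centroid centroid => blackhole;
    fft =^ RMS rms => blackhole;

    // analysis parameters
    1024 => fft.size;
    Windowing.hann( fft.size() ) => fft.window;
    fft.size()/2 => int HOP;

    // load the clip
    me.dir() + "blues.00000.wav" => buf.read;

    // accumulators for a per-clip average
    0.0 => float centroidSum;
    0.0 => float rmsSum;
    buf.samples() / HOP => int numHops;

    // hop through the file one analysis window at a time
    for( 0 => int i; i < numHops; i++ )
    {
        // let one hop of audio flow into the analysis chain
        HOP::samp => now;
        // compute features on the current window
        centroid.upchuck() @=> UAnaBlob c;
        rms.upchuck() @=> UAnaBlob r;
        c.fval(0) +=> centroidSum;
        r.fval(0) +=> rmsSum;
    }

    // this small averaged vector (plus the clip's genre label) is what you
    // would collect for every clip and feed to a classifier for training
    <<< "avg centroid:", centroidSum/numHops, "avg rms:", rmsSum/numHops >>>;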

Phase Two: Curate Feature Database, Design Audio Mosaic Tool

  • build a database mapping sound frames (100::ms to 1::second) <=> feature vectors
    • curate your own set of audio files; these can be a mixture of
      • short sound effects (~1 second)
      • music (we will perform feature extraction on each short-time window)
  • prototype a feature-based sound explorer to query your database and perform similarity retrieval (see the retrieval sketch after this list)
  • using your database and retrieval tool, design an interactive audio mosaic generator
    • feature-based
    • real-time
    • takes any audio input (mic or any unit generator)
    • can be used for performance
  • (optional) do this in the audiovisual domain
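
As referenced in the list above, here is one way to sketch the retrieval step: a brute-force nearest-neighbor search by Euclidean distance over a 2-D feature array. The array name and sizes are placeholders; you would fill the array during your extraction pass, one row per stored sound frame.

    // sketch: brute-force similarity retrieval over a frame <=> feature database
    // (ASSUMPTION: frameFeatures[][] gets filled during extraction; sizes are placeholders)
    100 => int NUM_FRAMES;
    8 => int NUM_FEATURES;
    float frameFeatures[NUM_FRAMES][NUM_FEATURES];

    // return the index of the stored frame whose features are closest to the query
    fun int nearestFrame( float query[] )
    {
        -1 => int best;
        0.0 => float bestDist;
        for( 0 => int i; i < NUM_FRAMES; i++ )
        {
            // squared Euclidean distance between query and frame i
            0.0 => float dist;
            for( 0 => int j; j < NUM_FEATURES; j++ )
            {
                frameFeatures[i][j] - query[j] => float diff;
                diff*diff +=> dist;
            }
            // keep the closest frame seen so far
            if( best < 0 || dist < bestDist )
            {
                dist => bestDist;
                i => best;
            }
        }
        return best;
    }

For the mosaic itself, the idea would be to extract a feature vector from each incoming window of audio (mic or any unit generator), call nearestFrame() on it, and splice the corresponding stored frame into the output in real time.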

Phase Three

  • use your prototype from Phase Two to create a musical statement
  • (optional) do this in the audiovisual domain

Reflections

  • write ~300 words of reflection on your project. It can be about your process or the product. What were the limitations, and how did you try to get around them?

Deliverables

  • create a CCRMA webpage for this etude
  • your webpage is to include
    • a title and description of your project (feel free to link to this wiki page)
    • all relevant chuck code from all three phases
      • phase 1: all code used (extraction, classification, validation)
      • phase 2: your mosaic generator, and database query/retrieval tool
      • phase 3: code used for your musical statement
    • video recording of your musical statement (please start early!)
    • your 300-word reflection
    • any acknowledgements (people, code, or other things that helped you through this)
  • submit only your webpage URL to Canvas