Programming Project #2: "Featured Artist"

Music and AI (Music356/CS470) | Winter 2023 | by Ge Wang

[image: Mosaiconastick.jpg]

In this programming project, we will learn to work with audio features for both supervised and unsupervised tasks: building a real-time genre classifier and a feature-based audio mosaic tool that uses similarity retrieval. Use these to create a feature-driven musical statement or performance!

Due Dates

  • Milestone: webpage due Wednesday (2/1, 11:59pm) | in-class critique Thursday (2/2)
  • Final Deliverable: webpage due Wednesday (2/8, 11:59pm)
  • In-class Presentation: Thursday (2/9)

Discord Is Our Friend

  • direct any questions, ruminations, outputs, and interesting mistakes to our class Discord

Things to Think With

Tools to Play With

  • get the latest bleeding-edge secret chuck build (2023.01.23 or later!)
    • macOS: this will install both the command-line chuck and the graphical IDE miniAudicle, and will replace any previous ChucK installation.
    • Windows: you will need to download and use the bleeding-edge command-line chuck (for now, there is no bleeding-edge miniAudicle for Windows); you can either use the default cmd command prompt or consider downloading a terminal emulator.
    • Linux: you will need to build from source, which is provided in the linux directory.
    • all platforms: for this project, you will be using the command-line version of chuck (see the terminal examples after this list).
  • NOTE: to return your chuck to a pre-bleeding-edge state, you can always install the latest official ChucK release
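
For reference, here is how one might invoke chuck from a terminal once the bleeding-edge build is installed. The script names below are placeholders taken from the sample code, and the colon syntax is the same one the poem examples use.

    # check that the bleeding-edge build is the one on your PATH
    # (the printed version should be 2023.01.23 or later)
    chuck --version

    # run a program from the terminal; Ctrl-C stops the virtual machine
    chuck word2vec-basic.ck

    # pass an argument to a script using chuck's colon syntax
    chuck poem-spew.ck:ocean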

GTZAN Dataset

  • next, you'll need to download the GTZAN dataset (a quick load check in chuck follows below)
    • 1000 30-second music clips, labeled by humans into ten genre categories
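
As a quick sanity check, the short sketch below reads one clip into chuck and plays it. The path and filename are placeholders, and depending on where you obtained the dataset the clips may be .au files that need converting to .wav first.

    // quick check: read one GTZAN clip and play it
    // (ASSUMPTION: the clip has been copied next to this script, converted to .wav if needed)
    SndBuf clip => dac;
    me.dir() + "blues.00000.wav" => clip.read;
    <<< "loaded", clip.samples(), "samples" >>>;
    // let the 30-second excerpt play through
    clip.samples()::samp => now;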

HW2 Sample Code

  • you can find sample code here
  • start playing with these and reading through them to get a sense of what the code is doing (a minimal Word2Vec sketch follows this list)
    • word2vec-basic.ck -- basic example that...
      • loads a word vector
      • prints # of words and # of dimensions in the model
      • shows how to get a vector associated with a word (using getVector())
      • shows how to retrieve K most similar words using a word (using getSimilar())
      • shows how to retrieve K most similar words using a vector (using getSimilar())
      • uses the W2V helper class to evaluate an expression like "puppy - dog + cat" (using W2V.eval())
      • uses the W2V helper class to evaluate a logical analog like dog:puppy::cat:?? (using W2V.analog())
    • word2vec-prompt.ck -- interactive prompt word2vec explorer
      • this keeps a model loaded while allowing you to play with it
      • type help to get started
    • starter-prompt.ck -- minimal starter code for those wishing to include an interactive prompt in chuck, with sound
  • example poems
    • "i feel" -- a stream of unconsciousness poem (dependency: glove-wiki-gigaword-50-tsne-2 or any other model)
      • usage: chuck poem-i-feel.ck
    • "Random Walk" -- another stream of unconsciousness poem
      • usage: chuck poem-randomwalk.ck or chuck poem-randomwalk.ck:START_WORD (to provide the starting word)
    • "Spew" -- yet another stream of unconsciousness poem
      • usage: chuck poem-spew.ck or chuck poem-spew.ck:START_WORD (to provide the starting word)
    • "Degenerate" -- a prompt-based example (run in command line chuck)
      • usage: chuck poem-ungenerate.ck
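
As mentioned above, here is a minimal sketch of the Word2Vec calls that word2vec-basic.ck walks through. The model filename is a placeholder, and the method names below follow the sample code; check them against word2vec-basic.ck if anything differs in your build.

    // minimal Word2Vec sketch (see word2vec-basic.ck for the full version)
    Word2Vec model;

    // load a pre-trained model (ASSUMPTION: the model file sits next to this script)
    if( !model.load( me.dir() + "glove-wiki-gigaword-50.txt" ) )
    {
        <<< "cannot load model" >>>;
        me.exit();
    }

    // number of words and number of dimensions in the model
    <<< "words:", model.size(), "dimensions:", model.dim() >>>;

    // get the vector associated with a word
    float vec[ model.dim() ];
    model.getVector( "music", vec );

    // retrieve the K most similar words to a word
    string similar[8];
    model.getSimilar( "music", similar.size(), similar );
    for( 0 => int i; i < similar.size(); i++ )
        <<< similar[i] >>>;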

Phase One: Feature Extract, Classify, Validate

  • understanding audio, FFT, feature extraction
  • extract different sets of audio features from the GTZAN dataset
  • run a real-time classifier using different feature sets
  • run cross-validation to evaluate the quality of the classifier based on different features (a minimal extraction sketch follows this list)
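
As noted above, here is a minimal extraction sketch under some assumptions: the filename is a placeholder for any clip in .wav form, and only two features (spectral centroid and RMS) are averaged. In practice you would collect a richer vector per clip, pair it with the clip's genre label, and hand the whole set to your classifier.

    // sketch: average a small feature vector (centroid, RMS) over one clip
    // (ASSUMPTION: the filename is a placeholder for a GTZAN clip in .wav form)
    SndBuf buf => FFT fft =^ Centroid centroid => blackhole;
    fft =^ RMS rms => blackhole;

    // analysis parameters
    1024 => fft.size;
    Windowing.hann( fft.size() ) => fft.window;
    fft.size()/2 => int HOP;

    // load the clip
    me.dir() + "blues.00000.wav" => buf.read;

    // accumulators for a per-clip average
    0.0 => float centroidSum;
    0.0 => float rmsSum;
    buf.samples() / HOP => int numHops;

    // hop through the file one analysis window at a time
    for( 0 => int i; i < numHops; i++ )
    {
        // let one hop of audio flow into the analysis chain
        HOP::samp => now;
        // compute features on the current window
        centroid.upchuck() @=> UAnaBlob c;
        rms.upchuck() @=> UAnaBlob r;
        c.fval(0) +=> centroidSum;
        r.fval(0) +=> rmsSum;
    }

    // this small averaged vector (plus the clip's genre label) is what you
    // would collect for every clip and feed to a classifier for training
    <<< "avg centroid:", centroidSum/numHops, "avg rms:", rmsSum/numHops >>>;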

Phase Two: Curate Feature Database, Design Audio Mosaic Tool

  • build a database mapping sound frames (100::ms to 1::second) <=> feature vectors
    • curate your own set of audio files; these can be a mixture of
      • short sound effects (~1 second)
      • music (we will perform feature extraction on each short-time window)
  • prototype a feature-based sound explorer to query your database and perform similarity retrieval (see the retrieval sketch after this list)
  • using your database and retrieval tool, design an interactive audio mosaic generator
    • feature-based
    • real-time
    • takes any audio input (mic or any unit generator)
    • can be used for performance
  • (optional) do this in the audiovisual domain
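
As referenced in the list above, here is one way to sketch the retrieval step: a brute-force nearest-neighbor search by Euclidean distance over a 2-D feature array. The array name and sizes are placeholders; you would fill the array during your extraction pass, one row per stored sound frame.

    // sketch: brute-force similarity retrieval over a frame <=> feature database
    // (ASSUMPTION: frameFeatures[][] gets filled during extraction; sizes are placeholders)
    100 => int NUM_FRAMES;
    8 => int NUM_FEATURES;
    float frameFeatures[NUM_FRAMES][NUM_FEATURES];

    // return the index of the stored frame whose features are closest to the query
    fun int nearestFrame( float query[] )
    {
        -1 => int best;
        0.0 => float bestDist;
        for( 0 => int i; i < NUM_FRAMES; i++ )
        {
            // squared Euclidean distance between query and frame i
            0.0 => float dist;
            for( 0 => int j; j < NUM_FEATURES; j++ )
            {
                frameFeatures[i][j] - query[j] => float diff;
                diff*diff +=> dist;
            }
            // keep the closest frame seen so far
            if( best < 0 || dist < bestDist )
            {
                dist => bestDist;
                i => best;
            }
        }
        return best;
    }

For the mosaic itself, the idea would be to extract a feature vector from each incoming window of audio (mic or any unit generator), call nearestFrame() on it, and splice the corresponding stored frame into the output in real time.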

Phase Three

  • use your prototype from Phase Two to create a musical statement
  • (optional) do this in the audiovisual domain

Reflections

  • write ~300 words of reflection on your project. It can be about your process or the product. What were the limitations, and how did you try to get around them?

Deliverables

  • create a CCRMA webpage for this etude
  • your webpage is to include
    • a title and description of your project (feel free to link to this wiki page)
    • all relevant chuck code from all three phases
      • phase 1: all code used (extraction, classification, validation)
      • phase 2: your mosaic generator, and database query/retrieval tool
      • phase 3: code used for your musical statement
    • video recording of your musical statement (please start early!)
    • your 300-word reflection
    • any acknowledgements (people, code, or other things that helped you through this)
  • submit only your webpage URL to Canvas