Michael J. Wilson - Music 256a Final Project Development Log

2010-12-10 (Version 1.0 Post-Mortem)

Now that version 1.0 is out, let's look at what I accomplished as compared to my original vision. According to the project proposal, here was the main idea of the program:

"Sequencer that lets you record or open short waveforms and does spectral resynthesis on them in a short, step-sequenced loop."

Accomplished!

Interface

Here are specific points from the original plan that I managed to include:

Realtime modification of any of the parameters is supported
Tempo +/-
Play
Pause/Stop
Quit
Create new track
Record sound
Tracking area (matrix editor)
Close track

Here are specific points that I modified, or wasn't able to include in the first version:

Save song
Load song
Load sound
Save sound
Load sequence
Save sequence - I didn't integrate any saving or loading into the program, since it seemed like the least necessary feature to communicate the core idea
Toggle sines
Toggle noise - I didn't allow any tweaking of the spectral analysis or synthesis parameters, although this would be an interesting avenue to explore
64-timestep 24-vertical-step matrix editors - I ended up going with 32 timesteps since that felt more natural, but I could also make this variable.

And a new point:

Mute track - turned out to be pretty convenient

Software architecture

In terms of architecture, here are the components from my proposal that I used:

Pitch correction / shifting - via libsms
Sines / noise decomposition - via libsms
FFT - via libsms
Audio capture - via RtAudio
Audio playback - via RtAudio
Display GUI - via Qt
Matrix view - via a custom Qt widget
Enveloping - manually done
Time stretching - via libsms

Here are components from my proposal that I didn't use:

Pitch estimation - this was supposed to correct all the toned sounds to the same base pitch, but I didn't have time to experiment with it
Load/save sequence
Load/save waveforms (decomposed) - Loading and saving was dropped as it wasn't core to the experience

And here are components that I didn't list initially:

Sine wave playback - to give feedback before the user has recorded anything
Timers - via Qt, to control when to trigger the notes of the sequence

The UI turned out to be very simple. Ge commented on this after my presentation as well. I really wanted the sequencer to be very open: no hidden controls, no sub-dialogs (except for saving/loading which I didn't incorporate), no hidden notes on any matrices in view. My initial proposal had a modal record dialog but I think the color-changing record button on each track is a much better interface (even though I never mocked up a modal record dialog to test, I can tell it would be annoying just from what I tried). I still have a lot of room on each track to add some additional controls if I desire.

I think doing things this way makes the program easier to interact with in real time, and keeps the focus on what you can do with it rather than on the interface itself.

Testing

I did all the technical testing myself. I ran into several issues with improper handling of pointers or bad casting. I also had to be careful when mixing the C-style memory management of libsms with the C++-style memory management of the rest of my program. Surprisingly, I didn't run into any multithreading issues. I expected them so strongly that it took longer to track down the real issues, since I was looking for race conditions instead of pointer misuse.

Some of my fellow students played with some of the development versions and gave me valuable feedback on the user experience. Based on this feedback I was able to tune the GUI, and also made it so that nothing plays when you're overdubbing a previously-recorded track. They also let me know when it crossed the threshold and became "fun" to play with. There was other feedback that would be valuable to incorporate into future versions as well: being able to organize tracks, being able to transpose tracks independently by a variable amount, being able to change the number of timesteps, and adding onset detection to the recording.

Measures of goodness (from the proposal, in decreasing order of importance):

Is the program fun? - I thought it was fun, and my classmates who played with the near-release version all told me they thought it was fun to play with. It is important to note that all of us are computer music students, and thus have a certain level of technical knowledge and a definition of "fun" that probably doesn't directly align with that of the general public. It would be of great interest to me to get feedback from someone who is not in the field of computer music.
Ease of use - The program is lacking some basic sequencer conveniences, but is still reasonably easy to use. I noticed that it was not intuitive to others that a mouse right-click removed notes on the matrix editor. But once that was explained it was picked up and used effectively.
Technical correctness - The program is pretty technically correct. Ideally I would have tuned the spectral modeling parameters a bit, but since I don't really know what sort of sound is going to be recorded that's not a huge issue (the defaults work pretty well for voice).
Reproducibility of results - No saving and loading means that there is no way to exactly reproduce anything you create with this program. Better record it while you can!
Efficiency - The program is not very efficient. libsms is re-initialized for every analysis and synthesis step. Sean Coffin noted that since I have 24 vertical steps I could just pre-synthesize every possible note for a given sample, which is a great idea but one I didn't have time to implement. I noticed that running JACK with even moderately low buffer sizes (~256 on the CCRMA workstations) resulted in a large number of xruns. Thankfully, increasing the buffer size didn't seem to impact the feeling of responsiveness of the program except for perhaps the recording part, which I believe can be helped further by using onset detection.
Robustness to degenerate inputs - The program is actually very robust, thanks in large part to my limiting the number of controls and thus the number of possible inputs.
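
Sean Coffin's pre-synthesis idea from the efficiency point could look roughly like the sketch below. It is only an illustration under stated assumptions: the rows are taken to be equal-tempered half-steps with row 0 at the recorded pitch, and `Synth` is a hypothetical stand-in for the resynthesis step, not the libsms API.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

constexpr std::size_t kCacheRows = 24;  // vertical steps in the matrix editor

// Equal-tempered frequency ratio for a row.
// Assumption: row 0 plays the sample at its recorded pitch.
inline double pitchRatio(std::size_t row) {
    assert(row < kCacheRows);
    return std::pow(2.0, static_cast<double>(row) / 12.0);
}

using Buffer = std::vector<float>;

// Hypothetical stand-in for the resynthesis step (not the libsms API):
// takes the recorded samples and a pitch ratio, returns the shifted note.
using Synth = std::function<Buffer(const Buffer& recorded, double ratio)>;

// Render every row once after recording, so that playback only copies samples.
inline std::array<Buffer, kCacheRows> presynthesize(const Buffer& recorded,
                                                    const Synth& synth) {
    std::array<Buffer, kCacheRows> cache;
    for (std::size_t row = 0; row < kCacheRows; ++row) {
        cache[row] = synth(recorded, pitchRatio(row));
    }
    return cache;
}
```

The trade-off is memory for CPU: 24 buffers per track are kept around, but nothing has to be synthesized while the sequence is playing.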

Milestones

uh-oh...

2010.11.10 - Initial GUI with all controls working. Able to record and play back sounds in sequence without any processing. - COMPLETED 2010.12.05
2010.11.17 - Resynthesis of sounds working. Saving and loading of individual sounds and sequences working. - First part COMPLETED 2010.12.07. Second part NOT COMPLETED.
2010.11.24 - Resynthesis tuned, batch save and load implemented. Polishing. - First part NOT COMPLETED. Polishing was done up to the deadline.

Yep. But even though I completely missed all of my deadlines, the core of the project was completed on time. I felt that I should have integrated the audio processing earlier and played with libsms much earlier. However, by following the path that I did I had a very solid simple sequencer framework that was robust and that I understood very well. So I am a bit torn as to whether I would really want to postpone getting that framework working. Perhaps more distance from the project is necessary before I can really analyze what happened with the schedule.

Conclusions

Overall I call the project a success. It got across everything I wanted to get across. When I had time to develop it I found it extremely enjoyable to work on.

The worst part of the process was right before the first milestone when I was scrambling to get my prototype working, and then it didn't work correctly. This was a wakeup call and I did make significant progress immediately after the presentation. But it was stressful.

The best part of the process was the first time I got realtime spectral resynthesis integrated and working. Even though this happened right before the deadline, I had started exploring the libraries I would use and building some structure in advance of this. Maybe I should have felt more stressed at this point but I was in a state of flow, which is one of my favorite ways to feel.

I got feedback from people who attended the final presentations that my presentation (which was basically me just playing around with the program in real-time) was good. I also got feedback that it sounded like Mario Paint... probably because I put a "meow" sound into one of the tracks :-)

What's next?

There are features that I think would be simple to implement and add significantly to the experience (listed roughly in increasing order of estimated effort):

Ability to transpose a track up and down half-steps
Color the down-beats and octaves on the track matrix view
Onset detection for recording
Multiplatform support - all of the libraries I am using are multiplatform compatible

When I get a chance, I will try to incorporate these into a new release version.

Thank you!

Thanks for reading my development log! If you have any questions or comments please feel free to email me (you can find my email address on the main page of this website).

2010-12-08

1AM - Tried a couple of things to make processing more efficient since I was seeing xruns when running it, but they didn't work. The thing that seemed to work best was increasing the buffer size in JACK. In any event, it's fun to play with and ready to present. Code is posted on the index page.

UPDATE (11 PM) - Presentation complete, course complete. Once my last final is done I'll revisit this.

2010-12-07

The day before the final presentation, and all the components are finally working. There was a bit of trickery getting libsms to do what I wanted it to do. It's a very nice C library, but unfortunately I'm programming in C++ and there are some stylistic and conventional differences between the languages that took me a little while to reconcile to my satisfaction. But it works beautifully. I should probably write Rich Eakin a big thank-you letter for his work on the library.

As of now, 7 AM, spectral modeling synthesis is being used via libsms to change the pitches of recorded samples. I still need to stretch / compress time to make them fit in with the beat of the sequencer, but I'm pretty confident I know how to do that. After that is done I'll fork a "backup" for the presentation and then see what sorts of other effects I can cram in.

Oh, and that segfault I mentioned below? It was due to maintaining a "current sample" pointer into the recorded sound and not reinitializing it to 0 when a new sound is recorded (it did and does get initialized to 0 when a new note is played). Thus, if a new sound was recorded for a track over a shorter sound, it could end up having the pointer out of range if a sample was requested before the next note was played. Simple fix. Every problem I've had that I thought was due to race conditions or multithreading or not enough resources has turned out to be basic pointer issues. I'm very pleased that computers are fast enough to do all this processing in realtime.
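
A minimal sketch of that fix, using a hypothetical `TrackSample` class (the names are illustrative, not taken from the project code): the cursor is rewound both when a note starts and when a new recording is installed, and reads past the end return silence.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical per-track sample player; names are illustrative only.
class TrackSample {
public:
    // Installing a new recording also rewinds the cursor, so a shorter
    // recording can never leave the cursor past the end of the data.
    void setRecording(std::vector<float> samples) {
        samples_ = std::move(samples);
        cursor_ = 0;
    }

    // A new note restarts playback from the beginning.
    void noteOn() { cursor_ = 0; }

    // Returns the next sample, or silence once the recording is exhausted.
    float next() {
        assert(cursor_ <= samples_.size());  // invariant kept by the methods above
        if (cursor_ >= samples_.size()) return 0.0f;
        return samples_[cursor_++];
    }

    std::size_t cursor() const { return cursor_; }

private:
    std::vector<float> samples_;
    std::size_t cursor_ = 0;
};
```

Using an index instead of a raw pointer makes the bounds check in `next()` trivial, which is a second line of defense if the reset is ever forgotten.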

More updates will probably be coming throughout the day.

UPDATE (8AM) - here's a semifinal screenshot:

Oh man, am I really going to call this thing sequins? The only other contender I have right now is "Sine Lord". I guess I should at least capitalize it. Sequins. I don't know.

Added timestretching and a little ADSR envelope to smooth out transitions between notes. It's working well enough that I'm going to take a nap then take care of the administrative things - putting the GPL notice in my code, fixing up the website, working on how I'm going to present it tomorrow.
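
For readers unfamiliar with ADSR, here is a minimal sketch of a linear attack/decay/sustain/release envelope applied to one note. The struct, the function names, and the linear ramps are assumptions for illustration; the log does not say how the actual envelope was shaped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Segment lengths in samples plus the sustain level; values are illustrative.
struct Adsr {
    std::size_t attack;
    std::size_t decay;
    std::size_t release;
    float sustain;  // gain held between decay and release, 0..1
};

// Gain at sample i of a note that lasts `length` samples.
inline float adsrGain(const Adsr& e, std::size_t i, std::size_t length) {
    assert(e.attack + e.decay + e.release <= length);
    if (i >= length) return 0.0f;
    if (i < e.attack) {
        // Attack: ramp from 0 up to 1.
        return static_cast<float>(i) / static_cast<float>(e.attack);
    }
    if (i < e.attack + e.decay) {
        // Decay: ramp from 1 down to the sustain level.
        float t = static_cast<float>(i - e.attack) / static_cast<float>(e.decay);
        return 1.0f + t * (e.sustain - 1.0f);
    }
    std::size_t releaseStart = length - e.release;
    if (i < releaseStart) return e.sustain;
    // Release: ramp from the sustain level down toward 0.
    float t = static_cast<float>(i - releaseStart) / static_cast<float>(e.release);
    return e.sustain * (1.0f - t);
}

// Multiply every sample of a note by the envelope gain.
inline void applyEnvelope(std::vector<float>& note, const Adsr& e) {
    for (std::size_t i = 0; i < note.size(); ++i) {
        note[i] *= adsrGain(e, i, note.size());
    }
}
```

Because each note starts at zero gain and fades out before it ends, consecutive notes no longer meet at a discontinuity, which is what causes clicks.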

UPDATE (11PM) - Website's done, code will be up shortly. It's sequins after all. I'll do a post-mortem sometime after the presentation is over. Definitely learned some things from this project. Overall I'm satisfied with the result. It's a toy, but a fun toy and it gets across all the main ideas I wanted to experiment with.

2010-12-05

Got realtime recording and sample playback (without SMS) integrated in the project. Sound processing isn't done in realtime, which opens up a small potential race condition if someone is fast enough to hit record and then hit it again after they've stopped recording another track but before it has finished processing. But I am just going to live with that for the time being, even though it shouldn't be that hard to fix. The obvious error of double-recording tracks simultaneously is handled correctly, of course.
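
One way the race could be closed is a single atomic "busy" flag that stays set from the start of a recording until its processing has finished. This is a sketch of that idea only; `RecordGate` is a hypothetical name and not part of the project.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical guard shared by all record buttons.
class RecordGate {
public:
    // Returns true if the caller may start recording, false if another
    // recording (or its post-processing) is still in flight.
    bool tryBegin() {
        bool expected = false;
        return busy_.compare_exchange_strong(expected, true);
    }

    // Call once the post-recording processing has completed.
    void finish() {
        assert(busy_.load());  // finish() without a matching tryBegin() is a bug
        busy_.store(false);
    }

private:
    std::atomic<bool> busy_{false};
};
```

A record button handler would call `tryBegin()` and ignore the click when it returns false; the processing code calls `finish()` as its last step.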

I also made a test program which hacked smsSynth and RtAudio together in order to do sms playback directly. It worked, but I wasn't really doing the processing in realtime. I'm a little concerned it won't be fast enough, even if I precompute the sms datastructure beforehand. But there's only one way to find out.

If this doesn't work, Bjoern Erlach recommended other things I could try: phase vocoders, Loris, ATS, the entire CLAM system. But hopefully libsms will work.

UPDATE - libsms is now integrated and doing analysis but not synthesis. I do have an unrelated segfault occurring occasionally when I hit record. If I do things slowly it doesn't seem to occur, so it's probably a race condition. Next step is to try to get synthesis working correctly, then I'll iron out as many bugs as I can before Wednesday (between my other finals and final coursework).

2010-12-01

Second milestone was today. You can now click and drag over the matrix to add or remove notes, and I smoothed out the sine wave transitions. But still no spectral modeling synthesis integrated. Ge agreed in the milestone presentation that SMS is central to the concept of the project. I have a week to add it.

2010-11-23

I downloaded and played around with smstools [EDIT: by which I mean libsms], UPF's spectral modeling synthesis toolkit [EDIT: by which I mean library]. It seems like it could be possible to integrate this with my project. But I may still write my own resynthesis routines. I still need to work out recording sounds in the meantime.

I put some linear gradients on the matrix notes, and made the timebar flash. I played with the color scheme a bit too. Nothing major and no screenshots yet. Qt makes changing this pretty simple so I can tweak this more at the end.

2010-11-17

I hacked and hacked trying to get ready for the first milestone, but ended up with just a sinewave sequencer. A segfaulting sinewave sequencer that played noise out of one channel. This is not what I meant by sines and noise decomposition!

But I was able to fix it up after the milestone presentations:

Tempo adjustment works, creating and deleting multiple tracks with full step-sequencer sine wave playback works, and there are now label areas for each track. That part may be revised later. I waited too long to do the sound integration, but now that I've gotten output working I'm feeling a lot better about the project. Still need to be able to record sounds, and hopefully I can leverage some of Xavier Serra's work to do the DSP portion of the program.

For reference, the segfaults were caused by 1) using the wrong datatype in my audio callback function (double instead of float - that's what I get for copy-pasting code from an earlier assignment) and 2) not getting the widget member of the QLayoutItem() from the list I was iterating over. Simple mistakes but they cost me a lot of time. The program is rock solid now stability-wise, and hopefully I won't be under so much time pressure when the next milestone comes around.
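
The first mistake is easy to make because an audio callback receives its output buffer as an untyped `void*`, so the compiler cannot check the cast. The sketch below shows the general shape of the problem without depending on RtAudio; `fillSine`, `SineState`, and the `Sample` alias are hypothetical names, and the real callback is not shown in this log.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Sample type that must agree with the format requested when the stream
// was opened (assumed here to be 32-bit float).
using Sample = float;

struct SineState {
    double phase = 0.0;      // radians
    double increment = 0.0;  // radians advanced per frame
};

// Fills an untyped, interleaved output buffer the way an audio callback would.
// Casting `out` to double* while the device expects float would write twice
// as many bytes as the buffer holds and scramble the channels.
inline int fillSine(void* out, unsigned int nFrames, unsigned int channels,
                    SineState& s) {
    assert(out != nullptr);
    Sample* buffer = static_cast<Sample*>(out);
    for (unsigned int f = 0; f < nFrames; ++f) {
        Sample v = static_cast<Sample>(std::sin(s.phase));
        for (unsigned int c = 0; c < channels; ++c) {
            buffer[static_cast<std::size_t>(f) * channels + c] = v;
        }
        s.phase += s.increment;
    }
    return 0;
}
```

Keeping the sample type in one alias means the cast and the stream format only have to be changed in one place if the format ever changes.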

2010-11-15

Finally got a solid chunk of time to work on this.

The new track and close track buttons work as expected, the matrix view responds to mouse events (add notes with left click and remove them with right click; only one note per column; no sustained notes yet but that can come later). The minimum size of the matrix has been increased. And there is now a timer that sends out beat and stop messages, although there's no audible or visible feedback for that right now.
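
The "one note per column" rule suggests a very small data model behind the widget: one row index per timestep. The sketch below is an assumption about how that could be stored, not the project's actual code; `NoteGrid` and its methods are hypothetical, and the 32 by 24 size comes from the 2010-11-11 entry.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumSteps = 32;  // timesteps per track
constexpr int kNumRows = 24;           // pitches per track
constexpr int kEmpty = -1;             // marker for "no note in this column"

// Hypothetical model for one track's matrix: at most one note per column.
class NoteGrid {
public:
    NoteGrid() { notes_.fill(kEmpty); }

    // Left click: place a note, replacing any note already in that column.
    void add(std::size_t step, int row) {
        assert(step < kNumSteps && row >= 0 && row < kNumRows);
        notes_[step] = row;
    }

    // Right click: remove the note, but only if the click landed on it
    // (an assumption; the log does not say how misses are handled).
    void remove(std::size_t step, int row) {
        assert(step < kNumSteps);
        if (notes_[step] == row) notes_[step] = kEmpty;
    }

    // Row of the note at this step, or kEmpty.
    int noteAt(std::size_t step) const {
        assert(step < kNumSteps);
        return notes_[step];
    }

private:
    std::array<int, kNumSteps> notes_;
};
```

Storing a single integer per column enforces the rule by construction, and the beat timer only needs `noteAt(step)` to decide what to trigger.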

Our first milestone is looming on Wednesday so I may end up just making this a sine wave sequencer for that. There are still some GUI enhancements to be made but they can probably wait. I'm also going to put off saving, loading and exporting until later.

So the top priorities for tomorrow are:

Get sound output
Make tempo adjustable
Start working on microphone input

2010-11-11

Read through this tutorial and got acquainted with Qt's Meta Object Compiler. I'm much more confident now that the custom matrix widget will work out just fine. I didn't hit my initial milestone of having the minimal system up by yesterday, but the class milestone is Monday [EDIT - actually it's Wednesday] and I should be able to do it by then.

Even though the GUI isn't done, I'm starting to think more about sound design and techniques as the next big challenge. But I should finish a minimal GUI first.

Drawing a grid turned out to be pretty simple (of course, making it beautiful will be more difficult but that will come last). Right now I have 32 steps per track and 24 notes. Sean Coffin suggested having a transpose feature or making the range scrollable to allow for basslines or higher pitched lines. I'll have to think about it; I want to keep the interface simple but two octaves is a bit limiting. There are also some obvious problems with the size of the grid elements on the edges but those should be simple to work out. The next big thing in the UI is making the matrix editor respond to mouse input.

2010-11-10

I got all the easy parts of the GUI mocked up. Making the custom widget for the sequencing area will take some doing and I'm a little nervous about that. Making sure the widgets are destroyed properly takes some care; I was getting problems with double frees until I started handling things correctly.

Because spectral coding has some roots in telephone coding, and since the button arrangements right now strongly suggest telephones (to me, anyway), I'm wondering if some sort of telephone metaphor is appropriate for the project. Until then, codename 'sequins' it is.

I guess that the "right" way to make a Qt program is with Qt Designer and re-running their makefile generator when changing things, but I just grabbed the necessary make flags and am doing everything in code. We'll see how that works out.

2010-11-06

Started project. It doesn't look like we have the Qt development headers installed so I will have to find a way to work around that.

I looked into it a bit more and found that I need to run qmake-qt4 to generate a makefile with some options. I've only used Qt with Python before so I didn't know about this requirement when using C++. I was able to compile and run a hello world GUI program. Everything for the GUI should be simple to make except for perhaps the transport line and the matrix editor for the notes. I hope to have a simple GUI with all the controls except for those done shortly.

2010-11-01

Gave presentation of project proposal in class.