= Mailing Lists =

CCRMA has several mailing lists. See [http://www.stanford.edu/services/mailman Stanford Mailman] for more information on Stanford mailing lists.

==users and local-users==

These mailing lists are available only to those with a CCRMA account. '''local-users''' is a mailing list for currently active CCRMA folks, that is, those who have logged into a computer physically at CCRMA within the last 100 days. Posts to this list are fairly frequent. The '''users''' mailing list includes everyone with a CCRMA account (more than 1,300 recipients).

If you would like to stay on the '''local-users''' mailing list past the 100 days, you can create a file (which goes into your home directory):

<pre>~/.ccrma.conf</pre>

with the following contents:

<pre>localUsersListTimer=-1</pre>
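
A minimal way to create this file from a shell prompt on a CCRMA machine (assuming a Bourne-style shell such as bash):

<pre>echo 'localUsersListTimer=-1' > ~/.ccrma.conf</pre>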

[[Category:CCRMA User Guide]]

= Mac OS X Leopard Tips and Tricks =

== Jack Audio ==
The easiest approach is to install the precompiled binaries.

=== Jack Audio Connection Kit ===
The Jack Audio Connection Kit can be found at http://jackosx.com/

=== QJackCtl ===
[http://qjackctl.sourceforge.net/ QJackCtl] is highly recommended instead of JackPilot. It is somewhat harder to find as a prebuilt binary online; one can be found at http://ardour.org/osx_system_requirements

Note: The program starts, but cannot start jackd. In the meantime I am using JackPilot to start jackd and then QJackCtl to make connections. I tried to compile the source, but the build doesn't find the Qt frameworks. The bug seems to be that QJackCtl passes the option -n instead of -d for the device to the coreaudio driver.
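
Another workaround is to start jackd by hand from Terminal, passing the device with -d as the coreaudio driver expects, and then use QJackCtl only for the connections. A minimal sketch (the device name here is an assumption; substitute the name of your audio interface):

 jackd -d coreaudio -d "Built-in Audio"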

== Finder ==

=== Sort Folders Before Files ===
This only works when you sort by kind. Edit, as root, with emacs or any other text editor, the file:
 /System/Library/CoreServices/Finder.app/Contents/Resources/English.lproj/InfoPlist.strings

and change the line:
 /* General kind strings */
 "Folder" = "Folder";
to:
 "Folder" = "~Folder";
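
For example, from Terminal (using emacs as above; any editor run under sudo will do):

 sudo emacs /System/Library/CoreServices/Finder.app/Contents/Resources/English.lproj/InfoPlist.strings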

Restart OS X.

= Beijing <---> Stanford Planning Meeting (2008.2.4) =

== tentative plan so far ==
* what: networked + laptop orchestra telematic concert between Beijing and Stanford (full duplex)
* when: April 30th, 2008, 8pm PST, part of Jindong's PanAsia Music Festival
* where: Stanford: Dink; Beijing: TBD
* who:
** Stanford: Chris (celleto), Ge (laptop orchestra), Juan-Pablo (network?), Jindong (?)
*** NOTE (from Juan-Pablo): I would actually like to play the network in this concert, but I mean, TO PLAY IT. I can control live reverberation and echoic algorithms embedded in the network, plus other ideas that may come along. This may be especially interesting in the first and last pieces on the program.
** Beijing: Mrs. Fields (erhu?), TBD
* why: because it'd be sweet if we can pull it off
* how: TBD

== tentative program (networked portion) ==
* erhu + celleto improv jam
* Ives++: "unanswered question" (erhu + celleto + laptop orchestra)
* possible laptop orchestra piece for the local + Chinese audience (possibly by MSTs, possibly involving the Radio Baton)
* end piece: Pauline Oliveros's "Tuning Meditation"

== current concerns ==
* Beijing venue - where to do this? Some current candidates include:
** Central Conservatory of China, in the pavilion outside the main auditorium
** National Theatre, possibly in the atrium
** a landmark - moon/sun temple? scenic mountain park?
* Beijing onsite technical setup
** port/access to a node on the internet backbone
** machine (we might provide this)
** sound system
** video hardware (might require another machine?)
* video capture/transmission/playback
** what to use for these?
** who will investigate these tools (perhaps Deepak on the CCRMA side)
* Beijing event as a public concert (perhaps for students)

== next steps ==
* get in touch with Ken Fields (ge)
** invite his wife to play
* get in touch with Zhang Xiaofu
* get in touch with Mungo

* arrange a conference call (ge)
* gain access to a point of presence in Beijing for the internet backbone (perhaps ping Chris's friend on Internet2, the coordinator of fine arts?) (chris?)
* determine a point person, one in Beijing and one here
* figure out who can go back to China in March to set up tech and run tests (Juan-Pablo, Rob?)

= Soundwire-fall-2007 =

* class [http://ccrma.stanford.edu/groups/soundwire/course/ homepage]
* [[Soundwire-fall2007/People|list of CCRMA folks]]
* [[Soundwire-fall2007/Bios|bios]]
* [[Soundwire-fall2007/Equipment | list of equipment each person plans to bring]]

=== Next meeting (October 4) ===
We are going to form 4 instrumental groups with one analog mixer each. From each mixer, 2 channels will be sent to the main mixer, and those 8 channels will be streamed. We are looking for volunteers to be in charge of each group; if you want to do that, let us know.

Everyone who is using electronics or laptops is responsible for their own cables, i.e., you need to provide a 1/4-inch output from your setup (mono or stereo) to get into the mixer. Please also make sure that your signal isn't noisy.

'''Group leaders'''
* rob @ the 3 o'clock mixer! (for reference: the screen is 12 o'clock)
* ?
* ?
* ?


[[Category:Courses]]

= Soundwire-fall2007/Bios =

[http://ccrma.stanford.edu/~cc/ Chris Chafe] is a composer / cellist / music researcher with an interest in computer music composition and interactive performance. He has been a long-term denizen of the Center for Computer Research in Music and Acoustics at Stanford University, where he directs the center and teaches computer music courses. His doctorate in music composition was completed at Stanford in 1983, with prior degrees in music from the University of California at San Diego and Antioch College. Two year-long research periods were spent at IRCAM and the Banff Centre for the Arts, developing methods for computer sound synthesis based on physical models of musical instrument mechanics. A current project, "SoundWIRE", explores musical collaboration and network evaluation using high-speed internets for high-quality sound.


[http://ccrma.stanford.edu/~ge/ Ge Wang] received his B.S. in Computer Science from Duke University in 2000 and his PhD in Computer Science (adviser Perry Cook) from Princeton University in 2007 (hopefully!), and is currently an assistant professor at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University. His research interests include real-time software systems for computer music, programming languages, visualization, new performance ensembles (e.g., laptop orchestras) and paradigms (e.g., live coding), interfaces for human-computer interaction, and pedagogical methodologies at the intersection of computer science and computer music. Ge is the chief architect of the ChucK audio programming language and the Audicle environment. He is a founding developer and co-director of the Princeton Laptop Orchestra (PLOrk), and a co-creator of the TAPESTREA sound design environment. Ge composes and performs via various electro-acoustic and computer-mediated means.


Composer and guitarist [http://ccrma.stanford.edu/~rob/ Robert Hamilton] (b. 1973) is actively engaged in the composition of contemporary electroacoustic musics as well as the development of interactive musical systems for performance and composition. Mr. Hamilton holds degrees from Stanford University, Dartmouth College, and the Peabody Institute of The Johns Hopkins University, with additional studies at Le Centre de Création Musicale de Iannis Xenakis (CCMIX) and L'Ecole Normale de Musique de Paris with the EAMA. His compositions and published writings have been presented at the International Computer Music Conference (ICMC 2007, 2006, 2005), the newStage:CCRMA Festival, SEAMUS 2007 (Ames), NIME 2006 (Paris), the CCRMA Concert Series, the Sound in Media Workshop (Copenhagen), the SPARK Festival, the 3rd Practice Festival, ISMIR 2003, the Dartmouth Electric Rainbow Coalition Festival, and the Smithsonian Institution. Mr. Hamilton is currently pursuing his Ph.D. in Computer-based Music Theory and Acoustics at Stanford University's CCRMA, working with Chris Chafe. His research interests include novel platforms for electroacoustic composition and performance, the definition and implementation of flexible parameter spaces for interactive musical systems, and systems for real-time musical data exchange, translation, and notation display.


'''Juan-Pablo Caceres''' is a composer, performer, and engineer born in Santiago, Chile. He is currently a PhD student in computer music at CCRMA at Stanford University (USA). His work includes instrumental and electronic pieces, as well as performance of avant-garde rock music, with albums released in [http://www.lizardrecords.it/yonhosago.html Europe] and [http://www.innova.mu/artist1.asp?skuID=256 America]. Juan-Pablo's interests include Internet music and performance (he is an active member of the [http://ccrma.stanford.edu/groups/soundwire/ SoundWIRE project]), virtual acoustic spaces, popular experimental music, and boundary-pushing computer music (in both directions).

= Soundwire-fall2007/People =

=== CCRMA folks (and soundwire instruments) ===
* [http://ccrma.stanford.edu/~cc/ Chris Chafe] (electric cello)
* [http://ccrma.stanford.edu/~ge/ Ge Wang] (laptop, code)
* Kyle Spratt (laptop)
* Nick Bryan (laptop, clarinet)
* Turner Kirk (laptop, bagpipe, good times)
* Hayden Bursk (laptop, guitar, drums)
* Cobi van Tonder (laptop, voice, mic/objects)
* Gina (Yiqing Gu) (laptop, flute, piano)
* Elise MacMillan (violin, electric/acoustic)
* Diana Siwiak (flute, piano, voice)
* Baek Chang (laptop, guitar, piano)
* Dennis (laptop, guitar, piano)
* Max Citron (laptop, guitar, drums, synth)
* Luke Dahl (melodica + synth)
* [http://ccrma.stanford.edu/~rob Rob Hamilton] (guitar)
* Juan-Pablo Caceres (synth: NL3)
* Tania Marques (piano, max/msp?)
* Hiroko Terasawa (voice, video, photo)
* Adnan Marques-B (adnanm@stanford.edu)
* Joel Darnaver (bass, guitar, rocks banged together)
* Chris Warren (8-string bass, guitar, feedback)
* Jeff Cooper (?)

= Soundwire-fall2007/Equipment =

=== What each person is bringing, exactly ===

Please indicate (as '''precisely''' as possible) what gear you'll be bringing to class. Thanks!

* Chris C:
* Ge: laptop (need AC), stereo 1/8" -> RCA cable.
* Kyle:
* Nick:
* Turner:
* Hayden:
* Cobi:
* Gina:
* Elise:
* Diana:
* Baek:
* Dennis:
* Max:
* Luke:
* Rob: electric guitar -> 1/4" -> direct-box EQ -> 1/4"
* Juan-Pablo: synth (NL3), 2 channels
* Tania:
* Hiroko:
* Adnan:
* Joel:
* Chris W:
* Jeff:


[[Category:Courses]]

= Using Subversion =

== What is Subversion? ==

Subversion ("svn") is a version control system; it manages successive revisions of files, keeping track of the latest version of each file, which versions of other files are associated with particular versions of a file, etc. It can be used to:

* Get a copy of the most recent version of a file or directory
* Get a copy of a set of files as of a given date, a given version of one particular file, etc.
* Compare the "working" version of a file to the most recent version
* Allow multiple people to work on files at the same time
* Etc., etc.

Comprehensive svn documentation: http://svnbook.red-bean.com

Subversion website: http://subversion.tigris.org

The main terms you need to understand are "repository," "revision," and "working copy". The repository is a central store of all the versions of all the files that are under Subversion's control. A "revision" of the repository is an integer that uniquely identifies the states of all of the files in the repository. A working copy (aka "sandbox" or "working directory") is a local copy of one particular version of one or more files, which you typically edit, test, debug, etc., and then "commit" to Subversion, thereby creating a new revision. If things get complicated, with people working on two or more versions of the same files, you'll want to think about using branches.
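
A branch is just a cheap server-side copy made with "svn copy". A minimal sketch of creating one (this assumes the conventional trunk/branches layout; the URL and branch name are made up for illustration):

 svn copy file:///user/x/xyz/svn-repository/trunk \
          file:///user/x/xyz/svn-repository/branches/my-experiment \
          -m "Create a branch for experimental work"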

== Getting and Installing Subversion ==

It all starts here: http://subversion.tigris.org/project_packages.html

=== Linux (Planet CCRMA) ===

The command-line Subversion program is part of the Planet CCRMA distribution, so you don't have to do anything special to use it.

"KSvn is a Konqueror-integrated frontend to the subversion (svn) revision control system for KDE." http://sourceforge.net/projects/ksvn

=== OSX ===

I use the command-line Subversion tools on OSX, which came as pre-built and easy-to-install binaries: http://metissian.com/projects/macosx/subversion

There seem to be a few OSX graphical clients for Subversion, all in buggy beta stages:

* http://www.nikwest.de/Software/#SvenOverview
* http://scplugin.tigris.org
* http://www.lachoseinteractive.net/en/community/subversion/svnx/features

There are also Subversion plug-ins for BBEdit and other text editors, as well as graphical front-ends.

Some people at CNMAT interact with svn through the text editor called BBEdit:

"Basically it works seamlessly, but you have to install a preference panel called sshlogin." You get the preference panel from http://www.bebits.com/app/746 and then you have to check things out using the command-line tools, but then there's a Subversion menu (with an icon that looks like a big stylized letter "S" tilted 45 degrees to the left) in BBEdit that lets you do everything you need to do directly from BBEdit.

It seems to be good for anything involving individual files (even files such as Max patches that BBEdit can't open), and it has a nice graphical way to show you the differences between two text files side by side, but it's terrible at things involving directories. So, for example, BBEdit isn't good at changing the name of something that's already been checked into the repository ("svn rename"), or for adding new directories to the repository ("svn mkdir").

The command-line tools are installed into /usr/local/bin, which isn't in people's path by default when they open an OSX terminal window. After installing Subversion, open a Terminal window and type "svn help". If it prints help, you're fine. Otherwise, type "/usr/local/bin/svn help". If this works, you need to add /usr/local/bin to your path. Otherwise, either you didn't really install Subversion or you accidentally installed it somewhere weird.
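
To add /usr/local/bin to your path, a minimal sketch (pick the line that matches your shell; the startup file names assume the usual defaults):

 # bash: put this in ~/.bashrc or ~/.bash_profile
 export PATH="$PATH:/usr/local/bin"
 # tcsh/csh: put this in ~/.cshrc
 set path = ($path /usr/local/bin)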

Information on using Subversion with Xcode can be found in the following:

* Getting Control with Subversion and Xcode
* Using Subversion

=== Windows ===

I installed Subversion (from http://subversion.tigris.org/project_packages.html), plus the GUI front-end TortoiseSVN (http://tortoisesvn.tigris.org).

It didn't work because it couldn't find ssh. Here are my notes on how I fixed this (on CNMAT's Windows machine):

Command-line ssh is here:
 C:\Program Files\SSH Communications Security\SSH Secure Shell\ssh2.exe

So you must add that directory to your path:

 Start Menu
 right-click "My Computer"
 Choose "Properties"
 "Advanced" tab
 "Environment Variables" button
 In "System variables", scroll down to "Path"
 Select and press "Edit"

Then you have to tell Subversion that "ssh" is named "ssh2":
 cd C:\Documents and Settings\Matt Wright\Application Data\Subversion

Change config so that "[tunnels]" isn't commented out, and then underneath that, so it says
 ssh = ssh2
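
The relevant part of the config file ends up looking like this (a minimal excerpt; the rest of the file can stay as it is):

 [tunnels]
 ssh = ssh2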

=== Configuring Subversion ===

At some point Subversion will automatically open a text editor to let you type in some important information. If you would like to choose which text editor it will use (instead of the default "vi"), put something like this in your .cshrc file or the equivalent:

 setenv VISUAL emacs
 setenv EDITOR emacs

If you use Bash, put something like this in your .bashrc file:

 export VISUAL='emacs'
 export EDITOR='emacs'

There's only one way that I've ever changed Subversion's configuration:

Subversion can be set to "ignore" certain files, which basically means not to try to check them into the repository if they appear in a folder. You mainly need this for files that are created by the build process, which don't belong in the repository. By default Subversion comes configured with a pretty conservative list of these files; I've added quite a few. This configuration goes into the file config inside the "hidden" .subversion subdirectory of your home directory (at least on OSX and Linux). Here's what the appropriate line in ~matt/.subversion/config looks like (note that it's logically a single long line, continued with backslashes):

 global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* \
 .DS_Store build-mac build-classic *Data *.macyucky version.h \
 *.sit svn-commit* *.mode1 *.pbxuser build

== Making A Repository ==

Subversion has a client/server architecture in which the repository (all the versions of all the files checked into svn) lives on a server, and people access the files via clients. I recommend that you use one of the CCRMA Linux machines as your server. (Don't use ccrma-gate, because it's running a bad old version of Linux that doesn't work with Subversion.)

You have to use an "FSFS" repository at CCRMA (because your home directory is NFS-mounted). Decide where you want the repository to live. If it's going to be enormous because you want to check in zillions of megabytes of media, you should put it on the (not-backed-up) snd disk. Otherwise, you should put it in your home directory somewhere. Here's the magic incantation to create an SVN repository in the directory named "svn-repository" underneath your home directory:

 svnadmin create --fs-type fsfs ~/svn-repository

This creates an empty repository of the given name. The next steps are to check it out, copy files into it, 'add' those files to the repository, and then 'commit'. These steps are summarized below. If you want to read more, here's the whole story: http://svnbook.red-bean.com/en/1.1/ch05s02.html
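
Put together, the whole first-time cycle looks something like this (the file and directory names here are made up for illustration):

 svnadmin create --fs-type fsfs ~/svn-repository
 svn checkout file:///user/x/xyz/svn-repository ~/my-working-copy
 cd ~/my-working-copy
 cp ~/elsewhere/notes.txt .
 svn add notes.txt
 svn commit -m "Initial import of notes.txt"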

== Checking out a Working Copy ==

You'll do all your work (i.e., editing, compiling, testing, debugging) in a working copy. First decide where you want your working copy to be, and go there. For example, suppose you want to work in classes/220c:

 cd classes/220c

You make the working copy with "svn checkout". Obviously, you have to tell the "svn checkout" command where to find the repository from which you're checking out the working copy.

* If you're on a CCRMA machine, the "path" to your repository will be something like this:
*: '''file:///user/x/xyz/svn-repository'''
* If you're on a non-CCRMA machine, the "path" will be something like this:
*: '''svn+ssh://cmn18.stanford.edu/user/x/xyz/svn-repository'''

Note that you have to type in the full path to your Subversion repository, which in this example is the svn-repository directory in Xyz's account.

Your working copy can contain any of the following:

# The entire contents of your repository (good if you only have one project under version control, or if you want to look at everything you've been working on):
#: svn co path_to_your_repository
# Just a subdirectory (if you want to work on only one project in this particular subdirectory):
#: svn co path_to_your_repository/name/of/sub/directory
# Just the top-level folder of your repository without any of the subfolders (if you want to add a new top-level directory to the repository without having to check out any of the existing other directories):
#: svn co -N path_to_your_repository
#: The "-N" option is for "non-recursive"; it avoids checking out the subdirectories of a given directory.
# Any complicated subset of the files and directories in your repository. (You're on your own for how to ask for this.)

== Adding a Directory to a Working Copy ==

Go to your working copy and use "svn mkdir" to add the new directory:

 cd wherever/my/working/copy/is
 svn mkdir name-of-my-new-directory

The "svn mkdir" command will make an actual directory in your working copy, but it will also keep track of the fact that you intend for this directory to live in the repository. (Just saying regular Unix "mkdir" would do the former but not the latter.)

Now you must "commit" your new empty directory, so that it will truly go into the repository:

 svn commit

It will bring up a text editor so you can type in a "log message", which will probably be something profound like "initial empty directory for such and such a project". If you weren't following the instructions above and didn't set the shell variable VISUAL to your favorite text editor, you'll probably be happy to know that you can type ":q" (followed by the return key) to quit vi.

== Adding Files and Subdirectories to the Repository ==

The basic strategy is that you start with a working directory, add the files to it, and then commit.

For example, continuing the examples above, assume you have a working directory in /zap/svn-repository/name-of-my-new-directory:

 cd wherever/my/working/copy/is/name-of-my-new-directory
 touch foo.c
 svn add foo.c
 svn commit

In this example I used "touch" to create an empty file; in a real situation you'd actually copy an existing file into the directory or make it in a text editor or some other program.

The "svn add" lets Subversion know that you intend for the new file to live in the repository. It's somewhat misleading that "add" doesn't actually add the files to the repository; it just "schedules" them to be added the next time you "commit".

== How to Avoid Typing Your Password All The Time ==

This section is cryptic but better than nothing!

 man ssh-keygen

Make your setup look like this (a key pair on your local machine, and your public key authorized on the remote one):

 matt% ls ~/.ssh
 id_rsa  id_rsa.pub  known_hosts
 matt% ssh cmn69
 [Feldman:~] matt% ls .ssh/
 authorized_keys  known_hosts

This link has step-by-step instructions on how to do this:
[http://linuxhelp.blogspot.com/2005/09/how-to-setup-ssh-keys-and-why.html how-to-setup-ssh-keys-and-why]
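
In short, the steps are something like the following sketch (cmn69 is just the example host from above; answer ssh-keygen's prompts as you see fit):

 ssh-keygen -t rsa
 ssh cmn69 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub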

== Viewing Log History ==

You can view the log history by typing

 svn log

== Basic Work Cycle ==

Here's some great documentation on the basic Subversion Work Cycle: http://svnbook.red-bean.com/en/1.1/ch03s05.html

To summarize this summary:

# You must have a working copy. (svn checkout)
# Your working copy may be out of date because somebody else has checked in improvements since you checked yours out. Fix this. (svn update)
# Make changes to your files.
## If you need to change file names or directory structure, do it through Subversion (svn add, svn delete, svn mkdir, svn rename, svn copy, etc.).
# Examine your changes:
## Which files have I changed (or have changed out from under me)? (svn status)
## What changes have I made? (svn diff)
# Commit your changes (svn commit). Please write a descriptive log message when you commit any file(s) to svn.

You almost never need to refer to the server in a typed svn command, because working copies remember where they came from. So when you say svn update, svn status, svn diff, svn commit, etc., svn automatically interprets these to mean "with respect to the working copy in the current directory and the repository it came from", and so you don't have to mention the server.

The only time you need to refer to the server is when you check out a working copy in the first place (where obviously you need to tell your svn client which server to get the files from):

 svn checkout svn+ssh://cmn17.stanford.edu/user/x/xyz/svn-repository

The svn log command gives you the entire history of the given file, including the revision numbers (of the entire repository) for each different version of the file. If you want to compare two versions of a file from (for example) revisions 111 and 234, you can say

 svn diff -r111:234 filename

== Best Practices ==

Never store "derived" files in version control. Check in only source files, makefiles, etc.; the results of the build process (.o files, executable files, etc.) should not be under version control.

Check in often. Any time you make any improvement to any code, check it back in ASAP. If it's untested or has any other potential problems, you can just make a note of them in the comment when you check it in.

More advice:

* http://www.geocities.com/vivekv/cvs-bestpractices
* http://www.cmcrossroads.com/bradapp/acme/

== Troubleshooting ==

=== Last Resort ===

When things really get confused, like when you don't understand SVN's error messages, this always works for me (a concrete sketch follows the list):

# Manually move any files that you've modified somewhere safe, i.e., outside your working copy.
# Delete the troublesome parts of your working copy (files, directories, or the whole thing).
# Do "svn update" to restore your working copy with the current version from the repository.
# Look at the result to see what actually was committed.
# If necessary, copy your changed files back into your shiny clean new working copy and submit them again.

If things get completely hosed, you can always check out an all-new working copy and then copy in any un-checked-in files that you modified.
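
Here is the recipe above as concrete commands (every name here is hypothetical):

 mv my-working-copy/file-i-changed.c /tmp/     # 1. stash your modified file somewhere safe
 rm -rf my-working-copy/troublesome-subdir     # 2. delete the confused part of the working copy
 cd my-working-copy && svn update              # 3. restore it from the repository
 cp /tmp/file-i-changed.c troublesome-subdir/  # 4-5. after inspecting what was committed, copy your changes back and commit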

== Example Repository Setup ==

Here is the sequence of steps that was taken to set up the source control repository for the REALSIMPLE CCRMA Research Group:

 cd /usr/ccrma/group/realsimple
 svnadmin create --fs-type fsfs svn
 svn mkdir file:///usr/ccrma/group/realsimple/svn/doc -m 'Create REALSIMPLE doc directory'
 svn import /usr/ccrma/group/realsimple/doc file:///usr/ccrma/group/realsimple/svn/doc/trunk \
     -m 'Initial import of REALSIMPLE doc project'
 mv doc doc-old-CAN-DELETE
 svn checkout file:///usr/ccrma/group/realsimple/svn/doc/trunk /usr/ccrma/group/realsimple/doc
 cd doc
 svn propset svn:keywords 'URL Author Date Rev' *.tex Make* <etc.>
 <cd <subdirs> and repeat.>
 svn commit -m "Enable keywords URL Author Date Rev on all Make and .tex files, etc."

Say 'svn help propset', etc., to learn about the various svn commands.

To "check out" your own copy of the REALSIMPLE source tree at CCRMA, you should be able to say, from anywhere in the world,

 svn checkout svn+ssh://username@ccrma-gate.stanford.edu/usr/ccrma/group/realsimple/svn/doc/trunk ~/realsimple/doc

for example. This checks out a working copy of the doc subdirectory of the REALSIMPLE svn repository, and places it in the realsimple subdirectory of your home directory (which should already exist).

If you make changes to your local copy, you can upload them via

 cd ~/realsimple/doc
 svn commit -m 'My changes had to do with ...'

However, to include newly created files in your working copy, you must first use 'svn add filename' to add them to the project, as mentioned above. As also described above, 'svn mkdir dirname' will make a new directory both in your working copy and in the repository, 'svn delete filename' will delete from both, and so on.

To update your local copy with new changes by others, say

 svn update

and so on.

== How to Change or Switch the Location of Your Repository ==

Use

 svn switch <NEW_URL>

See [http://svnbook.red-bean.com/en/1.1/ch04s05.html this svn document] for more.
cd ~/realsimple/doc<br />
svn commit -m 'My changes had to do with ..."<br />
<br />
However, to include newly created files in your working copy, you must first use 'svn add filename' to add them to the project, as mentioned above. As also described above, 'svn mkdir dirname' will make a new directory both in your working copy and in the repository, 'svn delete filename' will delete from both, and so on.<br />
<br />
To update your local copy with new changes by others, say<br />
<br />
svn update<br />
<br />
and so on.<br />
<br />
== How to Change or Switch the Location of Your Repository ==<br />
<br />
Use<br />
<br />
svn switch <NEW_URL><br />
<br />
See [http://svnbook.red-bean.com/en/1.1/ch04s05.html this svn document] for more.</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Using_Subversion&diff=2206Using Subversion2007-06-29T20:50:20Z<p>Jcaceres: /* Making A Repository */</p>
<hr />
<div>== What is Subversion? ==<br />
<br />
Subversion ("svn") is a version control system; it manages successive revisions of files, keeping track of the latest version of each file, which versions of other files are associated with particular versions of a file, etc. It can be used to:<br />
<br />
* Get a copy of the most recent version of a file or directory<br />
* Get a copy of a set of files as of a given date, a given version of one particular file, etc.<br />
* Compare the "working" version of a file to the most recent version<br />
* Allow multiple people to work on files at the same time<br />
* Etc., etc. <br />
<br />
Comprehensive svn documentation: http://svnbook.red-bean.com<br />
<br />
Subversion website: http://subversion.tigris.org<br />
<br />
The main terms you need to understand are "repository," "revision," and "working copy". The repository is a central store of all the versions of all the files that are under Subversion's control. A "revision" of the repository is an integer that uniquely identifies the states of all of the files in the repository. A working copy (aka "sandbox" or "working directory") is a local copy of one particular version of one or more files, which you typically edit, test, debug, etc., and then "commit" to Subversion, thereby creating a new revision. If things get complicated, with people working on two or more versions of the same files, you'll want to think about using branches.<br />
<br />
== Getting and Installing Subversion ==<br />
<br />
It all starts here: http://subversion.tigris.org/project_packages.html<br />
<br />
=== Linux (Planet CCRMA) ===<br />
<br />
The command-line Subversion program is part of the Planet CCRMA distribution, so you don't have to do anything special to use it.<br />
<br />
"KSvn is a Konqueror-integrated frontend to the subversion (svn) revision control system for KDE." http://sourceforge.net/projects/ksvn<br />
<br />
=== OSX ===<br />
<br />
I use the command-line Subversion tools on OSX, which came as pre-built and easy-to-install binaries: http://metissian.com/projects/macosx/subversion<br />
<br />
There seem to be a few OSX graphical clients for Subversion, all in buggy beta stages:<br />
<br />
* http://www.nikwest.de/Software/#SvenOverview<br />
* http://scplugin.tigris.org<br />
* http://www.lachoseinteractive.net/en/community/subversion/svnx/features <br />
<br />
There are also Subversion plug-ins for BBEdit and other text editors, as well as graphical front-ends.<br />
<br />
Some people at CNMAT interact with svn through the text editor called BBEdit:<br />
<br />
"Basically it works seamlessly, but you have to install a prefernce panel called sshlogin". You get the preference panel from http://www.bebits.com/app/746 and then you have to check things out using the command-line tools, but then there's a Subversion menu (with an icon that looks like a big stylized letter "S" tilted 45 degrees to the left) in BBEdit that lets you do everything you need to do directly from BBEdit.<br />
<br />
It seems to be good for anything involving individual files (even files such as Max patches that BBEdit can't open), and it has a nice graphical way to show you the differences between two text files side by side, but it's terrible at things involving directories. So, for example, BBEdit isn't good at changing the name of something that's already been checked into the repository ("svn rename"), or at adding new directories to the repository ("svn mkdir").<br />
<br />
The command-line tools are installed into /usr/local/bin, which isn't in people's path by default when they open an OSX terminal window. After installing Subversion, open a Terminal window and type "svn help". If it prints help, you're fine. Otherwise, type "/usr/local/bin/svn help". If this works, you need to add /usr/local/bin to your path. Otherwise either you didn't really install Subversion or you accidentally installed it somewhere weird.<br />
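<br />
For example, for csh-style shells (which the configuration examples below also assume), you could add this line to your ~/.cshrc:<br />
<br />
set path = ($path /usr/local/bin)<br />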
<br />
Information on using Subversion with Xcode can be found in the following:<br />
<br />
* Getting Control with Subversion and Xcode<br />
* Using Subversion <br />
<br />
=== Windows ===<br />
<br />
I installed Subversion (from http://subversion.tigris.org/project_packages.html), plus the GUI front-end TortoiseSVN (http://tortoisesvn.tigris.org).<br />
<br />
It didn't work because it couldn't find ssh. Here are my notes on how I fixed this (on CNMAT's Windows machine):<br />
<br />
command-line ssh is here:<br />
C:\Program Files\SSH Communications Security\SSH Secure Shell\ssh2.exe<br />
<br />
So you must add that directory to your path:<br />
<br />
# Open the Start Menu<br />
# Right-click "My Computer"<br />
# Choose "Properties"<br />
# Go to the "Advanced" tab<br />
# Press the "Environment Variables" button<br />
# In "System variables", scroll down to "Path"<br />
# Select it and press "Edit"<br />
<br />
Then you have to tell Subversion that "ssh" is named "ssh2":<br />
cd C:\Documents and Settings\Matt Wright\Application Data\Subversion<br />
<br />
Edit the file named config so that "[tunnels]" isn't commented out, and add the setting underneath it, so it reads:<br />
<br />
[tunnels]<br />
ssh = ssh2<br />
<br />
=== Configuring Subversion ===<br />
<br />
At some point Subversion will automatically open a text editor to let you type in some important information. If you would like to choose which text editor it will use (instead of the default "vi"), put something like this in your .cshrc file or the equivalent:<br />
<br />
setenv VISUAL emacs<br />
setenv EDITOR emacs<br />
<br />
There's only one way that I've ever changed subversion's configuration:<br />
<br />
Subversion can be set to "ignore" certain files, which basically means not to try to check them into the repository if they appear in a folder. You mainly need this for files that are created by the build process, which don't belong in the repository. By default Subversion comes configured with a pretty conservative list of these files; I've added quite a few. This configuration goes into the file config inside the "hidden" .subversion subdirectory of your home directory (at least on OSX and Linux). Here's what the appropriate line in ~matt/.subversion/config looks like (note that it's a single long line):<br />
<br />
global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* \<br />
.DS_Store build-mac build-classic *Data *.macyucky version.h \<br />
*.sit svn-commit* *.mode1 *.pbxuser build<br />
<br />
== Making A Repository ==<br />
<br />
Subversion has a client/server architecture in which the repository (all the versions of all the files checked into svn) lives on a server, and people access the files via clients. I recommend that you use one of the CCRMA Linux machines as your server. (Don't use ccrma-gate, because it's running a bad old version of Linux that doesn't work with subversion.)<br />
<br />
You have to use an "FSFS" repository at CCRMA (because your home directory is NFS-mounted). Decide where you want the repository to live. If it's going to be enormous because you want to check in zillions of megabytes of media, you should put it on the (not-backed-up) snd disk. Otherwise, you should put it in your home directory somewhere. Here's the magic incantation to create an SVN repository in the directory named "svn-repository" underneath your home directory:<br />
<br />
svnadmin create --fs-type fsfs ~/svn-repository<br />
<br />
This creates an empty repository of the given name. The next steps are to check it out, copy files into it, 'add' those files to the repository, and then 'commit'. These steps are summarized below. If you want to read more, here's the whole story: http://svnbook.red-bean.com/en/1.1/ch05s02.html<br />
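<br />
To make that concrete, here's a minimal end-to-end sketch (the paths and filenames are hypothetical; the detailed sections below explain each step):<br />
<br />
svnadmin create --fs-type fsfs ~/svn-repository<br />
svn checkout file:///user/x/xyz/svn-repository ~/my-working-copy<br />
cd ~/my-working-copy<br />
cp ~/old-project/*.c .<br />
svn add *.c<br />
svn commit -m 'Initial import of my project'<br />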
<br />
== Checking out a Working Copy ==<br />
<br />
You'll do all your work (i.e., editing, compiling, testing, debugging) in a working copy. First decide where you want your working copy to be, and go there. For example, suppose you want to work in classes/220c:<br />
<br />
cd classes/220c<br />
<br />
You make the working copy with "svn checkout". Obviously, you have to tell the "svn checkout" command where to find the repository from which you're checking out the working copy.<br />
<br />
* If you're on a CCRMA machine the "path" to your repository will be something like this:<br />
*: '''file:///user/x/xyz/svn-repository'''<br />
* If you're on a non-CCRMA machine, the "path" will be something like this:<br />
*: '''svn+ssh://cmn18.stanford.edu/user/x/xyz/svn-repository''' <br />
<br />
Note that you have to type in the full path to your Subversion repository, which in this example is the svn-repository directory in Xyz's account.<br />
<br />
Your working copy can contain any of the following (concrete examples appear after this list):<br />
<br />
# The entire contents of your repository (good if you only have one project under version control, or if you want to look at everything you've been working on):<br />
#: svn co path_to_your_repository<br />
# Just a subdirectory (if you want to work on only one project in this particular subdirectory):<br />
#: svn co path_to_your_repository/name/of/sub/directory<br />
# Just the top-level folder of your repository without any of the subfolders (if you want to add a new top-level directory to the repository without having to check out any of the existing other directories):<br />
#: svn co -N path_to_your_repository<br />
#: The "-N" option is for "non-recursive"; it avoids checking out the subdirectories of a given directory.<br />
# Any complicated subset of the files and directories in your repository. (You're on your own for how to ask for this.)<br />
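<br />
For instance, using the CCRMA-style path from above (the subdirectory name here is just a hypothetical example):<br />
<br />
svn co file:///user/x/xyz/svn-repository<br />
svn co file:///user/x/xyz/svn-repository/my-project<br />
svn co -N file:///user/x/xyz/svn-repository<br />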
<br />
== Adding a Directory to a Working Copy ==<br />
<br />
Go to your working copy and use "svn mkdir" to add the new directory:<br />
<br />
cd wherever/my/working/copy/is<br />
svn mkdir name-of-my-new-directory<br />
<br />
The "svn mkdir" command will make an actual directory in your working copy, but it will also keep track of the fact that you intend for this directory to live in the repository. (Just saying regular Unix "mkdir" would do the former but not the latter.)<br />
<br />
Now you must "commit" your new empty directory, so that it will truly go into the repository:<br />
<br />
svn commit<br />
<br />
It will bring up a text editor so you can type in a "log message", which will probably be something profound like "initial empty directory for such and such a project". If you weren't following instructions above and didn't set the shell variable VISUAL to your favorite text editor, you'll probably be happy to know that you can type ":q" (followed by the return key) to quit vi.<br />
<br />
== Adding Files and Subdirectories to the Repository ==<br />
<br />
<br />
The basic strategy is that you start with a working directory, add the files to it, and then commit.<br />
<br />
For example, continuing the examples above, assume your working copy contains the new directory name-of-my-new-directory:<br />
<br />
cd wherever/my/working/copy/is/name-of-my-new-directory<br />
touch foo.c<br />
svn add foo.c<br />
svn commit<br />
<br />
In this example I used "touch" to create an empty file; in a real situation you'd actually copy an existing file into the directory or make it in a text editor or some other program.<br />
<br />
The"svn add" lets subversion know that you intend for the new file to live in the repository. It's somewhat misleading that "add" doesn't actually add the files to the repository, it just "schedules" them to be added the next time you "commit".<br />
<br />
== How to Avoid Typing Your Password All The Time ==<br />
<br />
This section is cryptic but better than nothing!<br />
<br />
man ssh-keygen<br />
<br />
Generate a key pair and copy the public key over, so that the two machines look like this:<br />
<br />
matt% ls ~/.ssh<br />
id_rsa id_rsa.pub known_hosts<br />
matt% ssh cmn69<br />
[Feldman:~] matt% ls .ssh/<br />
<br />
authorized_keys known_hosts<br />
<br />
This link has step-by-step instructions on how to do this:<br />
[http://linuxhelp.blogspot.com/2005/09/how-to-setup-ssh-keys-and-why.html how-to-setup-ssh-keys-and-why]<br />
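<br />
As a minimal sketch, assuming OpenSSH and that the remote ~/.ssh directory already exists (the hostname is from the transcript above; accept the defaults when prompted, and note that an empty passphrase means no typing at some cost in security):<br />
<br />
ssh-keygen -t rsa<br />
cat ~/.ssh/id_rsa.pub | ssh cmn69 'cat >> ~/.ssh/authorized_keys'<br />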
<br />
== Viewing Log History ==<br />
<br />
You can view the log history by typing <br />
<br />
svn log<br />
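<br />
You can also restrict the log to a single file or a range of revisions, for example (the filename is hypothetical):<br />
<br />
svn log foo.c<br />
svn log -r 100:200<br />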
<br />
== Basic Work Cycle ==<br />
<br />
Here's some great documentation on the basic Subversion Work Cycle: http://svnbook.red-bean.com/en/1.1/ch03s05.html<br />
<br />
To summarize this summary (a worked example appears after the list),<br />
<br />
# You must have a working copy. (svn checkout)<br />
# Your working copy may be out-of-date because somebody else has checked in improvements since you checked yours out. Fix this (svn update).<br />
# Make changes to your files.<br />
## If you need to change file names or directory structure, do it through Subversion (svn add, svn delete, svn mkdir, svn rename, svn copy, etc.) <br />
# Examine your changes:<br />
## Which files have I changed (or have changed out from under me)? (svn status)<br />
## What changes have I made? (svn diff) <br />
# Commit your changes (svn commit). Please write a descriptive log message when you commit any file(s) to svn. <br />
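<br />
Putting it together, a typical session might look like this (the filename and log message are hypothetical):<br />
<br />
svn update<br />
(edit foo.c in your favorite editor)<br />
svn status<br />
svn diff<br />
svn commit -m 'Fix off-by-one error in foo.c'<br />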
<br />
You almost never need to refer to the server in a typed svn command, because working copies remember where they came from. So when you say svn update, svn status, svn diff, svn commit, etc., svn automatically interprets these to mean "with respect to the working copy in the current directory and the repository it came from" and so you don't have to mention the server.<br />
<br />
The only time you need to refer to the server is when you check out a working copy in the first place (where obviously you need to tell your svn client which server to get the files from):<br />
<br />
svn checkout svn+ssh://cmn17.stanford.edu/user/x/xyz/svn-repository <br />
<br />
The svn log command gives you the entire history of the given file, including the revision numbers (of the entire repository) for each different version of the file. If you want to compare two versions of a file from (for example) revisions 111 and 234, you can say<br />
<br />
svn diff -r111:234 filename<br />
<br />
== Best Practices ==<br />
<br />
Never store "derived" files in version control. Check in only source files, makefiles, etc.; the results of the build process (.o files, executable files, etc.) should not be under version control.<br />
<br />
Check in often. Any time you make any improvement to any code, check it back in ASAP. If it's untested or has any other potential problems, you can just make a note of them in the comment when you check it in.<br />
<br />
More advice:<br />
<br />
* http://www.geocities.com/vivekv/cvs-bestpractices<br />
* http://www.cmcrossroads.com/bradapp/acme/<br />
<br />
== Troubleshooting ==<br />
<br />
=== Last Resort ===<br />
<br />
When things really get confused, like when you don't understand SVN's error messages, this always works for me (a concrete sketch follows below):<br />
<br />
# Manually move any files that you've modified somewhere safe, i.e., outside your working copy<br />
# Delete the troublesome parts of your working copy (files, directories, or the whole thing)<br />
# Do "svn update" to restore your working copy with the current version from the repository<br />
# Look at the result to see what actually was committed<br />
# If necessary, copy your changed files back into your shiny clean new working copy and submit them again. <br />
<br />
If things get completely hosed, you can always check out an all new working copy and then copy in any un-checked-in files that you modified.<br />
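<br />
As a concrete sketch of the last-resort recipe (paths and filenames hypothetical):<br />
<br />
mv my-working-copy/foo.c /tmp/foo.c.saved<br />
rm -rf my-working-copy/troublesome-directory<br />
cd my-working-copy<br />
svn update<br />
diff /tmp/foo.c.saved foo.c<br />
cp /tmp/foo.c.saved foo.c<br />
svn commit -m 'Resubmit my changes to foo.c'<br />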
<br />
<br />
== Example Repository Setup ==<br />
<br />
Here is the sequence of steps that was taken to set up the source control repository for the REALSIMPLE CCRMA Research Group:<br />
<br />
cd /usr/ccrma/group/realsimple<br />
svnadmin create --fs-type fsfs svn<br />
svn mkdir file:///usr/ccrma/group/realsimple/svn/doc -m 'Create REALSIMPLE doc directory'<br />
svn import /usr/ccrma/group/realsimple/doc file:///usr/ccrma/group/realsimple/svn/doc/trunk \<br />
-m 'Initial import of REALSIMPLE doc project'<br />
mv doc doc-old-CAN-DELETE<br />
svn checkout file:///usr/ccrma/group/realsimple/svn/doc/trunk /usr/ccrma/group/realsimple/doc<br />
cd doc<br />
svn propset svn:keywords 'URL Author Date Rev' *.tex Make* <etc.><br />
<cd <subdirs> and repeat.><br />
svn commit -m "Enable keywords URL Author Date Rev on all Make and .tex files, etc."<br />
<br />
Say 'svn help propset', etc., to learn about the various svn commands.<br />
<br />
To "check out" your own copy of the REALSIMPLE source tree at CCRMA, you should be able to say, from anywhere in the world,<br />
<br />
svn checkout svn+ssh://username@ccrma-gate.stanford.edu/usr/ccrma/group/realsimple/svn/doc/trunk ~/realsimple/doc<br />
<br />
for example. This checks out a working copy of the doc subdirectory of the REALSIMPLE svn repository, and places it in the realsimple subdirectory of your home directory (which should already exist). <br />
<br />
If you make changes to your local copy, you can upload them via <br />
<br />
cd ~/realsimple/doc<br />
svn commit -m 'My changes had to do with ...'<br />
<br />
However, to include newly created files in your working copy, you must first use 'svn add filename' to add them to the project, as mentioned above. As also described above, 'svn mkdir dirname' will make a new directory both in your working copy and in the repository, 'svn delete filename' will delete from both, and so on.<br />
<br />
To update your local copy with new changes by others, say<br />
<br />
svn update<br />
<br />
and so on.<br />
<br />
== How to Change or Switch the Location of Your Repository ==<br />
<br />
Use<br />
<br />
svn switch <NEW_URL><br />
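<br />
For example, to point your working copy at a new branch URL within the same repository (a hypothetical URL):<br />
<br />
svn switch svn+ssh://cmn17.stanford.edu/user/x/xyz/svn-repository/branches/experiment<br />
<br />
If the repository itself has moved to a different server, the chapter linked below also covers "svn switch --relocate", which rewrites the URLs recorded in your working copy.<br />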
<br />
See [http://svnbook.red-bean.com/en/1.1/ch04s05.html this svn document] for more.</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1477Mass project2006-09-04T16:41:21Z<p>Jcaceres: /* September 04, 2006 */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Project Summary ==<br />
<br />
# Recording and diffusion methodologies - testing and implementation<br />
#* Comparison between PZM (4-channel) and Sound Field (4 directional)<br />
#* Decision taken for the Sound Field, because it gives a better spatial image.<br />
#* Diffusion in a semi-anechoic room, using a 4-channel setup.<br />
# Processing of office recordings<br />
#* Tokyo office recording of all the sounds necessary for calibration and for impulse response generation.<br />
#* Processing of impulse responses inside and outside the room<br />
#* Pink noise calibration recording<br />
#* Room calibration through a generalized equalization methodology, using omni microphone recordings in the real room and in the simulated room as comparison.<br />
# FM masker generation<br />
#* Exploration of different strategies to follow (sinusoidal versus random modulation)<br />
#* Critical band (ERB) width of noise bands.<br />
#* Decision taken for using 3 bands, with random walk modulation (way less annoying than sinusoidal modulation).<br />
# Experiment design and implementation<br />
#* Experiment 01<br />
#*: Beta experiment to test the system setup (C++ implementation) for a real-time experiment with automatic data retrieval. Also, fine-tuning of the experiment's psychoacoustic design.<br />
#* Experiment 02<br />
#*: Masker Refinement through a general-purpose process in which the best candidates are being selected while the worst are being discarded. This is achieved by varying one parameter at a time, moving to the next stage with the best candidate for that parameter, and then varying another parameter. For the FM masker, the parameters were Center Band Frequency (3 bands), Band Amplitude, Modulation Rate (for each band), and Amplitude of the Modulation (for each band).<br />
#* Experiment 03<br />
#*: Efficiency, uses the Santa Barbara corpus of conversation, in which, for 1 masker (that is on throughout the whole experiment), random parts of the conversation are presented and the subject is asked whether or not they are heard. The RMS of the random part is recorded, as well as the answer of the subject.<br />
#* Experiment 04<br />
#*: Annoyance, currently in the design process<br />
# Spatialization study<br />
#: Study of spatial variables in the directionality of the masker. Generation of a "virtual impulse response" in which the sound (masker) comes from outside the room (where the intruding sound is located) but the filtering effect of the wall is removed.<br />
<br />
== How to Set Up and Calibrate the Tascam 3200 Mixer ==<br />
<br />
Detailed instructions follow.<br />
<br />
System setup:<br />
# Start hdspmixer & hdspconf (all settings automatic).<br />
# In a terminal, type cd /usr/bin/ then cpufreq-selector -g performance (sets the CPU to maximum performance).<br />
# Open JACK:<br />
#* Set Frames/Period to 1024<br />
#* Set Sample Rate to 44100<br />
#* Set Interface to RME Hammerfall<br />
<br />
Mixer config:<br />
# Equalize levels and link channels:<br />
#* Under "SCREEN MODE/NUMERIC ENTRY" click "METER.FADER":<br />
#** Under tab "CH FADER" set gain levels "CH 1-18" equal<br />
#** Under tab "Master M/F" set bus levels "BUSS 1-16" equal<br />
#* Under "SCREEN MODE/NUMERIC ENTRY" click "ALT-LINK/GRP":<br />
#** Click "SEL" for channel 1 followed by 2, 3, and 4<br />
#** Double-click tab "GROUP ON/OFF"<br />
#** Click the down cursor to set the next grouping<br />
#** Click "SEL" for channel 5 followed by 6, 7, and 8<br />
# Set the speakers for surround sound:<br />
#* Click "SEL" for channel 1:<br />
#** Under "OUTPUT ASSIGN" select "1"<br />
#** Make sure "STEREO" and "DIRECT" are unchecked<br />
#* Click "SEL" for channel 2:<br />
#** Under "OUTPUT ASSIGN" select "3"<br />
#** Make sure "STEREO" and "DIRECT" are unchecked<br />
#* Repeat this process for the following combinations:<br />
#** CH1:1, CH2:3, CH3:5, CH4:7, CH5:13, CH6:14, CH7:15, CH8:16<br />
#** Channels 1-4 are head-level and channels 5-8 are above<br />
# Set up I/O (if it is already screwed up):<br />
#* Click "ALT-ROUTING" and click "INPUT":<br />
#** Set CH1 to adat-1, CH2 to adat-2, etc.<br />
#** If you want to set up a record line, do so by setting CH9 to M/L 9: set the top knob and switch to the appropriate setting, then use the CH9 fader to set the input level for the application<br />
#* Click "ALT-ROUTING" and click "OUTPUT SLOT" for the output cards:<br />
#** Slot A: set Trk1-8 to BUSS 1-8 in sequential order (horizontal)<br />
#** Slot B: set Trk1-8 to BUSS 9-16 in sequential order (vertical)<br />
<br />
Software config:<br />
# Go to the application under a Bash shell and type "m", then "make", then "go".<br />
# Play the voice recording and set levels to 25 dBA at center.<br />
# Play the masker noise and set levels to 45 dBA at center.<br />
# Go to "MAIN DIALOG" in the software app to set the subject ID and output directory, then repeat the "ALT-ROUTING"/"INPUT" setup above.<br />
<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] Contains a complete technical documentation of the masking noise generation. It also contains the soundfiles.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350 and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall is filtering much of the high-frequency components, so that's relevant in the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5 and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
<br />
*Verbal Instructions:<br />
This is a test where only 2 buttons are required: spacebar and return (enter). You are going to have 2 test runs in which you will be presented with speech. When you hear any speech, press spacebar immediately to signal us that you heard speech. At the end of each cycle a purple bar will light up to let you know the next cycle is ready. You will then press return (enter) to begin the next cycle. The first 2 trials are to get you used to pushing the buttons in response to speech; data recording starts at the beginning of the third trial, testing whether you heard speech. <br />
<br />
*Speech used in the experiment consisted of voices by Jason and Hiroko, with the intent of neutral stress on the vowels. The words chosen were one, two, three, four, eight, which were convolved with the impulse response from the Tokyo conference room and combined with recorded room noise.<br />
<br />
=== Post Experiment Subject Interviews ===<br />
*Phase01:<br />
This test had the most diversity in types of sounds. Since some maskers were not efficient, subjects learned the rhythm of the speech presented. Subjects clearly described how some sounds worked better at masking than others, since they had an idea of how many sounds were coming, and at what rate, for each masker. Subjects enjoyed this test because differences between maskers were clear. <br />
*Phase02:<br />
Out of the bunch of 27 maskers we picked 2 candidates for our "golden masker." For this test we changed the amplitude of different center frequencies for these 2 maskers, which gave very different sounds throughout the test. Some subjects found certain sounds noticeably harsher and more annoying to listen to than others. Several subjects noted that one masker worked really well at masking and sounded like being on an airplane. Subjects still enjoyed this test because differences between maskers were clear.<br />
*Phase03<br />
At this point we chose 1 masker and used different frequencies of modulation. Most subjects described the sound as droning, meaning that it entranced or hypnotized them. This had an effect on most subjects, who described the latter half of the test as more difficult to concentrate on. Some subjects claimed to almost fall asleep, making it difficult to give consistent answers. As I administered the test, I noticed the sleepy feeling every single time, so I started leaving the room during the test. Subjects said that they could hear the female voice very clearly when they would hit spacebar (although they would miss more female speech overall). For the male voice that came through, they would listen for the deep male voice that sounded like short spurts of "wha" and "woo." For the most part, the subjects I observed were hitting spacebar when there was speech and not hitting spacebar when they did not hear it, as expected. <br />
*Phase04<br />
Most subjects described the sound as droning, meaning that it entranced or hypnotized them as well. This made sense, since we kept the same basic sounds but changed the frequency modulation amplitude. The main difference in this test, as I observed subjects, is that they would push spacebar repeatedly when no sound was presented. This seems to be due to the fact that the masking sound played by the 4 speakers above is uncorrelated, producing random interference patterns. I assume that the generated sounds had an interference pattern comparable to the speech used, ultimately confusing the listener. This effect played a role for all subjects that I observed, and I let them continue pushing spacebar throughout the test. Some felt the test was too long because they were falling asleep.<br />
<br />
== Experiment 03 - Efficiency ==<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates a whole bunch of times over 5 minutes producing easily 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold.<br />
<br />
The dialog chosen and start times are as follows:<br />
<br />
* Santa Barbara Corpus Clips Used<br />
Each clip is 5 minutes long with the start time indicated below. The tracks were normalized, then tuned to the appropriate dBFS level in relation to each other so as to fall within the acceptable threshold level for experimentation.<br />
<br />
{| border="1" cellpadding="4"<br />
! Track !! Start time !! dBFS<br />
|-<br />
| sbc0001 || 0:23 || -22.1<br />
|-<br />
| sbc0002 || 0:00 || -9.1<br />
|-<br />
| sbc0008 || 0:34 || -4.8<br />
|-<br />
| sbc0011 || 0:14 || -2.3<br />
|-<br />
| sbc015 || 0:00 || -1.9<br />
|-<br />
| sbc020 || 0:00 || -4.4<br />
|-<br />
| sbc024 || 0:00 || -4.1<br />
|-<br />
| sbc025 || 0:00 || -3.5<br />
|-<br />
| sbc027 || 0:00 || -7.0<br />
|-<br />
| sbc029 || 0:00 || -5.7<br />
|-<br />
| sbc048 || 1:15 || -0.8<br />
|-<br />
| sbc050 || 2:17 || -6.5<br />
|}<br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment design ===<br />
* annoyance test<br />
** Ten kinds of masking noise, plus silence and white noise, each with the intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length) fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, beep, and repeat of the word list. Fade out, and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise switches, with fade in/out, to an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
=== Apparatus To-Do List ===<br />
<br />
*All randomized, 20 minutes total<br />
#Subject walks in with ambient noise<br />
#List of approximately 15 words played over the speaker (20 sec)<br />
#Linear fade-in of the masker during the 15-word recital<br />
#Beep/flash to start mental math for as long as possible (or 2.5 min); 3 maskers<br />
#Flash to start the recital; the subject repeats as much of the list into the microphone as they can, taking as long as they need<br />
#Subject chooses when to start the next phase<br />
#Quick fade-out of the masker to the next masker while a new 15-word list is played through the speaker.<br />
<br />
*Data types<br />
Solutions, time between answers, and the number of words recalled from the list<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's comments, with Juan-Pablo's answers marked "A:"):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# The period in time for each frequency should be the same?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Modulation speed will be getting faster according to higher frequency, or<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# The frequency modulation considering the voice sound<br />
# We have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== September 04, 2006 ===<br />
Tuesday 9:00AM '''Japan''' - Monday 5:00PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds and PDF documents on psychoacoustic experiment. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1459Mass project2006-08-31T00:20:27Z<p>Jcaceres: /* Project Summary */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Project Summary ==<br />
<br />
# Recording and diffusion methodologies - testing and implementation<br />
#* Comparison between PZM (4channel) and Sound Field (4 directional)<br />
#* Decision taken for the Sound Filed, because it gives a better space image.<br />
#* Diffusion in a semi-anechoic room, using a 4channel setup.<br />
# Processing from offices recordings<br />
#* Tokyo office recording of all the sounds necessary for Calibration and for Impulse response generation.<br />
#* Processing of impulses responses inside and outside the room<br />
#* Pink noise calibration recording<br />
#* Room calibration through a generalized equalization methodology, using omni microphone recordings in the real room and in the simulated room as comparison.<br />
# FM masker generation<br />
#* Exploration of different strategies to follow (sinusoidal versus random modulation)<br />
#* Critical band (ERB) width of noise bands.<br />
#* Decision taken for using 3 bands, with random walk modulation (way less annoying than sinusoidal modulation).<br />
# Experiment design and implementation<br />
#* Experiment 01<br />
#*: Beta experiment to test system setup (C++ implementation) for real time experiment with automatic data retrieval. Also, fine tune of the experiment psychoacoustic design.<br />
#* Experiment 02<br />
#*: Masker Refinement through a general purpose process in which the best candidates are being selected while the worst are being discarded. This is achieved by varying one parameter at the time, and then moving to the next stage with the best candidate for that parameter, and moving another parameter. For the FM masker, the parameters where Center Band Frequency (3 bands), Band Amplitude, Modulation Rate (for each band), and Amplitude of the Modulation (for each band).<br />
#* Experiment 03<br />
#*: Efficiency, uses Santa Barbara corpus of conversation, in which for 1 masker (that is on throughout the whole experiment) random parts of the conversation are presented and asked if they are either heard or not. The RMS of the random part is recorded, as well as the answer of the subject.<br />
#* Experiment 04<br />
#*: Annoyance, in design process<br />
#Spatialization study<br />
#: Study of special variables in the direccionality of the masker. Generation of a “virtual impulse response” in which the sound (masker) comes from outside the room (where the intruding sound is located) but the filtering effect of the wall is removed.<br />
<br />
== How to setup and calibrate Tascam 3200 mixer ==<br />
<br />
** Detailed instructions<br />
<br />
*1)Start hdspmixer & hdspconf(all settings automatic)<br />
*2)Type in terminal cd /usr/bin/ then cpufreq-selector –g performance (sets maxcpu)<br />
*3)Open jack<br />
a.Set Frames/Period to 1024<br />
b.Set Sample Rate to 44100<br />
c.Set Interface to RME Hammerfall<br />
<br />
Mixer config :<br />
*4)Equalizing levels and linking channels<br />
a.Under “SCREEN MODE/NUMERIC ENTRY” click “METER.FADER”<br />
i.Under tab “CH FADER” set gain levels “CH 1-18” equal<br />
ii.Under tab “Master M/F” set bus levels “BUSS 1-16” equal <br />
b.Under “SCREEN MODE/NUMERIC ENTRY” click “ALT-LINK/GRP”<br />
i.Click “SEL” for channel 1 followed by 2, 3, and 4<br />
ii.Double click tab “GROUP ON/OFF”<br />
iii.Click down curser to set the next grouping<br />
iv.Click “SEL” for channel 5 followed by 6,7, and 8 <br />
*5)Setting the speakers for surround sound<br />
a.Click “SEL” for channel 1<br />
i.Under “OUTPUT ASSIGN” select “1”<br />
ii.Make sure “STEREO” and “DIRECT” are unchecked <br />
b.Click “SEL” for Channel 2<br />
i.Under “OUTPUT ASSIGN” select “3”<br />
ii.Make sure “STEREO” and “DIRECT” are unchecked <br />
c.Repeat this process for the following combinations<br />
i.Ch1:1, CH2:3, CH3:5, CH4:7, CH5:13, CH6:14, CH7:15, CH8:16<br />
ii.Channels 1-4 are head-level and Channel 5-8 are above<br />
*6)Set up I/0 (if is already screwed up)<br />
a.Click “ALT_ROUTING” and click “INPUT”<br />
i.Set CH1 to adat-1, CH2 to adat-2, etc…<br />
ii.If you want to set up a record line, do so setting CH9 to M/L 9<br />
1.Set the top knob and switch to appropriate setting <br />
2.Use CH9 fader to set input level to application<br />
b.Click “ALT-ROUTING” and click “OUTPUT SLOT” for output cards<br />
i.Slot A set Trk1-8 to BUSS 1-8 in sequential order (Horizontal)<br />
ii.Slot B set Trk1-8 to BUSS 9-16 in sequential order (Vertical)<br />
Software Config:<br />
<br />
*7)Setting up the software with hardware<br />
a.Go to Application under Bash shell and type “m”, then “make”, then “go”<br />
b.Play Voice recording and set levels to 25dbA at center<br />
c.Play Masker noise and set levels to 45dBa at center <br />
*8)Go to “MAIN DIALOG” in software app to set ID & output dir then Repeat 6a<br />
<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masing noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] Contains a complete technical documentation of the masking noise generation. It also contains the soundfiles.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200 350 and 500 Hz):<br />
*: This bands are selected based on an analysis of speech voice '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall is filtering much of the high frequency components, so that's relevant in the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticed and annoying.<br />
* The relation between of modulation frequency of the 3 bands is then the main factor to define the conditions:<br />
*: For this experiment, 3 modulation rates are selected, 2, 5 and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combination of these 3 rates are used for each center frequency, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low frequency of the voice that now is not beeing masked.<br />
# We need to use a really long conversation, that does never repeat during the experiment.<br />
# This corpus of conversations need to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
<br />
*Verbal Instructions:<br />
This is a test where there are only 2 buttons are required, spacebar and (enter)return. You are going to have 2 test runs where you are going to be presented with speech. When you hear any speech you are to press spacebar immediately after to signal us that you heared speech. At the end of each cycle a purple bar will light up to let you know the cycle is ready. You will then press (enter)return to begin the next cycle. The first 2 trials are to get you used to pushing the buttons in response to speech, data will be recorded at the beginning of the third trial testing if you heared speech. <br />
<br />
*Speech used in the experiment where voices by Jason and Hiroko with the intent of neutral stress on vowels. The words chosen were one, two, three, four, eight which were convolved with impulse response from the tokyo conference room combined with recorded room noise.<br />
<br />
=== Post Experiment Subject Interviews ===<br />
*Phase01:<br />
This test had the most diversity in types of sounds. Since some maskers were not effiient, subjects learned about rhythem of speech presented. Subjects clearly described how some sounds worked better in masking then others since they had an idea of how many sounds were coming at what rate for each masker. Subjects enjoyed this test because differences in maskers were clear. <br />
*Phase02:<br />
Out of the bunch of 27 maskers we picked 2 candidates for our "golden masker." For this test we changed the amplitude of different center frequencies for these 2 maskers which gave very different sounds throughout the test. Some subjects found that sounds were noticably much harsher and annoying to listen to then others. Several subjects defined that for one masker, it worked really well in masking and sounded like being on an airplane. Subjects still enjoyed this test because differences in maskers were clear.<br />
*Phase03<br />
At this point we chose 1 masker and used different frequencies of modulation. Most subjects described the sound as droning meaning that it entranced or hypnotized them. This had an effect on most subjects who described the latter half of the test more difficult for them to concentrate. Some subjects claim to almost fall asleep making it difficult to give consistent answers. As I administered the test, I even noticed the sleepy feeling every single time so I started leaving the room during the test. Subjects said that they could hear the female voice very clearly when they would click spacebar (although they would miss more female speech overall). For the male voice that would come through, they would listen for the deep male voice that sounded like short spurts of "wha" and "woo." For the most part, subjects were hitting spacebar when there was speech and not hitting spacebar when they did not hear it as expected from the subjects that I did observe. <br />
*Phase04<br />
Most subjects described the sound as droning meaning that it entranced or hypnotized them as well. This made sense since we kept the same basic sounds but would change the frequency modulation amplitude. The main difference in this test as I would observe subjects is that they would push spacebar repeatedly when there would be no sound presented. This seem to be due to the fact that the 4 speakers above playing the masking sound is uncorrelated and getting random interference patterns. I assume that the sounds that were generated have a interference pattern that was comparable to the speech used ultimately confusing the listener. This effect played a role on all subjects that I observed and I let them continue pushing spacebar throughout the test. Some felt test was too long because they were falling asleep.<br />
<br />
== Experiment 03 - Efficiency ==<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates a whole bunch of times over 5 minutes producing easily 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold.<br />
<br />
The dialog chosen and start times are as follows:<br />
<br />
* Santa Barbara Corpus Clips Used<br />
Each clip is 5 minutes long with the start time indicated below. The trackes were normalized then tuned to the appropriate dbFS level in relation to each other to be in the acceptable threshold level for experimentation.<br />
<br />
*TRACK /Start Time /dbFS<br />
#sbc0001 /0:23 /-22.1<br />
#sbc0002 /0:00 /-9.1<br />
#sbc0008 /0:34 /-4.8<br />
#sbc0011 /0:14 /-2.3<br />
#sbc015 /0:00 /-1.9<br />
#sbc020 /0:00 /-4.4<br />
#sbc024 /0:00 /-4.1<br />
#sbc025 /0:00 /-3.5<br />
#sbc027 /0:00 /-7.0<br />
#sbc029 /0:00 /-5.7<br />
#sbc048 /1:15 /-0.8<br />
#sbc050 /2:17 /-6.5<br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment design ===<br />
* annoyance test<br />
** Ten kinds of masking noise, plus silence and white noise with an intruding noise, presented from four surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), fading in and out over 5 seconds (a fade sketch follows this list).<br />
** Fade in the masking noise. Start with the word list, then mental math, the beep, and the word-list recall. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, fading in/out through an environmental noise, and the subject does the same task with the next masking noise. <br />
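<br />
A minimal sketch of the 5-second linear fades mentioned above (buffer names and lengths are illustrative): <br />
<pre>
// Sketch of the 5-second linear fades (buffer names are illustrative).
#include <cstddef>
#include <vector>

const int kSampleRate = 44100;
const int kFadeLen    = 5 * kSampleRate;  // the 5-second fade in the design

// Linear fade-in at the head and fade-out at the tail of a sound buffer.
void applyFades(std::vector<float> &sound) {
    const std::size_t n = sound.size();
    for (std::size_t i = 0; i < static_cast<std::size_t>(kFadeLen) && i < n; ++i) {
        float g = static_cast<float>(i) / kFadeLen;
        sound[i]         *= g;  // fade in
        sound[n - 1 - i] *= g;  // fade out
    }
}

int main() {
    std::vector<float> masker(30 * kSampleRate, 1.0f);  // one 30-second masker
    applyFades(masker);
}
</pre>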
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in typical conference noise (speech, paper shuffling, chair noise, typing sounds, and an intruding noise) and ask the subjects which one sounds more "inviting."<br />
=== Apparatus To-Do List ===<br />
<br />
*All randomized, 20 minutes total<br />
#Subject walks in with ambient noise playing<br />
#A list of approximately 15 words is played over the speaker (20 sec)<br />
#Linear fade-in of the masker during the 15-word recital<br />
#Beep/flash to start mental math for as long as possible (or 2.5 min); 3 maskers<br />
#Flash to start the recall; the subject repeats as much of the list into the microphone as they can, for as long as they need<br />
#Subject chooses when to start the next phase<br />
#Quick fade-out of the masker into the next masker while the new 15-word list plays through the speaker.<br />
<br />
*Data collected<br />
Solutions, time between answers, and the number of words recalled from the list<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables (a generation sketch for one band follows these lists):<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed:<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
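<br />
For illustration, here is a minimal sketch of one way to generate a single FM-modulated noise band. This is my own construction, not the project's Matlab implementation (see the documentation under Links), and the deviation and filter settings are assumptions: low-pass-filtered white noise is ring-modulated with a sine whose instantaneous frequency sweeps sinusoidally around the center frequency. <br />
<pre>
// Sketch of one FM-modulated noise band (my own construction; the project's
// actual generation is in the Matlab scripts linked under Links).
#include <cmath>
#include <random>
#include <vector>

const double kPi = 3.14159265358979323846;
const int    kSampleRate = 44100;

std::vector<float> fmNoiseBand(double fc,        // center frequency in Hz
                               double modRate,   // modulation rate in Hz
                               double modWidth,  // frequency deviation in Hz
                               double seconds) {
    const int n = static_cast<int>(seconds * kSampleRate);
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> white(-1.0f, 1.0f);

    std::vector<float> out(n);
    double phase = 0.0;     // carrier phase, driven by the instantaneous freq
    float  lp    = 0.0f;    // one-pole low-pass state (sets the band width)
    const float a = 0.05f;  // low-pass coefficient (an untuned assumption)

    for (int i = 0; i < n; ++i) {
        double t = static_cast<double>(i) / kSampleRate;
        // Instantaneous frequency: fc plus a sinusoidal sweep of +/- modWidth.
        double f = fc + modWidth * std::sin(2.0 * kPi * modRate * t);
        phase += 2.0 * kPi * f / kSampleRate;
        lp += a * (white(rng) - lp);                       // narrow the noise
        out[i] = lp * static_cast<float>(std::sin(phase)); // shift band to f
    }
    return out;
}

int main() {
    // E.g. a band centered at 350 Hz (one of the project's center frequencies)
    // modulated at 5 Hz, with a hypothetical 50 Hz deviation, 10 seconds long.
    std::vector<float> band = fmNoiseBand(350.0, 5.0, 50.0, 10.0);
    (void)band;  // a real implementation would write this to a sound file
}
</pre>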
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's comments, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the period in time for each frequency band be the same?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and the delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== September 04, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation]: we are generating this documentation from the Matlab scripts, and all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcaceres
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== How to setup and calibrate Tascam 3200 mixer ==<br />
<br />
** Detailed instructions<br />
<br />
*1)Start hdspmixer & hdspconf(all settings automatic)<br />
*2)Type in terminal cd /usr/bin/ then cpufreq-selector –g performance (sets maxcpu)<br />
*3)Open jack<br />
a.Set Frames/Period to 1024<br />
b.Set Sample Rate to 44100<br />
c.Set Interface to RME Hammerfall<br />
<br />
Mixer config :<br />
*4)Equalizing levels and linking channels<br />
a.Under “SCREEN MODE/NUMERIC ENTRY” click “METER.FADER”<br />
i.Under tab “CH FADER” set gain levels “CH 1-18” equal<br />
ii.Under tab “Master M/F” set bus levels “BUSS 1-16” equal <br />
b.Under “SCREEN MODE/NUMERIC ENTRY” click “ALT-LINK/GRP”<br />
i.Click “SEL” for channel 1 followed by 2, 3, and 4<br />
ii.Double click tab “GROUP ON/OFF”<br />
iii.Click down curser to set the next grouping<br />
iv.Click “SEL” for channel 5 followed by 6,7, and 8 <br />
*5)Setting the speakers for surround sound<br />
a.Click “SEL” for channel 1<br />
i.Under “OUTPUT ASSIGN” select “1”<br />
ii.Make sure “STEREO” and “DIRECT” are unchecked <br />
b.Click “SEL” for Channel 2<br />
i.Under “OUTPUT ASSIGN” select “3”<br />
ii.Make sure “STEREO” and “DIRECT” are unchecked <br />
c.Repeat this process for the following combinations<br />
i.Ch1:1, CH2:3, CH3:5, CH4:7, CH5:13, CH6:14, CH7:15, CH8:16<br />
ii.Channels 1-4 are head-level and Channel 5-8 are above<br />
*6) Set up I/O (if it is already screwed up)<br />
a. Click “ALT-ROUTING” and click “INPUT”<br />
i. Set CH1 to adat-1, CH2 to adat-2, etc…<br />
ii. If you want to set up a record line, do so by setting CH9 to M/L 9<br />
1. Set the top knob and switch to the appropriate setting <br />
2. Use the CH9 fader to set the input level for the application<br />
b. Click “ALT-ROUTING” and click “OUTPUT SLOT” for the output cards<br />
i. Slot A: set Trk1-8 to BUSS 1-8 in sequential order (Horizontal)<br />
ii. Slot B: set Trk1-8 to BUSS 9-16 in sequential order (Vertical)<br />
Software Config:<br />
<br />
*7) Setting up the software with the hardware<br />
a. Go to the application under a Bash shell and type “m”, then “make”, then “go”<br />
b. Play the voice recording and set levels to 25 dBA at center<br />
c. Play the masker noise and set levels to 45 dBA at center <br />
*8) Go to “MAIN DIALOG” in the software app to set the ID & output dir, then repeat 6a<br />
<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains the complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor used to define the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all (see the generation sketch after this list).<br />
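To make this concrete, below is a minimal C++ sketch of one FM noise band under the scheme just described: white noise shaped by a narrow two-pole resonator whose center frequency is sinusoidally modulated, with the modulation depth proportional to the modulation rate. It is an illustrative sketch only; the depth constant, pole radius, and output gain are assumptions and are not taken from the Matlab scripts linked above.<br />
<pre>
// Sketch: one band of FM masking noise (constants are assumptions, see text).
// White noise is filtered by a two-pole resonator whose center frequency is
// sinusoidally modulated; the modulation depth is proportional to the rate.
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>

std::vector<float> fmNoiseBand(double fc, double fm, double fs, double seconds)
{
    const double kPi = 3.14159265358979323846;
    const double kDepthPerHz = 10.0; // assumed: Hz of deviation per Hz of rate
    const double r = 0.995;          // pole radius: narrow, roughly critical band
    const int n = static_cast<int>(fs * seconds);
    std::vector<float> out(n);
    double y1 = 0.0, y2 = 0.0;
    for (int i = 0; i < n; ++i) {
        double t = i / fs;
        // instantaneous center frequency of the band
        double f = fc + kDepthPerHz * fm * std::sin(2.0 * kPi * fm * t);
        double theta = 2.0 * kPi * f / fs;
        double x = 2.0 * std::rand() / RAND_MAX - 1.0; // white-noise input
        // two-pole resonator, coefficients recomputed every sample
        double y = x + 2.0 * r * std::cos(theta) * y1 - r * r * y2;
        y2 = y1;
        y1 = y;
        out[i] = static_cast<float>(0.01 * y); // crude gain to tame the resonance
    }
    return out;
}

int main()
{
    // one of the three bands, at the slowest modulation rate in the design
    std::vector<float> band = fmNoiseBand(200.0, 2.0, 44100.0, 1.0);
    std::printf("generated %zu samples\n", band.size());
    return 0;
}
</pre><br />
In the actual stimuli, the three bands (200, 350, and 500 Hz) would each be generated with their own rate combination and summed, with the per-band levels balanced psychoacoustically as described above.<br />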
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human responses. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (we expect the response rate to be low when speech is well masked) and response-time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise (see the analysis sketch after this list).<br />
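As a concrete illustration of the analysis step, here is a minimal C++ sketch of the response-rate and response-time computation, assuming a log of stimulus onset times and spacebar-press times in seconds. The 2-second response window and the example times are assumptions for illustration, not values from the experiment.<br />
<pre>
// Sketch: response rate and mean response time from logged times.
#include <cstdio>
#include <vector>

int main()
{
    std::vector<double> onsets  = {3.1, 9.8, 17.2, 24.5}; // stimulus onsets (s)
    std::vector<double> presses = {3.9, 18.1};            // spacebar presses (s)
    const double window = 2.0; // assumed: a press within 2 s counts as detection

    int detected = 0;
    double rtSum = 0.0;
    for (double on : onsets) {
        for (double p : presses) {
            if (p >= on && p <= on + window) { // first press inside the window
                ++detected;
                rtSum += p - on;
                break;
            }
        }
    }
    double rate = static_cast<double>(detected) / onsets.size();
    std::printf("response rate %.2f, mean response time %.2f s\n",
                rate, detected > 0 ? rtSum / detected : 0.0);
    return 0;
}
</pre><br />
A low response rate and a long mean response time for a given masker both indicate effective masking; the same computation can be run per subject or pooled across subjects.<br />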
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
<br />
*Verbal Instructions:<br />
This is a test where only 2 buttons are required: the spacebar and return (enter). You are going to have 2 test runs in which you will be presented with speech. When you hear any speech, press the spacebar immediately afterwards to signal us that you heard speech. At the end of each cycle a purple bar will light up to let you know the next cycle is ready. You will then press return (enter) to begin the next cycle. The first 2 trials are to get you used to pushing the buttons in response to speech; data will be recorded from the beginning of the third trial, testing whether you heard speech. <br />
<br />
*Speech used in the experiment consisted of voices recorded by Jason and Hiroko, with the intent of neutral stress on the vowels. The words chosen were one, two, three, four, and eight, which were convolved with the impulse response from the Tokyo conference room combined with recorded room noise.<br />
<br />
=== Post Experiment Subject Interviews ===<br />
*Phase01:<br />
This test had the most diversity in types of sounds. Since some maskers were not efficient, subjects learned the rhythm of the speech presented. Subjects clearly described how some sounds worked better at masking than others, since they had an idea of how many sounds were coming, and at what rate, for each masker. Subjects enjoyed this test because the differences between maskers were clear. <br />
*Phase02:<br />
Out of the batch of 27 maskers we picked 2 candidates for our "golden masker." For this test we changed the amplitude of different center frequencies for these 2 maskers, which gave very different sounds throughout the test. Some subjects found that some sounds were noticeably harsher and more annoying to listen to than others. Several subjects noted that one masker worked really well at masking and sounded like being on an airplane. Subjects still enjoyed this test because the differences between maskers were clear.<br />
*Phase03:<br />
At this point we chose 1 masker and used different frequencies of modulation. Most subjects described the sound as droning, meaning that it entranced or hypnotized them. This affected most subjects, who described the latter half of the test as more difficult to concentrate on. Some subjects claimed to almost fall asleep, making it difficult to give consistent answers. As I administered the test, I noticed the sleepy feeling every single time, so I started leaving the room during the test. Subjects said that they could hear the female voice very clearly when they clicked the spacebar (although they would miss more female speech overall). For the male voice that came through, they would listen for the deep voice that sounded like short spurts of "wha" and "woo." For the most part, the subjects I did observe hit the spacebar when there was speech and did not hit it when they heard none, as expected. <br />
*Phase04:<br />
Most subjects described the sound as droning, meaning that it entranced or hypnotized them as well. This made sense, since we kept the same basic sounds but changed the frequency-modulation amplitude. The main difference I observed in this test is that subjects would push the spacebar repeatedly when no speech was presented. This seems to be due to the fact that the masking sound played from the 4 speakers above is uncorrelated across channels, producing random interference patterns. I assume that the generated sounds had an interference pattern comparable to the speech used, ultimately confusing the listener. This effect played a role for all the subjects I observed, and I let them continue pushing the spacebar throughout the test. Some felt the test was too long because they were falling asleep.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit a sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet; see the randomization sketch after this list)<br />
** analysis - we do not check the correctness - we only measure the impression of intelligibility<br />
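The ordering constraint above is easy to enforce by rejection sampling: shuffle the full trial list and re-shuffle until no trial is immediately followed by the same masking noise at a lower level. Below is a minimal C++ sketch; the counts of maskers and levels are assumptions for illustration.<br />
<pre>
// Sketch: constrained randomization of (masker, level) trials by rejection.
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

struct Trial { int masker; int level; }; // level: higher value = louder

bool violates(const std::vector<Trial>& order)
{
    for (size_t i = 1; i < order.size(); ++i)
        if (order[i].masker == order[i - 1].masker &&
            order[i].level < order[i - 1].level)
            return true; // same masker presented loud, then quieter
    return false;
}

int main()
{
    std::vector<Trial> trials;
    for (int m = 0; m < 3; ++m)        // assumed: 3 masking noises
        for (int lv = 0; lv < 3; ++lv) // assumed: 3 playback levels
            trials.push_back({m, lv});

    std::mt19937 rng(std::random_device{}());
    do {
        std::shuffle(trials.begin(), trials.end(), rng);
    } while (violates(trials));        // keep only valid orders

    for (const Trial& t : trials)
        std::printf("masker %d, level %d\n", t.masker, t.level);
    return 0;
}
</pre><br />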
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold.<br />
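A minimal C++ sketch of that last computation, assuming the convolved dialog is available as mono samples and taking the level of each 2-second clip to be its RMS (the app actually tracks a running maximum RMS within the clip, which would use shorter sub-frames; the non-overlapping clips and the placeholder signal are also assumptions):<br />
<pre>
// Sketch: "efficiency rating" = fraction of the convolved dialog whose
// clip-level RMS falls below the threshold found in the experiment.
#include <cmath>
#include <cstdio>
#include <vector>

double clipRms(const std::vector<float>& x, size_t start, size_t len)
{
    double acc = 0.0;
    for (size_t i = start; i < start + len; ++i)
        acc += static_cast<double>(x[i]) * x[i];
    return std::sqrt(acc / len);
}

double efficiencyRating(const std::vector<float>& dialog,
                        double fs, double threshold)
{
    const size_t clip = static_cast<size_t>(2.0 * fs); // 2-second clips
    size_t below = 0, total = 0;
    for (size_t i = 0; i + clip <= dialog.size(); i += clip) { // non-overlapping
        ++total;
        if (clipRms(dialog, i, clip) < threshold)
            ++below;
    }
    return total > 0 ? 100.0 * below / total : 0.0; // percent of time masked
}

int main()
{
    std::vector<float> dialog(44100 * 10, 0.01f); // placeholder 10 s signal
    std::printf("%.1f%% of the dialog is below threshold\n",
                efficiencyRating(dialog, 44100.0, 0.05));
    return 0;
}
</pre><br />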
<br />
The dialog chosen and start times are as follows:<br />
<br />
* Santa Barbara Corpus Clips Used<br />
Each clip is 5 minutes long with the start time indicated below<br />
<br />
*TRACK Start Time<br />
#sbc0001 0:23<br />
#sbc0002 0:00<br />
#sbc0008 0:34<br />
#sbc0011 0:14<br />
#sbc015 0:00<br />
#sbc020 0:00<br />
#sbc024 0:00<br />
#sbc025 0:00<br />
#sbc027 0:00<br />
#sbc029 0:00<br />
#sbc048 1:15<br />
#sbc050 2:17<br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment design ===<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out over 5 seconds (see the fade sketch after this section).<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and a repeat of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise switches, with a fade in/out, to an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
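For the presentation mechanics, here is a minimal C++ sketch of the 5-second fade-in/fade-out envelope applied to each 30-second segment. The linear ramp is an assumption (any smooth ramp would do); crossfading to the next sound is just the overlap of one segment's fade-out with the next one's fade-in.<br />
<pre>
// Sketch: 5 s fade-in / 5 s fade-out on a segment (assumes segment > 10 s).
#include <vector>

void applyFades(std::vector<float>& seg, double fs, double fadeSec = 5.0)
{
    const size_t nFade = static_cast<size_t>(fadeSec * fs);
    const size_t n = seg.size();
    for (size_t i = 0; i < nFade && i < n; ++i) {
        float g = static_cast<float>(i) / nFade; // linear ramp 0 -> 1
        seg[i] *= g;             // fade in
        seg[n - 1 - i] *= g;     // fade out (mirrored ramp)
    }
}

int main()
{
    std::vector<float> segment(44100 * 30, 1.0f); // 30 s placeholder segment
    applyFades(segment, 44100.0);
    return 0;
}
</pre><br />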
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# The period in time for each frequency should be the same?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Modulation speed will be getting faster according to higher frequency, or<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# The frequency modulation considering the voice sound<br />
# We have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1433Mass project2006-08-25T19:23:40Z<p>Jcaceres: /* Experiment 01 - Beta Test */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== How to set up and calibrate the Tascam 3200 mixer ==<br />
<br />
Detailed instructions:<br />
<br />
*1) Start hdspmixer & hdspconf (all settings automatic)<br />
*2) Type in a terminal: cd /usr/bin/ then cpufreq-selector -g performance (sets max CPU frequency)<br />
*3) Open jack<br />
a. Set Frames/Period to 1024<br />
b. Set Sample Rate to 44100<br />
c. Set Interface to RME Hammerfall<br />
<br />
Mixer config:<br />
*4) Equalizing levels and linking channels<br />
a. Under “SCREEN MODE/NUMERIC ENTRY” click “METER.FADER”<br />
i. Under the “CH FADER” tab, set gain levels “CH 1-18” equal<br />
ii. Under the “Master M/F” tab, set bus levels “BUSS 1-16” equal <br />
b. Under “SCREEN MODE/NUMERIC ENTRY” click “ALT-LINK/GRP”<br />
i. Click “SEL” for channel 1, followed by 2, 3, and 4<br />
ii. Double-click the “GROUP ON/OFF” tab<br />
iii. Click the down cursor to set the next grouping<br />
iv. Click “SEL” for channel 5, followed by 6, 7, and 8 <br />
*5) Setting the speakers for surround sound<br />
a. Click “SEL” for channel 1<br />
i. Under “OUTPUT ASSIGN” select “1”<br />
ii. Make sure “STEREO” and “DIRECT” are unchecked <br />
b. Click “SEL” for channel 2<br />
i. Under “OUTPUT ASSIGN” select “3”<br />
ii. Make sure “STEREO” and “DIRECT” are unchecked <br />
c. Repeat this process for the following combinations<br />
i. CH1:1, CH2:3, CH3:5, CH4:7, CH5:13, CH6:14, CH7:15, CH8:16<br />
ii. Channels 1-4 are at head level and channels 5-8 are above<br />
*6) Set up I/O (if it is already screwed up)<br />
a. Click “ALT-ROUTING” and click “INPUT”<br />
i. Set CH1 to adat-1, CH2 to adat-2, etc…<br />
ii. If you want to set up a record line, do so by setting CH9 to M/L 9<br />
1. Set the top knob and switch to the appropriate setting <br />
2. Use the CH9 fader to set the input level for the application<br />
b. Click “ALT-ROUTING” and click “OUTPUT SLOT” for the output cards<br />
i. Slot A: set Trk1-8 to BUSS 1-8 in sequential order (Horizontal)<br />
ii. Slot B: set Trk1-8 to BUSS 9-16 in sequential order (Vertical)<br />
Software Config:<br />
<br />
*7) Setting up the software with the hardware<br />
a. Go to the application under a Bash shell and type “m”, then “make”, then “go”<br />
b. Play the voice recording and set levels to 25 dBA at center<br />
c. Play the masker noise and set levels to 45 dBA at center <br />
*8) Go to “MAIN DIALOG” in the software app to set the ID & output dir, then repeat 6a<br />
<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains the complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor used to define the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human responses. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (we expect the response rate to be low when speech is well masked) and response-time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
<br />
**Verbal Instructions:<br />
This is a test where only 2 buttons are required: the spacebar and return (enter). You are going to have 2 test runs in which you will be presented with speech. When you hear any speech, press the spacebar immediately afterwards to signal us that you heard speech. At the end of each cycle a purple bar will light up to let you know the next cycle is ready. You will then press return (enter) to begin the next cycle. The first 2 trials are to get you used to pushing the buttons in response to speech; data will be recorded from the beginning of the third trial, testing whether you heard speech. <br />
<br />
**Speech used in the experiment consisted of voices recorded by Jason and Hiroko, with the intent of neutral stress on the vowels. The words chosen were one, two, three, four, and eight, which were convolved with the impulse response from the Tokyo conference room combined with recorded room noise.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit a sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check the correctness - we only measure the impression of intelligibility<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold.<br />
<br />
The dialog chosen and start times are as follows:<br />
<br />
* Santa Barbara Corpus Clips Used<br />
**Each clip is 5 minutes long with the start time indicated below<br />
<br />
**TRACK Start Time<br />
**sbc0001 0:23<br />
**sbc0002 0:00<br />
**sbc0008 0:34<br />
**sbc0011 0:14<br />
**sbc015 0:00<br />
**sbc020 0:00<br />
**sbc024 0:00<br />
**sbc025 0:00<br />
**sbc027 0:00<br />
**sbc029 0:00<br />
**sbc048 1:15<br />
**sbc050 2:17<br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment design ===<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out over 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and a repeat of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise switches, with a fade in/out, to an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# The period in time for each frequency should be the same?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Modulation speed will be getting faster according to higher frequency, or<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# The frequency modulation considering the voice sound<br />
# We have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1426Mass project2006-08-23T17:22:17Z<p>Jcaceres: /* Experiment Design */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains the complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor used to define the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human responses. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (we expect the response rate to be low when speech is well masked) and response-time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
The implementation was done with the numbers recorded by Hiroko and Jason, maintaining a neutral stress. The user just has to press the space bar when he/she hears voices (numbers). When the experiment is done for one masker, the noise stops and the user presses the "next" button to go to the next trial.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit a sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check the correctness - we only measure the impression of intelligibility<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment design ===<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out over 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and a repeat of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise switches, with a fade in/out, to an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# The period in time for each frequency should be the same?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Modulation speed will be getting faster according to higher frequency, or<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# The frequency modulation considering the voice sound<br />
# We have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1425Mass project2006-08-23T17:14:42Z<p>Jcaceres: /* Implementation */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains the complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor used to define the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human responses. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (we expect the response rate to be low when speech is well masked) and response-time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
The implementation was done with the numbers recorded by Hiroko and Jason, maintaining a neutral stress. The user just has to press the space bar when he/she hears voices (numbers). When the experiment is done for one masker, the noise stops and the user presses the "next" button to go to the next trial.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit a sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check the correctness - we only measure the impression of intelligibility<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment Design ===<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out over 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and a repeat of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise switches, with a fade in/out, to an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# The period in time for each frequency should be the same?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Modulation speed will be getting faster according to higher frequency, or<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# The frequency modulation considering the voice sound<br />
# We have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1424Mass project2006-08-23T17:09:14Z<p>Jcaceres: /* Experiment 02 - Masker Refinement */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains the complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor used to define the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used for each center frequency, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human responses. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (we expect the response rate to be low when speech is well masked) and response-time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
=== Implementation ===<br />
[[Image:yamaha_exp2.png|thumb|GUI Experiment 2|250px|right|GUI Experiment 2]]<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit a sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check the correctness - we only measure the impression of intelligibility<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates over 5 minutes, producing roughly 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment Design ===<br />
* annoyance test<br />
** Ten kinds of masking noise, silence, and white noise, with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), with 5-second fade-in and fade-out.<br />
** Fade in the masking noise. Start with the word list, then mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, with fade in/out, via an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables (a one-band synthesis sketch follows these lists)<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
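<br />
Under the fixed choices above, one band of the masker can be sketched as narrowband noise whose center frequency is sinusoidally frequency-modulated, with modulation depth proportional to the rate. A hypothetical Matlab sketch (assumed sample rate, bandwidth, and proportionality constant; requires butter/filter from the Signal Processing Toolbox):<br />
<pre>
% Hypothetical single band of FM masking noise.
fs  = 44100; dur = 5;                % assumed sample rate (Hz) and duration (s)
t   = (0:1/fs:dur-1/fs)';
fc  = 350;                           % band center frequency (Hz)
bw  = 100;                           % assumed bandwidth (Hz)
fm  = 5;                             % modulation rate (Hz)
depth = 10 * fm;                     % assumed: depth proportional to the rate
[b, a] = butter(4, (bw/2)/(fs/2));   % lowpass for the baseband noise
env = filter(b, a, randn(size(t)));  % narrowband baseband noise
phi = 2*pi*fc*t + (depth/fm)*sin(2*pi*fm*t);  % instantaneous FM phase
band = env .* cos(phi);              % noise band with modulated center frequency
band = band / max(abs(band));        % normalize; sum three such bands for the masker
</pre><br />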
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=File:Yamaha_exp2.png&diff=1423File:Yamaha exp2.png2006-08-23T17:07:03Z<p>Jcaceres: </p>
<hr />
<div></div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1422Mass project2006-08-23T16:56:24Z<p>Jcaceres: </p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands. This balancing was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize annoyance. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. All combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a very long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible." <br />
** Completely random order (across the groups of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** We prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet).<br />
** Analysis: we do not check correctness; we only measure the impression of intelligibility.<br />
<br />
=== Implementation ===<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates over 5 minutes, producing roughly 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment Design ===<br />
* annoyance test<br />
** Ten kinds of masking noise, silence, and white noise, with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), with 5-second fade-in and fade-out.<br />
** Fade in the masking noise. Start with the word list, then mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, with fade in/out, via an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1421Mass project2006-08-23T16:50:07Z<p>Jcaceres: /* Experiment 04 */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands. This balancing was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize annoyance. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. All combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a very long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible." <br />
** Completely random order (across the groups of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** We prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet).<br />
** Analysis: we do not check correctness; we only measure the impression of intelligibility.<br />
<br />
== Experiment 04 - Annoyance ==<br />
=== Experiment Design ===<br />
* annoyance test<br />
** Ten kinds of masking noise, silence, and white noise, with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), with 5-second fade-in and fade-out.<br />
** Fade in the masking noise. Start with the word list, then mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, with fade in/out, via an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates over 5 minutes, producing roughly 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1420Mass project2006-08-23T16:49:19Z<p>Jcaceres: /* Experiment 03 - Intelligibility */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands. This balancing was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize annoyance. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. All combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a very long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
=== Experiment design ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible." <br />
** Completely random order (across the groups of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** We prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet).<br />
** Analysis: we do not check correctness; we only measure the impression of intelligibility.<br />
<br />
== Experiment 04 ==<br />
* annoyance test<br />
** Ten kinds of masking noise, silence, and white noise, with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), with 5-second fade-in and fade-out.<br />
** Fade in the masking noise. Start with the word list, then mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, with fade in/out, via an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates over 5 minutes, producing roughly 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1419Mass project2006-08-23T16:49:02Z<p>Jcaceres: /* Experiment 03 */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands. This balancing was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize annoyance. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. All combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a very long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
== Experiment 03 - Intelligibility ==<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible." <br />
** Completely random order (across the groups of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** We prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet).<br />
** Analysis: we do not check correctness; we only measure the impression of intelligibility.<br />
<br />
== Experiment 04 ==<br />
* annoyance test<br />
** Ten kinds of masking noise, silence, and white noise, with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), with 5-second fade-in and fade-out.<br />
** Fade in the masking noise. Start with the word list, then mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, with fade in/out, via an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates over 5 minutes, producing roughly 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1418Mass project2006-08-23T16:48:21Z<p>Jcaceres: /* Experiment 02 */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] contains complete technical documentation of the masking noise generation, as well as the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, so that is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands. This balancing was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize annoyance. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. All combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a very long conversation that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 - Masker Refinement ==<br />
=== Experiment design ===<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise.<br />
<br />
== Experiment 03 ==<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech," "Speech is audible," "Speech is intelligible." <br />
** Completely random order (across the groups of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** We prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet).<br />
** Analysis: we do not check correctness; we only measure the impression of intelligibility.<br />
<br />
== Experiment 04 ==<br />
* annoyance test<br />
** Ten kinds of masking noise, silence, and white noise, with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one plays for 30 seconds (or any length), with 5-second fade-in and fade-out.<br />
** Fade in the masking noise. Start with the word list, then mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** The word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. The subject does mental math for 30 seconds (6-10 questions). After the beep, the subject has to recall the word list presented at the start. The masking noise then switches, with fade in/out, via an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For the best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*bandwidth of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates over 5 minutes, producing roughly 50 trials per subject.<br />
The analysis plots the percentage of yes responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster for higher frequency bands, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall filters out almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the experiment design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1417Mass project2006-08-23T16:45:33Z<p>Jcaceres: /* Parameters for the Noise Generation */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] page contains complete technical documentation of the masking noise generation, along with the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all (see the Matlab sketch below).<br />
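<br />
To make the criteria above concrete, here is a minimal Matlab sketch of one FM-modulated noise band. The deviation constant, the bandwidth, and the normalization are assumptions for illustration only; the actual generation scripts are the ones documented on the Noise set page.<br />
<br />
<pre><br />
% One FM-modulated noise band (sketch; assumed parameters throughout).<br />
fs  = 44100;                        % sample rate (Hz)<br />
dur = 10;                           % duration (s)<br />
fc  = 350;                          % band center (Hz): 200, 350, or 500<br />
fm  = 5;                            % modulation rate (Hz): 2, 5, or 7<br />
dev = 10 * fm;                      % deviation proportional to fm (assumed constant of 10)<br />
bw  = 100;                          % noise bandwidth (Hz), roughly a critical band (assumed)<br />
t   = (0:dur*fs-1)' / fs;<br />
[b, a] = butter(4, (bw/2)/(fs/2));  % lowpass noise sets the width of the band<br />
env = filter(b, a, randn(size(t)));<br />
% FM carrier: instantaneous frequency fc + dev*sin(2*pi*fm*t)<br />
phi  = 2*pi*fc*t - (dev/fm)*cos(2*pi*fm*t);<br />
band = env .* sin(phi);<br />
band = 0.9 * band / max(abs(band)); % normalize<br />
% audiowrite('fm_band_350_5.wav', band, fs);<br />
</pre><br />
<br />
The three bands of one condition would each be generated this way with their own (fc, fm) pair and mixed at the psychoacoustically balanced levels described above.<br />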
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 ==<br />
what experiment to do<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise. <br />
<br />
== Experiment 03 ==<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noise / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech", "Speech is audible", "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
== Experiment 04 ==<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, beep, and repeat the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== FM Masking Noises ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3)<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over the 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed increase with higher center frequency?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created for the project are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1416Mass project2006-08-23T16:43:47Z<p>Jcaceres: /* Experiments */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiment 01 - Beta Test ==<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
=== Strategies to define conditions for FM masking noise ===<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] page contains complete technical documentation of the masking noise generation, along with the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
=== Findings on the Beta Test ===<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Experiment 02 ==<br />
what experiment to do<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise. <br />
<br />
== Experiment 03 ==<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noise / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech", "Speech is audible", "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet); see the ordering sketch after this list<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
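<br />
A minimal Matlab sketch of one way to build such a trial order, assuming hypothetical counts of sentences/words, noise types, and levels, and assuming a higher level index means louder:<br />
<br />
<pre><br />
% Constrained trial order: no back-to-back presentation of the same<br />
% masking noise going from louder to quieter (sketch; counts assumed).<br />
nStim = 10; nNoise = 3; nLevel = 4;<br />
[a, b, c] = ndgrid(1:nStim, 1:nNoise, 1:nLevel);<br />
P = [a(:) b(:) c(:)];               % one row per trial: [stimulus noise level]<br />
P = P(randperm(size(P, 1)), :);     % initial shuffle<br />
while true<br />
    % offenders: same noise type as the next trial, and the next trial is quieter<br />
    bad = find(P(1:end-1, 2) == P(2:end, 2) & P(1:end-1, 3) > P(2:end, 3));<br />
    if isempty(bad), break; end<br />
    for i = bad.'                   % move each offender to a random slot<br />
        j = randi(size(P, 1));<br />
        P([i j], :) = P([j i], :);<br />
    end<br />
end                                 % this simple repair loop converges in practice<br />
</pre><br />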
<br />
== Experiment 04 ==<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds (see the fade sketch below).<br />
** Fade in the masking noise. Start with the word list, mental math, beep, and repeat the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
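<br />
A minimal Matlab sketch of the 5-second fades, assuming a raised-cosine shape (the design above only fixes the fade length, not its shape):<br />
<br />
<pre><br />
% 5-second fade-in/out ramps for switching masking noises (sketch).<br />
fs = 44100; fadeSec = 5;<br />
n  = fadeSec * fs;<br />
fadeIn  = 0.5 * (1 - cos(pi * (0:n-1)' / n));  % ramps 0 -> 1<br />
fadeOut = flipud(fadeIn);                      % ramps 1 -> 0<br />
% applied to a 30-second noise column vector x:<br />
% x(1:n)         = x(1:n)         .* fadeIn;<br />
% x(end-n+1:end) = x(end-n+1:end) .* fadeOut;<br />
</pre><br />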
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over the 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed increase with higher center frequency?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created for the project are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1415Mass project2006-08-23T16:40:47Z<p>Jcaceres: </p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Experiment 01 - Beta Test ===<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] page contains complete technical documentation of the masking noise generation, along with the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
=== Experiment 02 ===<br />
what experiment to do<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise. <br />
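<br />
A minimal Matlab sketch of that one-parameter-at-a-time search, simplified here to keep a single best value per step; the parameter names, the grids, and the run_masking_trial() scoring function are all hypothetical:<br />
<br />
<pre><br />
% Coordinate-style "sweet spot" search over masker parameters (sketch).<br />
params = struct('rate', 5, 'width', 50, 'depth', 0.5);  % starting point (assumed)<br />
grids  = struct('rate', [2 3 5 7], ...<br />
                'width', [25 50 100], ...<br />
                'depth', [0.25 0.5 1]);<br />
names = fieldnames(grids);<br />
for i = 1:numel(names)<br />
    p     = names{i};<br />
    vals  = grids.(p);<br />
    score = zeros(size(vals));<br />
    for j = 1:numel(vals)<br />
        trial     = params;<br />
        trial.(p) = vals(j);<br />
        score(j)  = run_masking_trial(trial);  % hypothetical: masking score from listener responses<br />
    end<br />
    [~, best]  = max(score);                   % fix this parameter at its sweet spot<br />
    params.(p) = vals(best);<br />
end<br />
</pre><br />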
<br />
=== Experiment 03 ===<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noise / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech", "Speech is audible", "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
=== Experiment 04 ===<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, beep, and repeat the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over the 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed increase with higher center frequency?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created for the project are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1414Mass project2006-08-23T16:37:28Z<p>Jcaceres: </p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Experiment 01 - Beta Test ===<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] page contains complete technical documentation of the masking noise generation, along with the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
--[[User:Hiroko|Hiroko]] 18:27, 31 July 2006 (PDT)<br />
<br />
== Experiment design ==<br />
<br />
what experiment to do<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise. <br />
<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noise / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech", "Speech is audible", "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, beep, and repeat the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
<br />
This iterates many times over the 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
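<br />
A minimal Matlab sketch of that plot, assuming the app's log has been loaded into two column vectors, rmsLog (max RMS per trial) and yesLog (1 = "yes", 0 = "no"); both names are assumptions:<br />
<br />
<pre><br />
% Percentage of "yes" responses per RMS bin (sketch).<br />
edges   = linspace(min(rmsLog), max(rmsLog), 11);      % 10 bins<br />
bin     = discretize(rmsLog, edges);<br />
pctYes  = 100 * accumarray(bin, yesLog, [10 1], @mean);<br />
centers = (edges(1:end-1) + edges(2:end)) / 2;<br />
plot(centers, pctYes, 'o-');<br />
xlabel('max RMS of 2-second clip'); ylabel('% "yes" responses');<br />
</pre><br />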
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed increase with higher center frequency?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created for the project are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1413Mass project2006-08-23T16:35:14Z<p>Jcaceres: /* Experiment 01 - Beta Test */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Experiment 01 - Beta Test ===<br />
<br />
[[Image:exp1GUI.png|thumb|GUI Experiment 1|250px|right|GUI Experiment 1]]<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] page contains complete technical documentation of the masking noise generation, along with the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed increase with higher center frequency?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
--[[User:Hiroko|Hiroko]] 18:27, 31 July 2006 (PDT)<br />
<br />
== Experiment design ==<br />
<br />
what experiment to do<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear a speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female of different accents. Numbers were chosen so that they cover five vowels. <br />
** Masker: Genetic algorithm approach with human response. We vary one parameter first and then find one or two "sweet spots." Fix the parameter to those found values and vary the next parameter. Choose the best two - repeat this process. <br />
** Analysis: Response rate (response rate is low when speech is masked, we expect.) Response time distribution (more response time when speech is better masked, we expect.) Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked with a particular masking noise. <br />
<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (Each masking sound has a phrase and word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** Better masking noise / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech", "Speech is audible", "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback level)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, beep, and repeat the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2 second clip which the app selects randomly from the convolved file. <br />
As it's playing the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2 second clip.<br />
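<br />
A minimal Matlab sketch of that measurement, assuming a 50 ms analysis frame (the frame size actually used by the app is not specified here):<br />
<br />
<pre><br />
% Maximum short-time RMS of the first channel of a 2-second clip (sketch).<br />
frame = round(0.05 * fs);                    % 50 ms frames (assumed)<br />
nF    = floor(size(clip, 1) / frame);<br />
r     = zeros(nF, 1);<br />
for k = 1:nF<br />
    seg  = clip((k-1)*frame+1 : k*frame, 1); % first channel<br />
    r(k) = sqrt(mean(seg.^2));<br />
end<br />
maxRMS = max(r);                             % logged together with the yes/no response<br />
</pre><br />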
<br />
This iterates many times over the 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of yes response vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Atsuko's visit Agenda ==<br />
* Friday August 4th, <br />
*: 1pm Meeting (listening room)<br />
*: 5:30pm - Conference Call Japan<br />
* Saturday August 5th <br />
*: Noise, narrowing parameters.<br />
* Sunday August 6th<br />
*: Meetings with Jonathan Berger and Hiroko<br />
* Monday August 7th<br />
*: Psychoacoustic generic tests (Hiroko)<br />
*: Brainstorm spatialization parts - experiment strategies<br />
*: Meeting: Jonathan Berger, Jason, Juan Pablo, Atsuko, and Hiroko.<br />
* Wednesday August 9th<br />
*: 1pm - Meeting<br />
*: 5:30pm - Conference Call Japan<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], generated from the Matlab scripts; all the functions created for the project are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds, and PDF documents on the psychoacoustic experiments. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1412Mass project2006-08-23T16:26:37Z<p>Jcaceres: /* Beta Test */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Experiment 01 - Beta Test ===<br />
<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
[[Image:exp1GUI.png]]<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
The [http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] page contains complete technical documentation of the masking noise generation, along with the sound files.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands were selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall filters out much of the high-frequency content, which is relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned to psychoacoustically balance the levels of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher modulation amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the modulation period be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed increase with higher center frequency?<br />
#* A: I don't know yet; this is going to be the main parameter in the first experiment, I think.<br />
# Should the frequency modulation take the voice sound into account?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and the SPL meter.<br />
* Comment on diffusion in the Pit with the PZM system (Hiroko).<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss the Experiment Design written by Hiroko and Atsuko.<br />
* Explain the experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyse some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
--[[User:Hiroko|Hiroko]] 18:27, 31 July 2006 (PDT)<br />
<br />
== Experiment design ==<br />
<br />
What experiments to do:<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female with different accents. The numbers were chosen so that they cover five vowels. <br />
** Masker: A genetic-algorithm-like approach driven by human responses. We vary one parameter first and find one or two "sweet spots." We then fix that parameter to the found values and vary the next parameter. Choose the best two and repeat this process (see the sketch after this list). <br />
** Analysis: Response rate (we expect the response rate to be low when speech is masked). Response time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked by a particular masking noise. <br />
<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (each masking sound has a phrase and a word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech" "Speech is audible" "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback levels)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
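The "sweet spot" search described for the masker amounts to a one-parameter-at-a-time (coordinate-descent-style) procedure. Here is a minimal sketch of that loop; the parameter names, candidate values, and the rate() scoring stub are hypothetical stand-ins for the human listening responses, and for brevity the sketch keeps a single best value per parameter rather than the best two.<br />
<pre>
#include <iostream>
#include <string>
#include <vector>

struct Param { std::string name; std::vector<double> candidates; };

int main() {
    std::vector<Param> params = {
        {"modRate",  {2.0, 5.0, 7.0}},   // Hz, from the conditions above
        {"modWidth", {0.5, 1.0, 2.0}}    // hypothetical candidate values
    };
    // Stub: in the experiment this score comes from subject responses.
    auto rate = [](const std::vector<double>& setting) {
        double s = 0.0;
        for (double v : setting) s += v;
        return -s;
    };
    std::vector<double> fixed;           // parameter values fixed so far
    for (const Param& p : params) {
        double bestVal = p.candidates.front(), bestScore = -1e30;
        for (double v : p.candidates) {  // vary one parameter only
            std::vector<double> trial = fixed;
            trial.push_back(v);
            double s = rate(trial);
            if (s > bestScore) { bestScore = s; bestVal = v; }
        }
        fixed.push_back(bestVal);        // fix the "sweet spot", move on
        std::cout << p.name << " -> " << bestVal << "\n";
    }
    return 0;
}
</pre><br />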
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2-second clip which the app selects randomly from the convolved file. <br />
As it plays, the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2-second clip.<br />
<br />
This iterates many times over 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of "yes" responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Atsuko's visit Agenda ==<br />
* Friday August 4th, <br />
*: 1pm Meeting (listening room)<br />
*: 5:30pm - Conference Call Japan<br />
* Saturday August 5th <br />
*: Noise, narrowing parameters.<br />
* Sunday August 6th<br />
*: Meetings with Jonathan Berger and Hiroko<br />
* Monday August 7th<br />
*: Psychoacoustic generic tests (Hiroko)<br />
*: Brainstorm spatialization parts - experiment strategies<br />
*: Meeting: Jonathan Berger, Jason, Juan Pablo, Atsuko and Hiroko.<br />
* Wednesday August 9th, <br />
*: 1pm - Meeting<br />
*: 5:30pm - Conference Call Japan<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds and PDF documents on psychoacoustic experiment. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1411Mass project2006-08-22T01:09:17Z<p>Jcaceres: /* Conference Call Meetings */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Beta Test ===<br />
<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses (see the sketch below).<br />
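For illustration, a minimal direct-convolution sketch of the "as if in the hallway" processing; the actual files were prepared offline, the function name convolve is hypothetical, and for impulse responses of realistic length an FFT-based (overlap-add) convolver would be used instead.<br />
<pre>
#include <cstddef>
#include <vector>

// Dry speech convolved with a measured room impulse response.
std::vector<float> convolve(const std::vector<float>& dry,
                            const std::vector<float>& ir) {
    std::vector<float> wet(dry.size() + ir.size() - 1, 0.0f);
    for (size_t n = 0; n < dry.size(); ++n)
        for (size_t k = 0; k < ir.size(); ++k)
            wet[n + k] += dry[n] * ir[k];
    return wet;
}
</pre><br />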
<br />
Necessary ingredients: ('''x''' = done)<br />
# '''(x)''' ambient room sound recording from Tokyo <br />
# '''(x)''' 15 sec. recordings of FM noise masker with parameter variation<br />
# '''(x)''' 4 min. recordings of 4 conversations (animated / not-animated, crowd / pair, always 50% gender balance)<br />
# '''(x)''' 15 sec. clips cut from conversations<br />
# '''(x)''' convolved versions of 15 sec. files putting them "as if" in the hallway<br />
# '''(x)''' GUI for running randomized listening, A/B forced choice, logging results<br />
<br />
[[Image:exp1GUI.png]]<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] Contains a complete technical documentation of the masking noise generation. It also contains the soundfiles.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall is filtering much of the high frequency components, so that's relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
<br />
--[[User:Jcaceres|Jcaceres]] 17:09, 24 July 2006 (PDT)<br />
<br />
==== Beta test TODOs ====<br />
<br />
The beta-test of the experiment tool took longer than anticipated. Some minor fixes remain. The ones I remember from yesterday (Friday, 28th) and the ToDo list for Monday ('''x''' = done):<br />
# '''(x)''' delete input slider from bottom of GUI (in Qt Designer), final product should look like the picture above<br />
# '''(x)''' when user hits "OK, Next" button, clear all the radiobuttons, with radiobutton->setDown(false)<br />
#: This worked out with setChecked(false) (inside a method, not in the connection)<br />
# '''(x)''' comment out all the "cout" statements that are printing during trials, except for the one that says "behind"<br />
# '''(x)''' find a sticky way to keep machine speed at max during trial (automatic energy saving may be the reason for the occasional stuttering)<br />
#: Jason comments:<br />
#: /usr/bin/cpufreq-selector -g performance<br />
#: you will select the "performance" governor and the cpu speed should go to the max and stay there.<br />
#: /usr/bin/cpufreq-selector -g userspace<br />
#: will return the governor to the original "userspace" governor.<br />
#: And:<br />
#: /usr/bin/cpufreq-selector -f 1000000<br />
#: will get the processor to the slow idle speed.<br />
#: From there the speed should again be "on demand". Regretfully it looks like sometimes the background daemon ("cpuspeed") gets fooled by these changes and dies. At least you can control all of this manually.<br />
# '''(x)''' convert QString to const char for logger class file open (use const char * QString::latin1 ())<br />
# '''(x)''' create a "shuffle" sort method in MainDialog.cpp and apply it for the actual first test<br />
# I think each individual mono file repeating is ok, but I'm worried that they could slip out of sync. I don't know for sure. It would be better if the repeats for a group of four were triggered from the first channel's repeat.<br />
# add envelopes at all file starts, stops, repeats (with STK's Asymp class), pipe the file's output through it<br />
#: I still need to add this, but I think after the first experiments, what we really need to do is make much longer files so they don't repeat and the listener doesn't get a cue from that repetition.<br />
# IF I've created a problem for disk files keeping up, you will see the message "behind" printed from FileWvIn and it will start stuttering, the next fix to try (and this might be important anyway for our sanity) is to go to quad files rather than 4 mono files for each layer.<br />
#: This doesn't look easy, I think I have to modify the entire Jukebox.cpp class in order to get this working...<br />
# '''(x)''' Add a dialog in case the user doesn't select an option.<br />
# '''(x)''' Change the silence always on A. Modify also the "correctness" of the selection, now is always set to be in A.<br />
# '''(x)''' Turn off Sounds (alternative A and B) when user goes to next case.<br />
# '''(x)''' Program is crashing at the end (it's quitting badly). If you go until the end, it does not write anything to the output files. If you stop it in the middle, it works. It may be a problem with some destructor...<br />
#: I get the problem with these test files:<br />
#: /usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/TEST/<br />
#: the message is:<br />
#: terminate called after throwing an instance of 'std::bad_alloc'<br />
#: what(): St9bad_alloc<br />
#: Aborted<br />
#: <br />
#: It works fine with these set of files:<br />
#: QString rootDir ("/usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/");<br />
#: FOUND IT!!! It was a problem reading a vector in MainDialog::setTrial (int n)<br />
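A standalone sketch of the "shuffle" from item 6, randomizing the trial presentation order. MainDialog is the app's own class, so this is shown as a free function with the hypothetical name shuffledTrialOrder; in 2006 this would likely have been std::random_shuffle, shown here with its modern replacement.<br />
<pre>
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Return trial indices 0..nTrials-1 in random order.
std::vector<int> shuffledTrialOrder(int nTrials) {
    std::vector<int> order(nTrials);
    std::iota(order.begin(), order.end(), 0);   // 0, 1, ..., nTrials-1
    std::mt19937 gen(std::random_device{}());
    std::shuffle(order.begin(), order.end(), gen);
    return order;
}
</pre><br />
For the envelopes of item 8, the idea would be to scale each file's output by an STK envelope, e.g. something like out = file.tick() * env.tick(), ramping the envelope up on start and down before a stop or repeat (exact Asymp usage to be checked against the STK version in use).<br />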
<br />
<br />
<br />
_____________<br />
there are probably more things I'm forgetting, but this is close <br/><br />
GOOD LUCK!<br />
<br />
--[[User:Cc|Cc]] 09:42, 29 July 2006 (PDT)<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
<br />
==== Bottom lines ====<br />
<br />
# We're going to use just one room (Tokyo Office)<br />
# We keep the 4CH setup.<br />
# Spatialization ???<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the period in time be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed get faster for higher frequency bands, or...<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# ...should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
=== August 28, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
--[[User:Hiroko|Hiroko]] 18:27, 31 July 2006 (PDT)<br />
<br />
== Experiment design ==<br />
<br />
What experiments to do:<br />
* efficiency test <br />
** Stimuli: speech is mixed at randomized places in a stream of masking noise<br />
** Task: "hit the space key when you hear speech" <br />
** Speech: 5 numbers (one, two, three, four, eight) spoken by a male and a female with different accents. The numbers were chosen so that they cover five vowels. <br />
** Masker: A genetic-algorithm-like approach driven by human responses. We vary one parameter first and find one or two "sweet spots." We then fix that parameter to the found values and vary the next parameter. Choose the best two and repeat this process. <br />
** Analysis: Response rate (we expect the response rate to be low when speech is masked). Response time distribution (we expect longer response times when speech is better masked). Both analyses can be done within-subject and across-subject. We can also observe what kind of speech is better masked by a particular masking noise. <br />
<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (each masking sound has a phrase and a word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech" "Speech is audible" "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback levels)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
** Task: a word list is presented at the start. A subject does mental math for 30 seconds (6-10 questions.) After the beep, the subject has to recall the word list presented at the start. Masking noise switches with fade in/out with an environmental noise. Do the same task with the next masking noise. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== (updated) Efficiency Experiment design ==<br />
<br />
I've programmed up experiment 3. This uses the Santa Barbara corpus clips in a design that produces a percentage measure of masker effectiveness. It's for one masker (the best one arrived at from experiment 2) at a fixed playback level.<br />
<br />
Jason has convolved the first SB dialog file, so it plays from the "hallway."<br />
<br />
The subject hears a 2-second clip which the app selects randomly from the convolved file. <br />
As it plays, the app records the maximum RMS of the first channel of the clip. <br />
The subject responds with "yes" or "no" buttons according to whether they heard voices.<br />
The app records the response and the maximum RMS played, and then loops, playing the next randomly chosen 2-second clip.<br />
<br />
This iterates many times over 5 minutes, easily producing 50 trials per subject.<br />
The analysis plots the percentage of "yes" responses vs. RMS. We should see a threshold RMS below which the clips were effectively masked. <br />
<br />
For the final "efficiency rating" we go back into the convolved dialog file and calculate the percentage of time the signal is below the threshold. <br />
<br />
== Atsuko's visit Agenda ==<br />
* Friday August 4th, <br />
*: 1pm Meeting (listening room)<br />
*: 5:30pm - Conference Call Japan<br />
* Saturday August 5th <br />
*: Noise, narrowing parameters.<br />
* Sunday August 6th<br />
*: Meetings with Jonathan Berger and Hiroko<br />
* Monday August 7th<br />
*: Psychoacoustic generic tests (Hiroko)<br />
*: Brainstorm spatialization parts - experiment strategies<br />
*: Meeting: Jonathan Berger, Jason, Juan Pablo, Atsuko and Hiroko.<br />
* Wednesday August 9th, <br />
*: 1pm - Meeting<br />
*: 5:30pm - Conference Call Japan<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds and PDF documents on psychoacoustic experiment. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1396Mass project2006-08-10T01:16:07Z<p>Jcaceres: /* Conference Call Meetings */</p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Beta Test ===<br />
<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
Necessary ingredients: ('''x''' = done)<br />
# '''(x)''' ambient room sound recording from Tokyo <br />
# '''(x)''' 15 sec. recordings of FM noise masker with parameter variation<br />
# '''(x)''' 4 min. recordings of 4 conversations (animated / not-animated, crowd / pair, always 50% gender balance)<br />
# '''(x)''' 15 sec. clips cut from conversations<br />
# '''(x)''' convolved versions of 15 sec. files putting them "as if" in the hallway<br />
# '''(x)''' GUI for running randomized listening, A/B forced choice, logging results<br />
<br />
[[Image:exp1GUI.png]]<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] Contains a complete technical documentation of the masking noise generation. It also contains the soundfiles.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall is filtering much of the high frequency components, so that's relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
<br />
--[[User:Jcaceres|Jcaceres]] 17:09, 24 July 2006 (PDT)<br />
<br />
==== Beta test TODOs ====<br />
<br />
The beta-test of the experiment tool took longer than anticipated. Some minor fixes remain. The ones I remember from yesterday (Friday, 28th) and the ToDo list for Monday ('''x''' = done):<br />
# '''(x)''' delete input slider from bottom of GUI (in Qt Designer), final product should look like the picture above<br />
# '''(x)''' when user hits "OK, Next" button, clear all the radiobuttons, with radiobutton->setDown(false)<br />
#: This worked out with setChecked(false) (inside a method, not in the connection)<br />
# '''(x)''' comment out all the "cout" statements that are printing during trials, except for the one that says "behind"<br />
# '''(x)''' find a sticky way to keep machine speed at max during trial (automatic energy saving may be the reason for the occasional stuttering)<br />
#: Jason comments:<br />
#: /usr/bin/cpufreq-selector -g performance<br />
#: you will select the "performance" governor and the cpu speed should go to the max and stay there.<br />
#: /usr/bin/cpufreq-selector -g userspace<br />
#: will return the governor to the original "userspace" governor.<br />
#: And:<br />
#: /usr/bin/cpufreq-selector -f 1000000<br />
#: will get the processor to the slow idle speed.<br />
#: From there the speed should again be "on demand". Regretfully it looks like sometimes the background daemon ("cpuspeed") gets fooled by these changes and dies. At least you can control all of this manually.<br />
# '''(x)''' convert QString to const char for logger class file open (use const char * QString::latin1 ())<br />
# '''(x)''' create a "shuffle" sort method in MainDialog.cpp and apply it for the actual first test<br />
# I think each individual mono file repeating is ok, but I'm worried that they could slip out of sync. I don't know for sure. It would be better if the repeats for a group of four were triggered from the first channel's repeat.<br />
# add envelopes at all file starts, stops, repeats (with STK's Asymp class), pipe the file's output through it<br />
#: I still need to add this, but I think after the first experiments, what we really need to do is make much longer files so they don't repeat and the listener doesn't get a cue from that repetition.<br />
# IF I've created a problem for disk files keeping up, you will see the message "behind" printed from FileWvIn and it will start stuttering, the next fix to try (and this might be important anyway for our sanity) is to go to quad files rather than 4 mono files for each layer.<br />
#: This doesn't look easy, I think I have to modify the entire Jukebox.cpp class in order to get this working...<br />
# '''(x)''' Add a dialog in case the user doesn't select an option.<br />
# '''(x)''' Change the silence always on A. Modify also the "correctness" of the selection, now is always set to be in A.<br />
# '''(x)''' Turn off Sounds (alternative A and B) when user goes to next case.<br />
# '''(x)''' Program is crashing at the end (it's quitting badly). If you go until the end, it does not write anything to the output files. If you stop it in the middle, it works. It may be a problem with some destructor...<br />
#: I get the problem with these test files:<br />
#: /usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/TEST/<br />
#: the message is:<br />
#: terminate called after throwing an instance of 'std::bad_alloc'<br />
#: what(): St9bad_alloc<br />
#: Aborted<br />
#: <br />
#: It works fine with these set of files:<br />
#: QString rootDir ("/usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/");<br />
#: FOUND IT!!! It was a problem reading a vector in MainDialog::setTrial (int n)<br />
<br />
<br />
<br />
_____________<br />
there are probably more things I'm forgetting, but this is close <br/><br />
GOOD LUCK!<br />
<br />
--[[User:Cc|Cc]] 09:42, 29 July 2006 (PDT)<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
<br />
==== Bottom lines ====<br />
<br />
# We're going to use just one room (Tokyo Office)<br />
# We keep the 4CH setup.<br />
# Spatialization ???<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM Modulation discussion (Yasushi's Comments, with Juan-Pablo's comment on answer A:):<br />
# Do you have any idea how to specify frequency modulation for each frequency band?<br />
#* A: based on speech freq, ~2-8 Hz<br />
# Should the period in time be the same for each frequency band?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Should the modulation speed get faster for higher frequency bands, or...<br />
#* A: I don't know yet, this is going to be the main parameter in the first experiment I think.<br />
# ...should the frequency modulation follow the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and collect and analyze some data from a couple of subjects.<br />
<br />
=== August 21, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
Variables<br />
*modulation width (critical band or speech sounds)<br />
*modulation rate (0.01 - 0.1 fc) <br />
*sinusoidal or stochastic modulation<br />
<br />
Already fixed<br />
*with broadband noise (what shape, and how loud? - according to the speech)<br />
*band width of the noise (critical band)<br />
*amplitude of each channel (speech sounds spectral distribution)<br />
*number and frequency of center frequencies (3) <br />
<br />
<br />
--[[User:Hiroko|Hiroko]] 18:27, 31 July 2006 (PDT)<br />
<br />
== Experiment design ==<br />
* masking sounds with three variables<br />
* one sec. noise+speech vs. noise-only comparison<br />
* all stimuli mixed and randomized<br />
<br />
Problems of the beta-test:<br />
* repeating long conversation -> fixed length, no repeats<br />
* either one of the two stimuli is expected to contain speech<br />
<br />
What experiments to do:<br />
* efficiency test <br />
** Stimuli: Noise+speech (5 vowels w/ consonants) + no sound (noise only), one-second stimuli, no repeats. <br />
** Question: "Is there speech or not?" The answer is "Presence" or "Absence"<br />
** Masker: 30 noises - Juan Pablo defines these 30 kinds by some explainable strategy.<br />
** Speech: 5 words by a male and 5 words by a female. Each word consists of one syllable, and the 5 words cover the 5 vowels. <br />
** Pros: very short and covers many stimuli.<br />
<br />
* intelligibility test<br />
** Speech sounds: Idiomatic phrases and isolated words (each masking sound has a phrase and a word) - TBD<br />
** 4 sec per stimulus (15 stimuli/min)<br />
** Measure audibility and intelligibility thresholds<br />
** The better masking noises / parameter regions are chosen from the efficiency test. <br />
** The answers the subject chooses from are: "I don't hear speech" "Speech is audible" "Speech is intelligible" <br />
** complete random order (beyond the grouping of (1) sentences/words, (2) masking noise types, and (3) playback levels)<br />
** we prohibit sequential presentation of the same stimulus from intelligible to less intelligible (the same masking noise going from loud to quiet)<br />
** analysis - we do not check correctness - we only measure the impression of intelligibility<br />
<br />
* annoyance test<br />
** ten kinds of masking noise, silence, and white noise with intruding noise, presented from 4 surrounding loudspeakers. <br />
** Each one goes on for 30 seconds (or any length), fading in and out for 5 seconds.<br />
** Fade in the masking noise. Start with the word list, mental math, a beep, and recall of the word list. Fade out and fade in some environmental noise (office, traffic, college cafeteria, etc.), then the next masking noise. <br />
** Word list is presented to the subject from a loudspeaker in front at 60 dBA. <br />
<br />
* Final comparison<br />
** For best 3 masking noises, mix in the typical conference noise (speech, paper shuffle, chair noise, typing sounds, and intruding noise) and ask the subjects which one sounds more "inviting."<br />
<br />
== Atsuko's visit Agenda ==<br />
* Friday August 4th, <br />
*: 1pm Meeting (listening room)<br />
*: 5:30pm - Conference Call Japan<br />
* Saturday August 5th <br />
*: Noise, narrowing parameters.<br />
* Sunday August 6th<br />
*: Meetings with Jonathan Berger and Hiroko<br />
* Monday August 7th<br />
*: Psychoacoustic generic tests (Hiroko)<br />
*: Brainstorm spatialization parts - experiment strategies<br />
*: Meeting: Jonathan Berger, Jason, Juan Pablo, Atsuko and Hiroko.<br />
* Wednesday August 9th, <br />
*: 1pm - Meeting<br />
*: 5:30pm - Conference Call Japan<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation], we are generating this documentation from the Matlab scripts. All the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds and PDF documents on psychoacoustic experiment. <br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcacereshttps://ccrma.stanford.edu/mediawiki/index.php?title=Mass_project&diff=1375Mass project2006-08-05T15:17:38Z<p>Jcaceres: </p>
<hr />
<div>Welcome to the Masking Ambient Speech Sounds project Wiki.<br />
<br />
== Experiments ==<br />
<br />
=== Beta Test ===<br />
<br />
The first listening tests will involve project staff members to check if things make sense. If it looks good we'll start working with non-project volunteers. ''Experiment 1'', in the CCRMA "Pit," will take about 30 mins. and involve 30 trials. There will be 6 conditions of masking sound crossed with 5 conditions of speech sounds. The masker (FM noise) and the speech sounds will be presented as if the sources are outside the room. We'll use the measured room model from Tokyo and the exterior sound source position (hallway). The "as if" impression will be created by convolving with the measured impulse responses.<br />
<br />
Necessary ingredients: ('''x''' = done)<br />
# '''(x)''' ambient room sound recording from Tokyo <br />
# '''(x)''' 15 sec. recordings of FM noise masker with parameter variation<br />
# '''(x)''' 4 min. recordings of 4 conversations (animated / not-animated, crowd / pair, always 50% gender balance)<br />
# '''(x)''' 15 sec. clips cut from conversations<br />
# '''(x)''' convolved versions of 15 sec. files putting them "as if" in the hallway<br />
# '''(x)''' GUI for running randomized listening, A/B forced choice, logging results<br />
<br />
[[Image:exp1GUI.png]]<br />
<br />
==== Strategies to define conditions for FM masking noise ====<br />
<br />
To define the conditions of this first experiment, the approach will be to leave all the parameters fixed, except the modulation frequency.<br />
<br />
[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/experiment01_noises/ Noise set] Contains a complete technical documentation of the masking noise generation. It also contains the soundfiles.<br />
<br />
The conditions of the masking FM noise will be defined by the following criteria:<br />
* 3 bands of FM noise will be used (centered at 200, 350, and 500 Hz):<br />
*: These bands are selected based on an analysis of speech '''recorded''' in the Tokyo office. The motivation behind this decision is to identify the relevant parameters in the leaking voice. For example, we know that the wall is filtering much of the high frequency components, so that's relevant to the selection of the main frequencies.<br />
* The amplitude (volume) of each band will be fixed:<br />
*: The amplitude was tuned in order to psychoacoustically balance the level of the three noise bands that will be used. This balance was done without modulation.<br />
* The amplitude of the modulation will be proportional to the modulation frequency:<br />
*: The motivation behind this choice is to minimize the annoyance effect. When the modulation rate is low, higher amplitudes are more noticeable and annoying.<br />
* The relation between the modulation frequencies of the 3 bands is then the main factor defining the conditions:<br />
*: For this experiment, 3 modulation rates are selected: 2, 5, and 7 Hz. The idea is to span some of the frequencies in the range of 2 to 7 Hz. Basically, all the combinations of these 3 rates are used across the center frequencies, plus a case with no modulation at all.<br />
<br />
<br />
--[[User:Jcaceres|Jcaceres]] 17:09, 24 July 2006 (PDT)<br />
<br />
==== Beta test TODOs ====<br />
<br />
The beta-test of the experiment tool took longer than anticipated. Some minor fixes remain. The ones I remember from yesterday (Friday, 28th) and the ToDo list for Monday ('''x''' = done):<br />
# '''(x)''' delete input slider from bottom of GUI (in Qt Designer), final product should look like the picture above<br />
# '''(x)''' when user hits "OK, Next" button, clear all the radiobuttons, with radiobutton->setDown(false)<br />
#: This worked out with setChecked(false) (inside a method, not in the connection)<br />
# '''(x)''' comment out all the "cout" statements that are printing during trials, except for the one that says "behind"<br />
# '''(x)''' find a sticky way to keep machine speed at max during trial (automatic energy saving may be the reason for the occasional stuttering)<br />
#: Jason comments:<br />
#: /usr/bin/cpufreq-selector -g performance<br />
#: you will select the "performance" governor and the cpu speed should go to the max and stay there.<br />
#: /usr/bin/cpufreq-selector -g userspace<br />
#: will return the governor to the original "userspace" governor.<br />
#: And:<br />
#: /usr/bin/cpufreq-selector -f 1000000<br />
#: will get the processor to the slow idle speed.<br />
#: From there the speed should again be "on demand". Regretfully it looks like sometimes the background daemon ("cpuspeed") gets fooled by these changes and dies. At least you can control all of this manually.<br />
# '''(x)''' convert QString to const char for logger class file open (use const char * QString::latin1 ())<br />
# '''(x)''' create a "shuffle" sort method in MainDialog.cpp and apply it for the actual first test<br />
# I think each individual mono file repeating is ok, but I'm worried that they could slip out of sync. I don't know for sure. It would be better if the repeats for a group of four were triggered from the first channel's repeat.<br />
# add envelopes at all file starts, stops, repeats (with STK's Asymp class), pipe the file's output through it<br />
#: I still need to add this, but I think after the first experiments, what we really need to do is make much longer files so they don't repeat and the listener doesn't get a cue from that repetition.<br />
# IF I've created a problem for disk files keeping up, you will see the message "behind" printed from FileWvIn and it will start stuttering, the next fix to try (and this might be important anyway for our sanity) is to go to quad files rather than 4 mono files for each layer.<br />
#: This doesn't look easy, I think I have to modify the entire Jukebox.cpp class in order to get this working...<br />
# '''(x)''' Add a dialog in case the user doesn't select an option.<br />
# '''(x)''' Change the silence always on A. Modify also the "correctness" of the selection, now is always set to be in A.<br />
# '''(x)''' Turn off Sounds (alternative A and B) when user goes to next case.<br />
# '''(x)''' Program is crashing at the end (it's quitting badly). If you go until the end, it does not write anything to the output files. If you stop it in the middle, it works. It may be a problem with some destructor...<br />
#: I get the problem with these test files:<br />
#: /usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/TEST/<br />
#: the message is:<br />
#: terminate called after throwing an instance of 'std::bad_alloc'<br />
#: what(): St9bad_alloc<br />
#: Aborted<br />
#: <br />
#: It works fine with this set of files:<br />
#: QString rootDir ("/usr/ccrma/snd/jcaceres/yamaha/recordings/experiments/experiment01/");<br />
#: FOUND IT!!! It was a problem reading a vector in MainDialog::setTrial (int n)<br />
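<br />
A minimal sketch of the radio-button reset mentioned above, in the Qt 3 style the tool uses; the helper name and button pointers are hypothetical:<br />
<pre><br />
#include <qradiobutton.h><br />
<br />
// Hypothetical helper, called from the "OK, Next" handler.<br />
// setChecked(false) clears the checked state itself, while<br />
// setDown(false) only releases the visual "pressed" look.<br />
void clearAnswerButtons(QRadioButton* a, QRadioButton* b)<br />
{<br />
    a->setChecked(false);<br />
    b->setChecked(false);<br />
}<br />
</pre><br />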
<br />
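The QString conversion for the logger, assuming the Qt 3 API; openLog and its arguments are made up for illustration:<br />
<pre><br />
#include <qstring.h><br />
#include <fstream><br />
<br />
// std::ofstream::open() wants a const char*, while the GUI hands us<br />
// a QString; in Qt 3, QString::latin1() returns exactly that.<br />
void openLog(std::ofstream& log, const QString& fileName)<br />
{<br />
    log.open(fileName.latin1());<br />
}<br />
</pre><br />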
<br />
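One way the "shuffle" method could look: a plain Fisher-Yates shuffle of the trial indices, so every ordering of the trials is equally likely. The function name is hypothetical, and std::srand() should be seeded once at startup:<br />
<pre><br />
#include <algorithm><br />
#include <cstdlib><br />
#include <vector><br />
<br />
// Shuffle the trial order in place (Fisher-Yates).<br />
void shuffleTrials(std::vector<int>& order)<br />
{<br />
    for (int i = (int) order.size() - 1; i > 0; --i) {<br />
        int j = std::rand() % (i + 1);  // uniform index in [0, i]<br />
        std::swap(order[i], order[j]);<br />
    }<br />
}<br />
</pre><br />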
<br />
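And a sketch of the planned envelopes, assuming the STK API the tool is built against; the ramp time and the playback loop are illustrative only:<br />
<pre><br />
#include "FileWvIn.h"<br />
#include "Asymp.h"<br />
<br />
// Pipe each FileWvIn sample through an Asymp envelope so that<br />
// starts, stops and repeats are faded rather than clicked.<br />
void playWithEnvelope(FileWvIn& file, Asymp& env)<br />
{<br />
    env.setTime(0.01);   // ~10 ms ramp<br />
    env.setTarget(1.0);  // fade in; setTarget(0.0) before a stop fades out<br />
    while (!file.isFinished()) {<br />
        StkFloat sample = file.tick() * env.tick();<br />
        // ... send 'sample' to the audio output ...<br />
    }<br />
}<br />
</pre><br />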
_____________<br />
There are probably more things I'm forgetting, but this is close.<br />
GOOD LUCK!<br />
<br />
--[[User:Cc|Cc]] 09:42, 29 July 2006 (PDT)<br />
<br />
==== Findings on the Beta Test ====<br />
<br />
# There is a low-frequency component of the voice that is currently not being masked.<br />
# We need to use a really long conversation, one that never repeats during the experiment.<br />
# This corpus of conversations needs to have "stationary" properties.<br />
<br />
<br />
==== Bottom lines ====<br />
<br />
# We're going to use just one room (Tokyo Office).<br />
# We keep the 4CH setup.<br />
# Spatialization ???<br />
<br />
== Conference Call Meetings ==<br />
<br />
=== July 18, 2006 ===<br />
*FM modulation discussion (Yasushi's questions, with Juan-Pablo's answers marked A:):<br />
# Do you have any idea how to specify the frequency modulation for each frequency band?<br />
#* A: based on speech frequencies, ~2-8 Hz<br />
# Should the period in time be the same for each frequency?<br />
#* A: No, different. When it's the same, the masking efficiency decreases. It also seems more annoying.<br />
# Will the modulation speed get faster with higher frequency, or<br />
#* A: I don't know yet; I think this is going to be the main parameter in the first experiment.<br />
# will the frequency modulation be designed considering the voice sound?<br />
# Do we have to analyze how the voice sound is modulated in different frequency bands?<br />
#* A: I think this is the best way, and we have to consider that the wall is filtering almost all of the high frequencies.<br />
<br />
*Discussion of the experiment setup.<br />
<br />
*Look at the documentation, the new example of impulse responses, and the delay of arrival.<br />
<br />
=== July 24, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
<br />
* Discuss Experiment 1.<br />
* Ask Atsuko about calibration files and SPL meter.<br />
* Comment diffusion in the Pit with PZM system (Hiroko).<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
<br />
=== July 31, 2006 ===<br />
Tuesday 9:30AM '''Japan''' - Monday 5:30PM '''Stanford'''<br />
* Discuss Experiment Design written by Hiroko and Atsuko.<br />
* Explain experiment setup.<br />
* Discuss Atsuko's agenda at CCRMA.<br />
* Goals for this week are to finish the setup (C++ and Pit room) and to collect and analyze some data from a couple of subjects.<br />
<br />
<br />
== Parameters for the Noise Generation ==<br />
<br />
*modulation width (critical band or speech frequencies)<br />
*bandwidth of the noise (critical band)<br />
*modulation frequency (2, 5, 7 or more?) <br />
*number of center frequencies (0, 3, 8) <br />
*levels of the channels (low, mid, high)<br />
*with or without broadband noise (0, med, high)<br />
--[[User:Hiroko|Hiroko]] 18:27, 31 July 2006 (PDT)<br />
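<br />
For reference, a minimal sketch of how one such band can be produced from these parameters: white noise through a two-pole resonator whose center frequency is swept sinusoidally at the band's modulation rate, with a modulation width proportional to that rate. All constants are illustrative; the actual generation is done by the Matlab scripts documented under "Links" below.<br />
<pre><br />
#include <cmath><br />
#include <cstdlib><br />
#include <vector><br />
<br />
// One modulated noise band: white noise filtered by a two-pole<br />
// resonator whose center frequency is swept at rate fMod, with a<br />
// sweep width proportional to fMod (per the parameter list above).<br />
std::vector<double> fmNoiseBand(double fCenter, double fMod,<br />
                                double fs, int n)<br />
{<br />
    const double pi = 3.14159265358979;<br />
    const double width = 10.0 * fMod;  // modulation width ~ rate (illustrative)<br />
    const double r = 0.995;            // pole radius sets the noise bandwidth<br />
    std::vector<double> y(n, 0.0);<br />
    double y1 = 0.0, y2 = 0.0;<br />
    for (int i = 0; i < n; ++i) {<br />
        double fc = fCenter + width * std::sin(2.0 * pi * fMod * i / fs);<br />
        double x  = 2.0 * std::rand() / RAND_MAX - 1.0;  // white noise<br />
        double y0 = x + 2.0 * r * std::cos(2.0 * pi * fc / fs) * y1 - r * r * y2;<br />
        y2 = y1; y1 = y0;<br />
        y[i] = y0;  // not normalized; scale to taste before use<br />
    }<br />
    return y;<br />
}<br />
</pre><br />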
<br />
== Atsuko's visit Agenda ==<br />
* Friday, August 4th<br />
*: 1pm - Meeting (listening room)<br />
*: 5:30pm - Conference Call Japan<br />
* Saturday, August 5th<br />
*: Noise, narrowing parameters.<br />
* Sunday, August 6th<br />
*: Meetings with Jonathan Berger and Hiroko<br />
* Monday, August 7th<br />
*: Psychoacoustic generic tests (Hiroko)<br />
*: Brainstorm the spatialization parts - experiment strategies<br />
*: Meeting: Jonathan Berger, Jason, Juan-Pablo, Atsuko and Hiroko.<br />
* Wednesday, August 9th<br />
*: 1pm - Meeting<br />
*: 5:30pm - Conference Call Japan<br />
<br />
== Links ==<br />
<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/ MASS Technical documentation]: we are generating this documentation from the Matlab scripts; all the functions created are also documented.<br />
*[http://ccrma.stanford.edu/~hiroko/yamaha/ Mass project - support materials by Hiroko], with pictures, sounds and PDF documents on the psychoacoustic experiments.<br />
*[http://ccrma.stanford.edu/~jcaceres/yamaha/documentation/expy_cpp/html/inherits.html Experiment C++ Source Code Documentation]</div>Jcaceres