Britt Carr (Miami University)
Abstract
Originally conceived to help music students conquer stage fright, the virtual audience — a life-sized video audience — is a unique combination of webcam technology, artificial intelligence and Flash streaming video. The virtual audience listens, watches and reacts to students as they perform music or drama, or give speeches. This presentation, which includes a demonstration, will focus on the tool’s simple construction and the outcome of the first semester of real-use and future implementations. For more information and a free download of the Virtual Audience, please visit my blog: http://learningactivities.wordpress.com/.

A still image of the Virtual Audience reacting to a student’s performance.
The Problem: Stage Fright
I was first approached by Dr. Harvey Thurmer and Professor Michele Gingras, two faculty members in Miami University’s Department of Music, about making a digital audience for their students. Both were in a Faculty Learning Community, and both were looking to further integrate technology into their teaching. They both had also witnessed the phenomenon of their students suffering from crippling stage fright, even when well-practiced and prepared for performances. They were looking for a way to combat the problem digitally.
As explained to me by Professor Gingras, some music students suffer from well-established glossophobia (performance anxiety). They may have practiced their musical piece a thousand times, knowing it so well that they could carry on a conversation with a close friend while performing. They can simply relax and let their mind wander while their muscles replay the piece when they are in the privacy of a practice studio, in their room, or at home.
However, once students perform, problems arise. While on stage and very self-aware, students try to allow muscle memory, breathing, fingertips, and/or lips to work in synchrony, and replicate all that they practiced. But once an audience member moves or sneezes or coughs, performers lose their concentration and a world of self-consciousness takes over. Performers realize that the audience is there, watching them. Rather than relaxing and letting the body carry out its assigned duties, musicians (or actors or speakers) may begin to over-think and become hyper-focused on their behavior or appearance. Worse, individuals with stage fright commonly start imagining how the audience is judging their abilities.
Gingras and Thurmer’s goal was to use technology to help students overcome this kind of anxiety by helping students avoid becoming distracted. Essentially, they were trying to find ways to desensitize performers to the things that commonly occur during a performance so that they would not be thrown off balance.
As a performer myself, I am familiar with these distractions. On several occasions, my train of thought has been derailed (usually during a conference presentation) by someone’s persistent cough, or worse, hearing a person’s cell phone launch into the first verse of Sir Mix-a-lot’s “Baby Got Back!” They MUST have forgotten to turn it off. Ummm. So where was I?
To help students learn to avoid losing focus during a performance, Thurmer and Gingras suggested that it would be useful to videotape an audience acting out a number of “distractions.” Initially, we discussed creating a DVD that would loop endlessly. Performers, in this case, music performance students, could watch the DVD on a TV at home, or on a computer or laptop, and have a full audience at their disposal, complete with a number of distracting mannerisms, at any time.
I had two immediate concerns about using DVD as the primary medium for delivery.
First, a television or laptop screen is too limiting. For performance students, the Virtual Audience (VA) would be more realistic if the viewing size was much larger, or at least somewhat close to life-size. I suggested that projecting it on a wall would be a nicer option.
Second and more importantly, the static nature of a looping DVD is problematic. Although a DVD can play for an infinite amount of time (or as long as the performance requires) the audience distractions occur in the same order every time. So, as students practice performing with the DVD, they would begin to memorize and anticipate what the audience would do next. For example, a student may notice that a person in the third row is sneezing at a particular point in the DVD, which is followed soon after by someone in the first row turning his head and whispering to his neighbor. Students would learn to anticipate these distractions, which would greatly diminish the effectiveness of the learning tool.
Randomization for Mastery
After some reflection on these concerns, I decided I’d like to try to create a version of the Virtual Audience that was large and dynamic.
During the course of my work creating learning objects, I developed what I refer to as “randomization theory.” This theory is based on the assumption that randomization helps with mastery of a topic. In the context of this learning object, randomization serves as a constant series of unpredictable distractions.
In the DVD version of the Virtual Audience, the audience performs the same distractions in the same order. Humans learn by recognizing the patterns in a given situation. When information is no longer novel, it is expected or taken for granted.
When new information is introduced, the brain has to reanalyze the situation to determine the differences. In a dynamic version of the VA, randomizing the distraction clips will serve this purpose by mimicking real situations that occur in an audience setting. This forces a performer to refocus during each distraction, or eventually, learn to ignore them.
To re-create an audience in as realistic a fashion as possible, I decided it was necessary to mix up the different distractions. This could be done rather easily using Adobe Flash’s ActionScript to programmatically assign numbers to video clips of distractions, then shuffle those numbers in a random order, and load the associated video clip when it is chosen.
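The shuffle described above can be sketched roughly as follows. This is an illustration in Python rather than the project's ActionScript, and the clip names and function names are hypothetical, not taken from the Virtual Audience source.

```python
import random

# Hypothetical clip names standing in for the taped distraction videos.
DISTRACTION_CLIPS = ["coughing", "sneezing", "whispering", "laughing", "crossing_legs"]


def shuffled_indices(clip_count, rng):
    """Return the clip numbers 0..clip_count-1 in a random order."""
    order = list(range(clip_count))
    rng.shuffle(order)
    return order


def clip_sequence(clips, rng):
    """Yield clip names forever, reshuffling each time the stack runs out."""
    while True:
        for index in shuffled_indices(len(clips), rng):
            yield clips[index]
```

Calling `next()` on `clip_sequence(DISTRACTION_CLIPS, random.Random())` returns the name of the next clip to load. Reshuffling after every pass is one possible design choice here; it keeps the order from repeating, so a performer cannot memorize it.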
V-ROOM
After discussing my idea of a larger, more dynamic version of the Virtual Audience, Professor Gingras decided to add the VA to a practice room the school of Music was currently upgrading with technology. The room, eight feet square, would house a computer with piano accompanist software and an effects-processing system used to amplify a student’s instrument with the ambience of a variety of performance spaces. A student could choose “Carnegie Hall” or any other large (or small) venue on the effects processor and it would virtually give the practice room the acoustics of that performance hall.
The addition involved simply installing the Virtual Audience software on the practice room’s computer, and adding a ceiling-mounted projector that displays the VA on one full wall of the room. All of these things combined (including the heat from the projector bulb simulating the warmth of stage lights) help give the feel of a live performance setting.
Analyzing Audience Behavior
To determine how a digital audience might be assembled, I started by analyzing audience behavior. I watched a number of online and in-person performances to research how audiences act. I determined that making a dynamic audience required more than just throwing together random clips. After all, some behavior is expected and does happen in a consistent fashion. In a performance situation, especially a music recital, there are standard phases the audience assumes:
- Waiting Phase. Audience sits and talks while they wait for the performer to take the stage.
- Welcoming Phase. Audience claps when performer comes out on stage (about 8 seconds, unless the performer is a pop icon, or person of notoriety).
- Watching Phase. Audience watches performance (this is where distractions occur) and waits for a visual and/or audible cue when the performer is done (usually the musician will stop playing, put an instrument down, or step away from the podium or microphone).
- Clapping Phase. Depending upon the mood of the collective, there is a polite clap, a partial ovation, or a complete standing ovation.

Phases of Audience Behavior
The Inspiration
About a year before Thurmer and Gingras approached me about this project, I had been awestruck by an interactive advertising campaign designed to sell software. It was called “Studio 8: Meet Your Match”. The ad took a user’s information and played back video in different sequences depending upon how the user clicked through the information.
Just a few months later, I became inspired by Guy Watson’s work with Flash’s BitmapData class as a motion detection system.
Here’s how it works. A webcam takes 15-30 digital pictures per second. Flash analyzes each picture’s digital data as a series of numbers. Simply put, if the data is different from one picture to the next, the program interprets that there must be movement in front of the camera, and a different action is required.
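The frame-comparison idea can be illustrated with a small sketch. It is written in Python rather than ActionScript, it is not Guy Watson's code, and it does not use the Flash BitmapData API. Frames are assumed to be flat lists of grayscale values (0 to 255), and both thresholds are made-up values that would need tuning for a real camera.

```python
def changed_fraction(previous, current, pixel_threshold=30):
    """Fraction of pixels whose brightness changed by more than pixel_threshold.

    Both frames are equal-length sequences of grayscale values (0-255).
    """
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of pixels")
    if not current:
        return 0.0
    changed = sum(
        1 for before, after in zip(previous, current)
        if abs(after - before) > pixel_threshold
    )
    return changed / len(current)


def motion_detected(previous, current, pixel_threshold=30, area_threshold=0.05):
    """True when enough of the picture changed between two frames."""
    return changed_fraction(previous, current, pixel_threshold) >= area_threshold
```

The per-pixel threshold ignores small changes such as camera noise. The area threshold means a few changed pixels are not treated as a performer walking into view.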
It occurred to me that if a webcam can help function as the sensory ‘eyes and ears,’ ActionScript could be the ‘brain’ to help the VA come to life and be the dynamic, video solution I was hoping for. A webcam can watch and listen to the performer, and as the performer starts or stops playing, the program can detect changes in the performance and move the audience into a different phase. For example, when the webcam sees a performer step in front of the computer, it can tell the audience to clap, to welcome the performer. Likewise, when the webcam hears the music stop, it can tell the audience to clap.
Recording Audience Behavior
While watching a number of different audiences in different situations, I noted that phases 1, 2 and 4 (“waiting,” “welcome clapping,” and “end clapping”) occur with more consistent behavior, and the greatest number of distractions typically occurs during the “watching phase” of a performance. To capture these patterns, we recorded a real audience acting out these phases.
Michele Gingras assembled roughly 120 students in an open recital hall. We set a video camera center-stage, and I picked audience members willing to participate as actors for the distracting behavior. We shot the basic phases first.
- Talking (waiting)
- Obligatory Welcome Clap
- Watching (being still, but not frozen, but not distracting)
- Obligatory End Clap (the audience is not impressed, but polite)
- Partial Standing Ovation (Some of the audience is very impressed)
- Full Standing Ovation (All of the audience is very impressed)
After the basic phases were completed, using a list of distractions, I would call out the scene, and have a random volunteer actor do the distraction, while others sat still. The distracting behaviors to be taped included:
- Coming in late
- Leaving early
- Cell phone ringing and an actor trying to stop it
- Whispering
- Laughing
- Falling asleep
- Sneezing
- Coughing
- Crossing legs
- Clapping out of turn**
** As someone who doesn’t always know the proper etiquette of a music recital, I have on numerous occasions clapped during the break in between movements of a piece, thinking that the piece was over, and trying to show my support. Usually, I was the only one who did, and very quickly realized that I needed to wait.
http://media.nmc.org/2009/proceedings/dist_leaving.flv
http://media.nmc.org/2009/proceedings/coughing.flv
Assembling the Phases and Distractions as the Virtual Audience
After taping the audience performing different phases and distractions, we used Flash’s ActionScript to “move” the VA from phase to phase.
We used the webcam to trigger the audience so that it would transition automatically from the first phase, waiting, to welcoming the subject. The webcam sees the performer “take the stage” by walking in front of the webcam (motion detection). After the obligatory eight-second clap, the audience settles into phase three: watching mode.
For the watching phase, we begin calling up random clips, and let them play for random amounts of time while the audience “listens” to the subject perform (through the webcam’s microphone). This continues infinitely, or until the performer stops playing. The audience continues doing distracting things such as coughing, sleeping, or sneezing, until the program senses 2.5 seconds of silence.
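One way to picture the silence check is a timer that restarts whenever sound is heard. The sketch below is in Python rather than ActionScript. Only the 2.5-second figure comes from the text above; the class name, the 0.0 to 1.0 level scale, and the `quiet_level` cutoff are assumptions for illustration.

```python
class SilenceDetector:
    """Report when the input level has stayed quiet for `required_seconds`."""

    def __init__(self, required_seconds=2.5, quiet_level=0.05):
        self.required_seconds = required_seconds
        # Assumed scale: 0.0 is silent, 1.0 is as loud as the microphone reports.
        self.quiet_level = quiet_level
        self.quiet_time = 0.0

    def update(self, level, elapsed_seconds):
        """Feed one level reading; return True once silence has lasted long enough."""
        if level > self.quiet_level:
            self.quiet_time = 0.0  # any sound restarts the count
        else:
            self.quiet_time += elapsed_seconds
        return self.quiet_time >= self.required_seconds
```

Calling `update()` once per microphone reading, with the time since the previous reading, is enough to drive it. A short rest in the music resets nothing on its own, but any note played before 2.5 seconds have passed restarts the count.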
When Flash detects the absence of sound for more than the allotted time, a random ending clip is loaded, which initiates the “clapping phase.” Here, three alternate endings were taped. The first option is obligatory clapping. The audience is polite, but not too generous. The second option is a partial ovation. Some people in the audience are moved to the point of standing up to show their support for the performer. The third option is a total standing ovation. Everyone in the audience is awestruck and stands and gives an extended applause.
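The phase transitions described in this section amount to a small state machine. The following Python sketch is a hedged illustration, not the project's ActionScript. The phase names, ending names, and method names are hypothetical; the eight-second welcome and the three endings come from the text.

```python
import random

WAITING, WELCOMING, WATCHING, CLAPPING = "waiting", "welcoming", "watching", "clapping"
ENDINGS = ["obligatory_clap", "partial_ovation", "full_ovation"]
WELCOME_SECONDS = 8.0


class AudiencePhases:
    """Track which phase the audience is in and react to webcam events."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.phase = WAITING
        self.welcome_elapsed = 0.0
        self.ending = None

    def on_motion(self):
        """The performer walked in front of the webcam."""
        if self.phase == WAITING:
            self.phase = WELCOMING
            self.welcome_elapsed = 0.0

    def on_tick(self, elapsed_seconds):
        """Advance time; the welcome clap gives way to watching."""
        if self.phase == WELCOMING:
            self.welcome_elapsed += elapsed_seconds
            if self.welcome_elapsed >= WELCOME_SECONDS:
                self.phase = WATCHING

    def on_silence(self):
        """The performer stopped playing; pick one of the taped endings."""
        if self.phase == WATCHING:
            self.ending = self.rng.choice(ENDINGS)
            self.phase = CLAPPING
```

Each event handler checks the current phase first, so a stray event, such as silence before the performer has arrived, leaves the audience where it is.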

Assembling the phases and distractions
Issues, Challenges, and Solutions
Conceptualizing which pieces needed to exist and imagining how all of these pieces would work together seemed fairly intuitive. However, once the project was assembled we began to encounter a few barriers that kept it from feeling polished.
After some reflection, I was able to determine an easy fix for most of the issues we encountered. Here are the issues we faced and the solutions that helped to make the VA more realistic:
Issue: Counting Movements: In a lot of recital instances, songs have more than one movement and there is usually a break between movements.
Solution: An interface element was added to allow the subject to specify how many movements exist in the piece of music being performed. Logic needed to be added to count those breaks as movements and to trigger the clapping once the final movement had finished.
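The counting logic might look something like this. It is a Python sketch under the assumption that each detected stretch of silence marks the end of one movement; the class and method names are hypothetical.

```python
class MovementCounter:
    """Hold the applause until the last movement of the piece has finished."""

    def __init__(self, total_movements):
        if total_movements < 1:
            raise ValueError("a piece has at least one movement")
        self.total_movements = total_movements
        self.finished_movements = 0

    def on_silence(self):
        """Call once per detected break; return True when the audience should clap."""
        if self.finished_movements < self.total_movements:
            self.finished_movements += 1
        return self.finished_movements >= self.total_movements
```

For a three-movement piece, the first two breaks return `False` and the audience keeps watching. The third returns `True` and the clapping phase can begin.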
Issue: Some clips should only play once: Like the movement counts, some distractions should only happen once. If an audience member leaves early, or comes in late, those particular distractions shouldn’t play again. Likewise, the “first-movement clapper” usually learns from the mistake the first embarrassing time around, and waits for others to start clapping first, the next time.
Solution: After these mentioned clips play, they are removed from the shuffled stack of available clips, to ensure they only play once.
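A minimal sketch of the play-once rule, in Python rather than ActionScript. The clip names and the `PLAY_ONCE` set are hypothetical labels for the distractions named above.

```python
import random

# Hypothetical names for the distractions that should never repeat.
PLAY_ONCE = {"coming_in_late", "leaving_early", "clapping_out_of_turn"}


def next_clip(available, rng):
    """Pick a random clip; drop it from `available` if it may only play once."""
    clip = rng.choice(available)
    if clip in PLAY_ONCE:
        available.remove(clip)
    return clip
```

Because the clip is removed from the list itself, later calls cannot pick it again, while repeatable distractions such as coughing stay available for the rest of the performance.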
Issue: Visual Continuity: Drastic visual changes occur as audience movie clips switch from distraction to distraction. Referred to as jump cuts, these are caused by audience members not being in the same precise location from clip to clip.
The above Flash file shows the visual discontinuity between two different scenes. Click and release the mouse to view.
Two partial solutions, when applied together, can help solve this problem:
1. Loading the new clip in underneath the existing clip, then fading the existing clip out (a dissolve) makes the switch more gradual, and less apparent.
2. Aligning audience members at the beginning and end of each clip. Using the Snap-Cap tool (a custom-built Flash Application that takes a picture as I yell “Cut!” to the audience) allows them to reposition themselves with reasonable accuracy to the place they were in as the shot ended.
Click the Flash movie above, then move, and try to re-align yourself. Click again! Note: This interactive tool requires a webcam.
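The dissolve in the first partial solution comes down to lowering the opacity of the outgoing clip a little on each frame while the new clip plays underneath. A rough Python sketch follows; the function name and the linear ramp are assumptions, since the text does not say how the fade was timed.

```python
def dissolve_alphas(frame_count):
    """Opacity of the outgoing (top) clip on each frame, from 1.0 down to 0.0."""
    if frame_count < 2:
        raise ValueError("a dissolve needs at least two frames")
    return [1.0 - step / (frame_count - 1) for step in range(frame_count)]
```

With five frames, the top clip goes from fully opaque to fully transparent in equal steps, revealing the new clip beneath it.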
Looking Ahead
The original prototype was shot in standard 4:3 NTSC video at a resolution of 720×480 — standard television resolution. When the image is projected to fill an 8’x8’ area, artifacts in the digital video are apparent. Re-shooting in HD will help improve the clarity when projected at such a large size.
Use of the Snap-Cap tool during shooting will help with audience alignment and soften the visual transition between clips.
Some usability improvements are necessary, such as letting the user calibrate the input level for loud and soft instruments and for room acoustics. Currently the VA works best in a quiet setting.
An optimized version for online delivery will allow the VA to be used in a wider variety of settings. Miami University’s Speech Communication program uses a tool that helps students practice delivering speeches (The Impromptu Speech Widget). Currently, students see a mirrored image of themselves when practicing a speech. Future versions of the VA would allow students to give their speech to a “live” distance audience.
Other Uses
In addition to music, it occurred to me that this application could be used in almost any situation where public performance could be the outcome. Among those:
- Theatre – soliloquy
- Forensics and Debate
- Marketing Presentations
- Speech Pathology and Audiology
Audience members at the 2009 NMC Summer Conference also suggested letting new faculty or Teacher Ed students use this as a tool to help them stay focused while lecturing. They also proposed additional features, such as allowing a faculty member to control the distractions at will.
Conclusion
The Virtual Audience’s planned deployment date was mid-February 2009. Due to some administrative hurdles, the opening of the practice room housing the Virtual Audience was delayed until fall 2009.
Since the initial blog posts and presentations, I have been contacted by two independent psychologists who sought permission to use it with their patients. To accommodate these and similar requests, I offer a free download of the Virtual Audience through my blog. In return, I ask for end-user feedback and research results, if there are any.
Acknowledgements:
Programming and ActionScript Development: Ryan Davidson
Video Editing and Encoding: Adam Baumgartner
Videography: Craig Rouse
V-Room Graphic Design: Yvonne Yau
Contact Information:
Britt Carr
carrbc@muohio.edu
AIR File: http://www.academic.muohio.edu/virtualaudience
Blog: http://learningactivities.wordpress.com

