Audio-Video Experiments
Margaret Pinson, Arthur Webster
1993 Audio-Video Proposal
  “Proposed Framework for Subjective Audiovisual Testing,” ANSI contribution
  Audio quality in isolation
  Video quality in isolation
  Audio quality in presence of video
  Video quality in presence of audio
  Audio-video quality
1998 Audio-Video Experiment
  “Development of Opinion-Based Audiovisual Quality Models for Desktop Video-Teleconferencing,” IEEE Workshop
  Video-only session
  Audio-only session
  Audio-video session
  ITU-T P.910
  ACR 5-point scale
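As an illustration of how ratings on the ACR 5-point scale are usually summarized, the sketch below averages each clip's ratings into a mean opinion score (MOS). The function name, the dictionary layout, and the sample ratings are hypothetical and do not come from the experiment.

```python
from statistics import mean

# ACR 5-point scale labels as defined in ITU-T P.910.
ACR_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_scores(ratings_by_clip):
    """Return {clip_id: MOS} from {clip_id: [ACR ratings from each viewer]}.

    Hypothetical helper: the input layout is an assumption for illustration.
    """
    mos = {}
    for clip, ratings in ratings_by_clip.items():
        if not ratings:
            raise ValueError(f"no ratings for clip {clip!r}")
        if any(r not in ACR_LABELS for r in ratings):
            raise ValueError(f"ratings for clip {clip!r} must be integers 1..5")
        mos[clip] = mean(ratings)
    return mos


# Made-up ratings from three viewers for two clips.
example = mean_opinion_scores({"clip_a": [5, 4, 3], "clip_b": [2, 2, 1, 3]})
```

The same function applies unchanged to the video-only, audio-only, and audio-video sessions, since all three use the same scale.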
1998 Audio-Video Experiment
  ½-inch component video tape playback
  17-inch computer screen and speakers
  SVGA (800x600) resolution
  Sound isolation booth
  Coding-only impairments
  Predicted audio-video score ŝ modeled from the video and audio scores
  Correlation between sessions:
    Audio and video sessions: ρ = 0.29
    Audio and audio-video sessions: ρ = 0.41
    Video and audio-video sessions: ρ = 0.97
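The session-to-session correlations above are the kind of figure obtained by correlating per-clip scores from two sessions. A minimal sketch, assuming ρ denotes the Pearson correlation coefficient (the slide does not say which coefficient was used) and using made-up scores:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of per-clip scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (spread_x * spread_y)


# Hypothetical per-clip MOS values from two sessions (not experiment data).
video_session = [1.8, 2.5, 3.4, 4.1]
audio_video_session = [2.0, 2.6, 3.5, 4.3]
rho = pearson(video_session, audio_video_session)
```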
1999 Interactive Experiment
  Two sound isolation booths
  Interactive task: building a Lego toy
  Rate the communication experience
  Network impairments (no documentation available)
2008 to 2009 ITS Multimedia #1
  ACR 5-point scale
  Rate quality of audio-video sequence
  Computer monitor and speakers
  ITU-T P.910 lighting
  Sound isolation booth
  12-sec sequences, CIF resolution
  Mono audio
ITS Multimedia #1
  Two full matrices (4 audio x 4 video)
    Set A = voice only
    Set B = music and mixed
  Coding-only impairments
  Goal = weighted combination of audio and video quality that predicts audio-video quality
  Objective video and audio metrics
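One way to read the "weighted combination" goal is as a linear least-squares fit over the 16 cells of a 4 x 4 matrix. The sketch below assumes the form av ≈ w_a·audio + w_v·video + c, which is an illustrative choice and not necessarily the model the experiment adopted; the function name and all numbers are hypothetical.

```python
def fit_weighted_combination(audio, video, av):
    """Least-squares fit of av ≈ w_a*audio + w_v*video + c.

    Returns (w_a, w_v, c). Solves the 3x3 normal equations by Gaussian
    elimination so that no third-party library is needed.
    """
    rows = [(a, v, 1.0) for a, v in zip(audio, video)]
    n = 3
    # Augmented normal-equation matrix [X^T X | X^T y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, av))]
        for i in range(n)
    ]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design is degenerate; cannot fit weights")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return tuple(w)


# Synthetic 4 audio x 4 video matrix with made-up quality levels.
levels = [1.5, 2.5, 3.5, 4.5]
audio_q = [a for a in levels for _ in levels]
video_q = [v for _ in levels for v in levels]
av_q = [0.3 * a + 0.6 * v + 0.2 for a, v in zip(audio_q, video_q)]
w_a, w_v, c = fit_weighted_combination(audio_q, video_q, av_q)
```

A full matrix matters here: because every audio level is paired with every video level, the audio and video columns are not collinear and the two weights can be separated.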
ITS Multimedia #1
  Video quality dominated
  Single person talking: too easy
  In progress, with different viewers:
    Same material
    Audio-only session
    Video-only session
2008 to 2009 ITS Multimedia #2
  ACR 5-point scale
  Also: Yes/No acceptability scale
  Computer monitor and speakers
  ITU-T P.910 lighting
  Sound isolation booth
  12-sec sequences, CIF resolution
  Mono audio
ITS Multimedia #2
  Audio-video de-synchronization impairments
  Audio delay from -0.5 to 0.5 sec
  Explore impact of differential delay between audio and video
  Audio lagging video is better
  90 ms delay preferred
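A de-synchronization condition of this kind can be produced by shifting the audio track against the video. The sketch below is a hypothetical helper, not the tool used in the experiment. It assumes the sign convention that a positive delay means audio lags video, and it fills the gap with silence so the clip length is unchanged.

```python
def shift_audio(samples, delay_sec, sample_rate):
    """Shift audio relative to video while keeping the sample count fixed.

    Assumed convention: delay_sec > 0 makes audio lag video (silence is
    prepended and the tail is dropped); delay_sec < 0 makes audio lead
    (the head is dropped and silence is appended).
    """
    samples = list(samples)
    shift = min(int(round(abs(delay_sec) * sample_rate)), len(samples))
    if shift == 0:
        return samples
    silence = [0] * shift
    if delay_sec > 0:
        return silence + samples[: len(samples) - shift]
    return samples[shift:] + silence


# Toy example: 4 samples at a 4 Hz "sample rate", shifted by half a second.
lagging = shift_audio([1, 2, 3, 4], 0.5, 4)
leading = shift_audio([1, 2, 3, 4], -0.5, 4)
```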
2009 Multimedia #3
  ACR 5-point scale
  Rate quality of audio-video sequence
  Monitor and speakers undecided
  ITU-T P.910 lighting
  Sound isolation booth likely
  15-sec sequences, HDTV resolution
  Stereo audio
Multimedia #3
  Two full matrices (4 audio x 4 video)
    Set A = H.264 coding only
    Set B = MPEG-2 coding only
  Audio: mixture of speech, music, and noise
  Goal = weighted combination of audio and video quality that predicts audio-video quality
  In progress
Audio-Video Market Needs
  Objective audio model applicable to a wide range of content
    Single person talking
    Music
    Background noise
    Mixture of the above
  Publicly available audio databases to encourage research on this topic
    Preferably with accompanying video