User Benefits of Non-Linear Time Compression Liwei He and Anoop Gupta Microsoft Research.


1 User Benefits of Non-Linear Time Compression Liwei He and Anoop Gupta Microsoft Research

2 Introduction
- Time compression: key to browsing AV content
- We focus on informational content
- Audio time compression algorithms
  - Linear: speed up audio uniformly
  - Non-linear: exploit the fine-grain structure of human speech (e.g. pauses, phonemes)
- How much more do users gain from more complex algorithms?

3 Methodology
- Conduct user listening test
  - One Linear TC algorithm
  - Two Non-linear TC algorithms
    - Simple: Pause-removal followed by Linear TC
    - Sophisticated: Adaptive TC
- Compare objective and subjective measurements

4 Time Compression Algorithms

5 Linear Time Compression
- Classic algorithms
  - Overlap Add (OLA) and Synchronized OLA (SOLA)
  - We use SOLA
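
As a rough illustration of the SOLA idea named on this slide, the sketch below reads frames from the input at a larger step than it writes them to the output, and slides each new frame by a small offset so that it lines up with the audio already written before cross-fading. The function name `sola`, the frame, overlap and search sizes, and the linear cross-fade are all assumptions chosen for illustration; they are not taken from the talk.

```python
import math

def sola(signal, speed, frame=400, overlap=100, search=50):
    """Shorten `signal` (a list of floats) by roughly `speed` without resampling."""
    synth_hop = frame - overlap               # output advance per frame
    ana_hop = int(round(synth_hop * speed))   # input advance per frame
    out = list(signal[:frame])
    pos = ana_hop
    while pos + search + frame <= len(signal):
        tail = out[-overlap:]
        # Find the shift whose first `overlap` samples best match the output tail.
        best_k, best_score = 0, -math.inf
        for k in range(search + 1):
            seg = signal[pos + k:pos + k + overlap]
            dot = sum(a * b for a, b in zip(tail, seg))
            norm = math.sqrt(sum(b * b for b in seg)) or 1.0
            if dot / norm > best_score:
                best_k, best_score = k, dot / norm
        start = pos + best_k
        # Linear cross-fade from the old tail into the aligned new frame.
        for i in range(overlap):
            w = i / overlap
            out[-overlap + i] = (1.0 - w) * tail[i] + w * signal[start + i]
        out.extend(signal[start + overlap:start + frame])
        pos += ana_hop
    return out

# Hypothetical test tone: one second of a 200 Hz sine at 8 kHz.
tone = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
fast = sola(tone, 2.0)   # about half as many samples as `tone`
```

Because samples are copied rather than resampled, the output is shorter while the pitch stays roughly unchanged, which is the point of OLA-style methods.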

6 Non-Linear Time Compression
- Algorithm 1: Pause removal plus TC
  - Energy and Zero Crossing Rate analysis
  - Leave pauses of 150 ms or less untouched
  - Shorten pauses longer than 150 ms to 150 ms
  - Apply SOLA algorithm
  - PR shortens speech by 10-25%
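
The pause-removal step can be sketched as follows, under stated assumptions: the signal is cut into 10 ms frames, a frame counts as a pause when both its energy and its zero-crossing rate fall below thresholds, and only the first 150 ms of any run of pause frames is kept. The thresholds, the frame length, the choice to keep the start of the pause, and the helper names (`frame_features`, `remove_pauses`, `linear_factor`) are hypothetical and not from the talk.

```python
import math

def frame_features(frame):
    """Return (mean energy, zero-crossing rate) of one frame."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy, crossings / len(frame)

def remove_pauses(signal, rate, frame_ms=10, max_pause_ms=150,
                  energy_thresh=1e-4, zcr_thresh=0.25):
    """Cap every pause at `max_pause_ms`; shorter pauses pass through unchanged."""
    n = rate * frame_ms // 1000
    max_frames = max_pause_ms // frame_ms
    out, run = [], 0
    for i in range(0, len(signal), n):
        frame = signal[i:i + n]
        energy, zcr = frame_features(frame)
        # Assumed rule: low energy AND low ZCR, so quiet fricatives are not dropped.
        silent = energy < energy_thresh and zcr < zcr_thresh
        run = run + 1 if silent else 0
        if run <= max_frames:
            out.extend(frame)
    return out

def linear_factor(overall_speed, original_len, shortened_len):
    """Linear TC factor still needed after pause removal to reach `overall_speed`."""
    return overall_speed * shortened_len / original_len

# Hypothetical clip: 0.5 s tone, 0.4 s silence, 0.5 s tone at 8 kHz.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * t / rate) for t in range(4000)]
clip = tone + [0.0] * 3200 + tone
shortened = remove_pauses(clip, rate)
remaining = linear_factor(1.5, len(clip), len(shortened))
```

If the target speed is taken to mean overall duration, the linear stage has less work to do after pause removal: when pauses account for 20% of a clip, an overall 1.5x needs only 1.2x from the linear stage.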

7 Non-Linear Time Compression (cont.)
- Algorithm 2: Adaptive TC
  - Mimics how people talk fast
  - Pauses and silences are compressed the most
  - Stressed vowels are compressed the least
  - Consonants are compressed more than vowels
  - Consonants are compressed based on neighboring vowels
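
One way to picture non-uniform compression is to give each segment class a relative compressibility and then scale those values until the whole clip reaches the target speed. The sketch below does only that. The class labels, the `WEIGHTS` values and the function `local_rates` are hypothetical, segment labels are assumed to be given, and the dependence of consonants on neighboring vowels is not modeled. It is not the adaptive algorithm evaluated in the talk.

```python
# Hypothetical relative compressibility per segment class (assumed values).
WEIGHTS = {"pause": 3.0, "consonant": 1.5, "vowel": 1.0, "stressed_vowel": 0.5}

def local_rates(segments, target_speed, iters=60):
    """segments: list of (label, duration_s). Returns one speed-up factor per segment.

    Each factor is 1 + c * weight; c is found by bisection so that the
    compressed durations add up to total / target_speed (target_speed >= 1).
    """
    total = sum(d for _, d in segments)
    goal = total / target_speed

    def out_len(c):
        return sum(d / (1.0 + c * WEIGHTS[label]) for label, d in segments)

    lo, hi = 0.0, 1.0
    while out_len(hi) > goal:      # grow the bracket until it contains the goal
        hi *= 2.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if out_len(mid) > goal:
            lo = mid
        else:
            hi = mid
    c = (lo + hi) / 2.0
    return [1.0 + c * WEIGHTS[label] for label, _ in segments]

# Hypothetical labeled segments (durations in seconds).
segs = [("pause", 0.4), ("consonant", 0.1), ("stressed_vowel", 0.2),
        ("consonant", 0.1), ("vowel", 0.15)]
rates = local_rates(segs, 2.0)
compressed = sum(d / r for (_, d), r in zip(segs, rates))
```

With these assumed weights the pause gets the largest factor and the stressed vowel the smallest, while the clip as a whole still plays at the requested 2x.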

8 System Implications
- Computational complexity
  - Adaptive TC 10x more costly than Linear TC
- Complexity in client-server implementation
  - Buffer management required for non-linear TC
- Audio-video synchronization quality

9 User Study Method

10 User Study Goals
- Highest intelligible speed
- Comprehension
- Subjective preference
- Sustainable speed

11 Experiment Method
- 24 subjects
- 4 tasks for each subject
- 3 time compression algorithms
  - Linear TC using SOLA (Linear)
  - Pause removal plus Linear TC (PR-Lin)
  - Adaptive TC (Adapt)
- Each test takes approximately 30 minutes

12 Highest Intelligible Speed Task
- 3 clips from technical talks
- Find the highest speed at which most of the words are understandable

13 Comprehension Task
- 3 clips at 1.5x and 3 clips at 2.5x
- Clips from TOEFL listening test
- Answer 4 multiple choice questions

14 Subjective Preference Task
- 3 pairs of clips at 1.5x
- 3 pairs of clips at 2.5x
- Each pair contains the same clip compressed with 2 of the 3 TC algorithms
- Indicate preference on 3-point scale

15 Sustainable Speed Task
- 3 clips, each 8 minutes long
- Clips from a CD audio book
- Find the maximum comfortable speed
- Write a 4-5 sentence summary at the end

16 User Study Results

17 Highest Intelligible Speed Task PR-Lin is significantly better than Adapt (p<.01)

18 Comprehension Task Adapt is better than PR-Lin (p=.083) at 2.5x

19 Preference Task at 1.5x
- Slight preference for PR-Lin (p=.093)

1.5x                Prefer Former   Prefer None   Prefer Latter
Linear vs. PR-Lin   6               5             13
PR-Lin vs. Adapt    13              5             6
Adapt vs. Linear    8               8             8

20 Preference Task at 2.5x
- PR-Lin and Adapt do significantly better than Linear

2.5x                Prefer Former   Prefer None   Prefer Latter
Linear vs. PR-Lin   2               8             14
PR-Lin vs. Adapt    4               9             11
Adapt vs. Linear    21              3             0
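
The slides do not say which statistical test produced the reported significance. As one common option for paired preferences, the sketch below runs a two-sided exact sign test that ignores "Prefer None" answers. The 2.5x counts are read here as 2/8/14, 4/9/11 and 21/3/0, on the assumption that each row sums to the 24 subjects; the function name `sign_test_p` is hypothetical, and the resulting values need not match the authors' own analysis.

```python
from math import comb

def sign_test_p(former, latter):
    """Two-sided exact sign test p-value; ties (no preference) are excluded."""
    n = former + latter
    k = min(former, latter)
    # Probability of a split at least this lopsided under a fair coin, one side.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# (prefer former, prefer latter) at 2.5x, as read from the slide (assumed split).
counts_2_5x = {
    "Linear vs. PR-Lin": (2, 14),
    "PR-Lin vs. Adapt": (4, 11),
    "Adapt vs. Linear": (21, 0),
}
p_values = {pair: sign_test_p(*c) for pair, c in counts_2_5x.items()}
```

Under this test both comparisons against Linear come out well below 0.05, which is in line with the slide's statement that the two non-linear algorithms are preferred at 2.5x.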

21 Sustainable Speed Task

22 Conclusions

23 Previous Works
- Mach1 (Covell et al., ICASSP 98)
  - Comprehension and preference tasks
  - Comparing Linear and Mach1 (Adapt) at 2.6-4.2x
  - Comprehension scores 17% better w/ Mach1
  - 95% prefer Mach1 to Linear
  - No data on < 2.0x
- Other works (Harrigan, Omoigui, Li, Foulke)
  - 1.2-1.7x is the sustainable listening speed

24 Conclusions
- Trade-off in TC algorithms is task-related
  - Listening: Linear TC is sufficient
  - Fast forwarding: Non-linear TC is more suitable
- Adapt TC is close to the way people talk fast
  - The limit lies in human listening and comprehension

