Presenter: Ivan Chiou
All come from Electrical and Computer Engineering, Carnegie Mellon University:
– Zheng Sun, PhD student in CyLab Mobility Research Center
– Aveek Purohit, Ph.D. candidate
– Raja Bose, Microsoft Silicon Valley, KarMode LLC
– Pei Zhang, Assistant Research Professor
Spartacus: a mobile system that enables spatially-aware interactions between neighboring devices with zero prior configuration.
– Uses only built-in microphones and speakers.
– Uses the Doppler effect to enable an interaction through a pointing gesture.
– Uses an audio-based low-power listening mechanism to trigger the gesture detection service.
Experiment
– 90% device selection accuracy within 3m
– Lower energy consumption
Recent research still requires an initial channel of communication, such as Wi-Fi or Bluetooth.
Key contributions of Spartacus:
– A novel acoustic technique based on the Doppler effect
– A novel undersampling audio signal processing pipeline
– Low-power listening, which reduces energy consumption and requires no manual action from users
– Experimental validation
How it works:
– A user initiates an interaction by quickly pointing her mobile phone towards the target device.
– Neighboring devices perform low-power listening using their built-in microphones; an audio beacon with a short duration serves as the initiator.
– Implemented on the Android mobile platform without extra hardware.
High-Resolution Doppler-Shift Detection
– Pointing gestures of average users are usually transient (shorter than 0.5s).
– Increases the frequency-domain resolution by 5X compared with traditional FFT-based approaches.
High-Accuracy Device Selection
– Accurately estimates the peak frequency shifts.
– Implements a bandpass audio signal processing pipeline to mitigate high-frequency acoustic noise.
Energy-Efficient Interaction Trigger
– A low-power audio listening protocol to trigger incoming interactions.
How does Spartacus detect the maximum peak frequency shift among the candidate target devices?
– Because the user gestures directionally towards the target device, the target device observes the maximum Doppler shift and is selected.
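The selection idea above can be illustrated with a minimal sketch. It assumes the standard low-speed Doppler approximation (shift ≈ f0 · v · cos θ / c), a speed of sound of 343m/s, and a hypothetical layout of three devices; the 20KHz tone and 3.4m/s peak speed are figures quoted elsewhere in the slides. The function name is illustrative and is not taken from the Spartacus implementation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def doppler_shift(f0_hz, speed_mps, angle_deg):
    """Approximate shift seen by a stationary device at angle_deg off the gesture direction."""
    radial_speed = speed_mps * math.cos(math.radians(angle_deg))
    return f0_hz * radial_speed / SPEED_OF_SOUND

# Hypothetical layout: the target straight ahead, two other devices off-axis.
for name, angle in [("target", 0), ("B", 30), ("C", 60)]:
    print(name, round(doppler_shift(20_000, 3.4, angle), 1), "Hz")
```

The device on the gesture axis sees the full radial speed, so its shift is the largest of the three.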
Deriving Angular Resolution – where fA is the observed tone frequency at DA, f0 is the frequency of the original tone, Fs is the sampling rate, NFFT is the number of FFT points, and the calculated frequency shift is expressed in terms of FFT points.
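One plausible reading of these definitions is sketched below; it is not the paper's derivation. It assumes the shift in FFT points is (fA − f0) · NFFT / Fs, and it takes the angular resolution to be the smallest off-axis angle whose shift differs from the on-axis shift by one FFT point. The speed of sound (343m/s), the 20KHz tone, and the 3.4m/s peak speed are assumed inputs.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def shift_in_bins(f_observed, f0, fs, nfft):
    """Frequency shift expressed in FFT points (one point spans fs / nfft Hz)."""
    return (f_observed - f0) * nfft / fs

def angular_resolution_deg(f0, fs, nfft, peak_speed):
    """Smallest off-axis angle that moves the shift by one FFT point."""
    max_shift_hz = f0 * peak_speed / SPEED_OF_SOUND
    max_bins = max_shift_hz * nfft / fs
    return math.degrees(math.acos(1 - 1 / max_bins))

full_rate = angular_resolution_deg(20_000, 44_100, 2048, 3.4)
undersampled = angular_resolution_deg(20_000, 44_100 / 7, 2048, 3.4)
print(round(full_rate, 1), round(undersampled, 1))
```

Under these assumptions the two values come out at roughly 27 and 10 degrees, which shows why lowering the effective sampling rate sharpens the angular resolution.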
Assume the target device is stationary during the course of the gesture
Improving Resolution using Undersampling
– Increasing the original tone frequency f0: stronger energy degradation
– Increasing the number of FFT points NFFT: higher computational burden
– Decreasing the sampling rate Fs: Spartacus operates at a very high frequency (18KHz), which normally demands a high sampling rate; the undersampling technique can significantly reduce it
Determining Undersampling Parameters
– A higher n implies a higher fL. fL higher than 19KHz is avoided, since it causes greater energy degradation.
– Commodity devices limit audio sampling rates to 8KHz, 16KHz, 32KHz, 44.1KHz, and 48KHz. The constraints hold only when n = 5, 6, or 7 given Fs = 44.1KHz, or when n = 4 given Fs = 48KHz.
– Angular resolution improves from 26.7 degrees to 10 degrees.
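The parameter search can be sketched as follows, under two assumptions that the slides do not state. First, the tone band is assumed to sit in the highest Nyquist zone of the undersampled signal, so that fL = (Fs / 2)(1 − 1/n). Second, a hypothetical floor of 17KHz is assumed for fL to keep the tone near-inaudible. The 19KHz ceiling comes from the slide.

```python
def top_zone_lower_edge(fs, n):
    """Lower edge fL of the highest Nyquist zone after undersampling by n."""
    nyquist = fs / 2
    return nyquist * (1 - 1 / n)

def valid_factors(fs, fl_min=17_000, fl_max=19_000, n_max=12):
    """Undersampling factors whose fL lies in [fl_min, fl_max] (fl_min is hypothetical)."""
    return [n for n in range(2, n_max + 1)
            if fl_min <= top_zone_lower_edge(fs, n) <= fl_max]

for fs in (8_000, 16_000, 32_000, 44_100, 48_000):
    print(fs, valid_factors(fs))
```

With these assumed bounds the search returns n = 5, 6, 7 for 44.1KHz and n = 4 for 48KHz, and nothing for the lower rates, whose Nyquist frequency is below the band.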
Bandpass Signal Processing Pipeline – since the new sampling rate is much lower than the Nyquist rate, aliasing arises in the original sampled audio signals.
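The aliasing mentioned above can be made concrete with a short sketch. It computes where a pure tone lands after sampling below the Nyquist rate, using the standard frequency-folding rule; the 20KHz tone and the 198Hz shift are assumed example values, and the function name is illustrative.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a real tone appears after sampling at fs_hz (folded into 0..fs/2)."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

fs_under = 6300  # 44.1KHz undersampled by 7
print(alias_frequency(20_000, fs_under))
print(alias_frequency(20_198, fs_under))
```

In this example the tone and its Doppler-shifted copy fold down together, so the shift survives undersampling. Bandpass filtering before undersampling keeps energy from other bands from folding onto the same low frequencies.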
We found that M = 1.5 led to robust performance in various indoor environments. After each device detects its Doppler frequency shift, all devices report their frequency shifts to the sender device, along with their device ID information. The sender device then compares all the received Doppler shifts and determines the target device.
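The comparison step on the sender can be sketched in a few lines. The report format (a mapping from device ID to peak shift in Hz) and the function name are assumptions for illustration; the threshold M is not modeled here because the slide does not define it.

```python
def select_target(reports):
    """reports: dict mapping device ID -> reported peak Doppler shift in Hz."""
    if not reports:
        return None  # no device reported a shift
    return max(reports, key=reports.get)

# Hypothetical reports from three neighboring devices.
print(select_target({"A": 171.7, "B": 198.3, "C": 99.1}))
```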
Angular Gain through Pointing Gestures
– When the number of FFT points is 2048, the smallest angular resolution is 10 degrees, reached when the undersampling factor n equals 7.
– When candidate devices are close to the user (i.e. within 3m), the device selection accuracy is better than the analysis predicts. This angular change is significant when the candidate devices DA and DB are close to D0. Assuming the user's arm is 60cm long, the effective angular difference increases to 55 degrees, which makes the two devices much easier to differentiate.
How is Spartacus designed to save energy?
Low-Power Audio Listening
– Advantages
» Ubiquitous hardware support: no extra hardware; only microphones and speakers are needed
» Limited range: easy to detect neighboring devices within the same space
» Energy efficient: designed for continuous discovery
– Protocol: two major modes
» Periodic listening: wake up every Trx and record sound for a duration drx
» Beaconing: after receiving the beacon, a device switches to continuous listening mode to record the gesture
» A short beacon duration consumes more energy, so there is a tradeoff between energy consumption and duty cycles
– Encodes the device ID using Reed-Solomon coding
» Uses a 16-key Frequency Shift-Keying (FSK) scheme with a central frequency at 19KHz
» Keys use 50Hz spacing
» The transmission of the device ID is at least 200Hz lower than the gesture tone, so there are no ambiguities
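A minimal sketch of the FSK key layout follows. It assumes the 50Hz figure is the spacing between adjacent keys and that the 16 keys are placed symmetrically around the 19KHz central frequency; neither detail is confirmed by the slide. Reed-Solomon coding is left out, and all names are illustrative.

```python
CENTER_HZ = 19_000   # central frequency, from the slide
SPACING_HZ = 50      # assumed spacing between adjacent keys
NUM_KEYS = 16

def key_frequency(symbol):
    """Map a 4-bit symbol (0..15) to a tone frequency centered on CENTER_HZ."""
    if not 0 <= symbol < NUM_KEYS:
        raise ValueError("symbol must be in 0..15")
    offset = symbol - (NUM_KEYS - 1) / 2
    return CENTER_HZ + offset * SPACING_HZ

def encode_bytes(data):
    """Split each byte into high and low nibbles and map both to key frequencies."""
    freqs = []
    for byte in data:
        freqs.append(key_frequency(byte >> 4))
        freqs.append(key_frequency(byte & 0x0F))
    return freqs

print(encode_bytes(b"\x0f"))
```

Each key carries four bits, so one byte of a device ID takes two tones under this layout.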
Dealing with Wakeup Jitter
– Jitter is observed between the moment an API is asked to start recording sound and the moment the system actually begins recording.
– Average jitter: 70ms; standard deviation: 15ms
– Empirical measurements are used to solve this problem.
Dealing with Wakeup Jitter – due to the existence of the wakeup jitter, an additional guard band is used in the beacons.
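How a guard band might be sized can be sketched as below. The sketch assumes a common low-power-listening design rule that the slide does not state: the beacon should last at least one wakeup period Trx plus one listening window drx, so that a complete window falls inside it. The guard is taken as the mean jitter plus k standard deviations, where k = 3 and Trx = 1000ms are hypothetical choices.

```python
def min_beacon_ms(t_rx_ms, d_rx_ms, jitter_mean_ms=70, jitter_std_ms=15, k=3):
    """Beacon length covering one wakeup period, one listening window, and a jitter guard."""
    guard_ms = jitter_mean_ms + k * jitter_std_ms
    return t_rx_ms + d_rx_ms + guard_ms

def duty_cycle(t_rx_ms, d_rx_ms):
    """Fraction of time the microphone is recording during periodic listening."""
    return d_rx_ms / t_rx_ms

print(min_beacon_ms(1000, 200), duty_cycle(1000, 200))
```

A longer Trx lowers the duty cycle of every listener but forces the sender to emit a longer beacon.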
Hardware
– Android platform on Galaxy Tab, Nexus 7, Galaxy Nexus, and HTC One S
Software implementation: 4 components
– GestureSensing
» GestureSensing.makeGesture();
» GestureSensing.analyzeGesture();
– LowPowerListening
» LPL.start();
– AudioModem
– GUI
In Spartacus, we use tone frequencies higher than 20KHz, which are inaudible.
– Goal: quantify the energy degradation of sound.
Devices:
– Sennheiser MKE 2P microphone
– Yamaha NX-U10 speaker
Energy degradation above 15KHz:
– Mobile phones are usually designed for human conversations and music, which lie below 15KHz.
– For every 1KHz increase in frequency, the degradation of sound energy increases by 5dB on speakers.
– Sound energy decreases by 3.2dB/m on average from 1m to 6m.
These results indicate that, to reduce energy degradation and increase interaction range, audio tones with lower frequencies should be leveraged.
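The tradeoff can be put into rough numbers with the two measured slopes. This sketch assumes the slopes are linear and simply add up, which the slides do not claim, and the 3.2dB/m figure was only measured between 1m and 6m. The function name is illustrative.

```python
def extra_loss_db(freq_khz, distance_m):
    """Rough loss relative to a 15KHz tone at 1m, using the two measured slopes."""
    speaker_loss = 5.0 * max(0.0, freq_khz - 15.0)      # 5dB per KHz above 15KHz
    distance_loss = 3.2 * max(0.0, distance_m - 1.0)    # 3.2dB per metre beyond 1m
    return speaker_loss + distance_loss

for f in (18, 19, 20):
    print(f, "KHz at 3m:", round(extra_loss_db(f, 3.0), 1), "dB")
```

Under this simple model, each 1KHz step down in tone frequency buys back about as much signal as moving the devices 1.5m closer.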
Challenging questions:
– How diversely do users point their phones, and how fast can a user point?
– If the user points fast enough, how often does the target device observe the highest frequency shift, and thus the highest velocity, of the gesture?
– If we want to estimate the frequency shifts, how much frequency- and time-domain resolution do we need to successfully capture the peak frequency shift inside a gesture?
Participants
– 12 participants (6 females)
– Participants were briefed on the idea of Spartacus before the experiment.
– 10 gestures towards a target device 2m away from them, using a Galaxy Nexus phone
– Hand trajectories of the participants were detected using image processing techniques.
Finding 1
– Three types of gestures were observed.
– Most of the participants fully stretched out their arms.
– The current design of Spartacus focuses on evaluating this vertically downward gesture trajectory.
Finding 2
– Participants faced towards the target device, with an average ±7.5 degree angular bias.
– Participants pointed the phones precisely towards the target device.
– The target device can be selected using the maximum velocity.
Finding 3 – The peak velocity of the gestures of all participants was 3.4m/s on average – Most of the gestures lasted less than one second, and the peak velocities appeared and diminished within 25ms. – Spartacus needs a high time-domain resolution to position the peak frequency shifts
– A Galaxy Nexus phone was pointed 25 times towards the target device, with a peak velocity of about 3m/s.
– 20 of the 25 gestures were selected for analysis.
– Audio was captured at the two candidate devices at 44.1KHz and undersampled by a factor of 7 to 6.3KHz.
Performance with Distances and Angles
– As the distances between devices increase, the device selection accuracy drops gradually, since the tones and other frequency bands decrease as the distances increase.
– As the angle decreases, the accuracy of device selection drops.
Evaluation with metal sounds
– Played a piece of rock music (i.e. Burn It Down by Linkin Park).
– Metal clangs can hardly reach frequencies above 18KHz, so they have limited effect on Spartacus.
Limited space in these scenarios
– Tested only up to 1.5m with 30 degrees.
– As distance increases, the performance decreases slightly due to the stronger multi-path effects in the cubicles and hallway.
– All three cases achieved higher than 85% accuracy.
Spartacus: 2048-point FFT processing
– Takes 1.5s to process a 1s gesture
Traditional FFT: 8192-point FFT processing
– Takes 8.7s
Comparing the performance under different duty cycles
– Each listening session was fixed to 200ms.
Hardware
– Galaxy Nexus mobile phones
Test duration
– Each test ran the low-power listening task for 5min.
Result
– 4X lower energy consumption than WiFi Direct
– 5.5X lower than the latest Bluetooth 4.0 protocols
Audio Processing in Mobile Sensing
– Microphones in mobile sensing
» Miluzzo: human conversation snippets for analyzing social activities
» SurroundSense: combined with other sensing modalities (accelerometers, cameras, and magnetometers) to detect locations of users for social context inferences
» Lu: unknown social events can be automatically identified and easily labeled
– Microphones in energy-efficient sensing
» JigSaw and Darwin Phones: enabling energy-efficient continuous sensing and collaborative learning techniques
» MoVi: multiple participants create integrated social event records
» SwordFight: provides a distance ranging technique using time difference of sound arrivals
Spatially-Aware Device Interactions
– Point & Connect (P&C)
» Proposed an interaction technique based on time difference of sound arrivals.
» Enabling P&C may prevent users from using their default WiFi networks.
» Devices must launch the related service and continuously wait for interaction requests, which consumes significant energy.
– SoundWave
» Single-device interactions: the laptop is both the transmitter and the receiver of the Doppler effect, so the generated frequency shift is doubled.
– PANDAA
» No extra infrastructure and no extra effort from users to initiate interactions
» Only supports devices in stationary placements
– Polaris
» Supports spatially-aware indoor device interactions
» Deals only with absolute directional relationships of devices
Energy-Efficient Interaction Triggers
– Can be enabled on demand when the energy constraint is not a major concern.
– Can be triggered by other traditional communication schemes, such as Bluetooth or WiFi Direct.
– This addresses the issue that the user has to wait a couple of seconds for a warmup beacon before doing the gesture in Spartacus.
Security Issues
– A malicious device standing close by could pretend to have detected higher Doppler shifts than other devices, deceiving the sender into thinking it was the receiver.
– Only trusted and authenticated devices could be allowed to report their Doppler shifts. After the user's device determines the potential receiver that reported the maximal Doppler shift, the name and identity of the receiver's owner would be shown on the user's device.
Contentions Among Interaction Sessions
– In a crowded scenario (e.g. an airport), contentions could be an issue for device pairing techniques.
– A contention coordination mechanism is needed.
Spartacus, a spatially-aware interaction system
– High accuracy
– Low latency
– Low energy consumption
– No extra hardware
– Zero prior configuration
– Usable in various noisy conditions
– Experimental evaluations of Spartacus performance
Does this paper document only the initial gesture in its experiments? What about detecting other gestures, so that the receiver can recognize different meanings from senders? If many children and adults of different heights stand close together in a crowded scenario, how could the system separate the tallest and shortest among all selection targets?