1
Introduction to Wavelet Transform
2
Time Series are Ubiquitous!
A random sample of 4,000 graphics from 15 of the world’s newspapers published from 1974 to 1989 found that more than 75% of all graphics were time series (Tufte, 1983).
3
Why is Working With Time Series so Difficult?
Answer: We are dealing with subjective notions of similarity. The definition of similarity depends on the user, the domain and the task at hand. We need to be able to handle this subjectivity.
4
Wavelet Transform - Overview
History: Fourier (1807), Haar (1910)
5
Wavelet Transform - Overview
What kind of basis could be useful? Impulse function (Haar): best time resolution. Sinusoids (Fourier): best frequency resolution. We want the best of both resolutions. Heisenberg (1930), Uncertainty Principle: there is a lower bound on the joint time-frequency resolution (an intuitive proof in [Mac91]).
6
Wavelet Transform - Overview
Gabor (1945): Short Time Fourier Transform (STFT). Disadvantage: fixed window size.
7
Wavelet Transform - Overview
Constructing wavelets: Daubechies (1988), compactly supported wavelets. Computation of WT coefficients: Mallat (1989), a fast algorithm using filter banks.
8
Discrete Fourier Transform I
Basic Idea: Represent the time series as a linear combination of sines and cosines, but keep only the first n/2 coefficients. Why n/2 coefficients? Because each sine wave requires two numbers, for the phase (w) and amplitude (A, B). Excellent free Fourier primer: Hagit Shatkay, "The Fourier Transform: A Primer", Technical Report CS, Department of Computer Science, Brown University, 1995.
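The idea of keeping only the first few coefficients can be sketched in a few lines of pure Python. This is a naive O(n^2) DFT for clarity, not a fast algorithm, and the function names are illustrative rather than from any particular library:

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT, returning the real parts."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def compress_dft(x, m):
    """Keep only the first m coefficients (each with its mirror n-k, so the
    reconstruction stays real-valued), zero the rest, and reconstruct."""
    n = len(x)
    X = dft(x)
    kept = [0j] * n
    for k in range(m):
        kept[k] = X[k]
        if k > 0:
            kept[n - k] = X[n - k]  # mirror coefficient of the same sinusoid
    return idft(kept)
```

A pure sinusoid is represented exactly by a single kept coefficient pair, which is why smooth, slowly varying series compress so well under the DFT.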
9
Discrete Fourier Transform II
Pros and Cons of DFT as a time series representation. Good ability to compress most natural signals. Fast, off-the-shelf DFT algorithms exist: O(n log(n)). (Weakly) able to support time-warped queries. Difficult to deal with sequences of different lengths. Cannot support weighted distance measures. Note: the related transform DCT uses only cosine basis functions. It does not seem to offer any particular advantages over DFT.
10
History…
11
Discrete Wavelet Transform I
Basic Idea: Represent the time series as a linear combination of wavelet basis functions, but keep only the first N coefficients. Although there are many different types of wavelets, researchers in time series mining/indexing generally use Haar wavelets. Haar wavelets seem to be as powerful as the other wavelets for most problems and are very easy to code. Excellent free wavelets primer: Stollnitz, E., DeRose, T., & Salesin, D. (1995). Wavelets for Computer Graphics: A Primer. IEEE Computer Graphics and Applications.
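One level of the Haar DWT really is just pairwise averages and differences, which is why it is so easy to code. A minimal sketch (function names are illustrative; the 1/sqrt(2) scaling makes the transform energy-preserving):

```python
def haar_step(x):
    """One level of the Haar DWT: scaled pairwise averages (approximation)
    and scaled pairwise differences (detail)."""
    s = 2 ** 0.5
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level: recover each original pair from (a, d)."""
    s = 2 ** 0.5
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x
```

Because the transform is orthonormal, truncating small detail coefficients gives the best approximation in the Euclidean sense for the coefficients kept.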
12
Wavelet Series
13
Discrete Wavelet Transform III
Pros and Cons of wavelets as a time series representation. Good ability to compress stationary signals. Fast linear-time algorithms for DWT exist. Able to support some interesting non-Euclidean similarity measures. Signals must have a length n = 2^(some integer). Works best if N is also a power of two; otherwise wavelets approximate the left side of the signal at the expense of the right side. Cannot support weighted distance measures.
14
Singular Value Decomposition I
Basic Idea: Represent the time series as a linear combination of eigenwaves, but keep only the first N coefficients. SVD is similar to the Fourier and wavelet approaches in that we represent the data in terms of a linear combination of shapes (in this case eigenwaves). SVD differs in that the eigenwaves are data dependent. SVD has been successfully used in the text processing community (where it is known as Latent Semantic Indexing) for many years. Good free SVD primer: Singular Value Decomposition - A Primer. Sonia Leach.
15
Singular Value Decomposition II
How do we create the eigenwaves? We have previously seen that we can regard time series as points in high dimensional space. We can rotate the axes such that axis 1 is aligned with the direction of maximum variance, axis 2 is aligned with the direction of maximum variance orthogonal to axis 1, etc. Since the first few eigenwaves contain most of the variance of the signal, the rest can be truncated with little loss. This process can be achieved by factoring an M by n matrix of time series into 3 other matrices, and truncating the new matrices at size N.
16
Singular Value Decomposition III
Pros and Cons of SVD as a time series representation. Optimal linear dimensionality reduction technique. The eigenvalues tell us something about the underlying structure of the data. Computationally very expensive: time O(Mn^2), space O(Mn). An insertion into the database requires recomputing the SVD. Cannot support weighted distance measures or non-Euclidean measures. Note: there has been some promising research into mitigating SVD's time and space complexity.
17
Piecewise Linear Approximation I
Basic Idea: Represent the time series as a sequence of straight lines. If lines are connected, we are allowed N/2 lines (each segment is recorded by its length and left_height; the right_height can be inferred from the next segment). If lines are disconnected, we are allowed only N/3 lines (each segment is recorded by its length, left_height and right_height). Personal experience on dozens of datasets suggests disconnected is better. Also, only disconnected allows a lower-bounding Euclidean approximation.
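A minimal sketch of the disconnected variant, fitting an independent least-squares line to each of several equal-width segments. The names and the equal-width segmentation are simplifying assumptions; practical algorithms choose segment boundaries adaptively:

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]) ... (m-1, ys[m-1]).
    Returns (intercept, slope)."""
    m = len(ys)
    mx = (m - 1) / 2
    my = sum(ys) / m
    denom = sum((x - mx) ** 2 for x in range(m))
    slope = 0.0 if denom == 0 else \
        sum((x - mx) * (y - my) for x, y in enumerate(ys)) / denom
    return my - slope * mx, slope

def pla(x, n_segments):
    """Disconnected piecewise-linear approximation with equal-width segments."""
    n = len(x)
    out = []
    for s in range(n_segments):
        lo, hi = s * n // n_segments, (s + 1) * n // n_segments
        intercept, slope = fit_line(x[lo:hi])
        out.extend(intercept + slope * i for i in range(hi - lo))
    return out
```

A series that is exactly piecewise linear on the segment boundaries is reproduced without error; each extra segment costs three stored numbers (length, left_height, right_height).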
18
Problem with Fourier
· Fourier analysis breaks down a signal into constituent sinusoids of different frequencies. · A serious drawback: in transforming to the frequency domain, time information is lost. When looking at the Fourier transform of a signal, it is impossible to tell when a particular event took place.
19
Function Representations
sequence of samples (time domain): finite difference method
pyramid (hierarchical)
polynomial
sinusoids of various frequencies (frequency domain): Fourier series
piecewise polynomials (finite support): finite element method, splines
wavelets (hierarchical, finite support) (time/frequency domain)
20
What Are Wavelets? In general, a family of representations using:
hierarchical (nested) basis functions finite (“compact”) support basis functions often orthogonal fast transforms, often linear-time
21
Function Representations – Desirable Properties
generality: approximate anything well, including discontinuities, nonperiodicity, ...
adaptable to application: audio, pictures, flow fields, terrain data, ...
compact: approximate a function with few coefficients; facilitates compression, storage, transmission
fast to compute with: differential/integral operators are sparse in this basis; convert an n-sample function to its representation in O(n log n) or O(n) time
22
Wavelet History, Part 1 1807 Fourier analysis developed
1965 Fast Fourier Transform (FFT) algorithm
1980's beginnings of wavelets in physics, vision, speech processing (ad hoc) ... little theory ... why/when do wavelets work?
1985 Morlet & Grossman continuous wavelet transform ... asking: how can you get perfect reconstruction without redundancy?
1986 Mallat unified the above work
23
Wavelet History, Part 2 1985 Meyer tried to prove that no orthogonal wavelet other than Haar exists, found one by trial and error! 1987 Mallat developed multiresolution theory, DWT, wavelet construction techniques (but still noncompact) 1988 Daubechies added theory: found compact, orthogonal wavelets with arbitrary number of vanishing moments! 1990’s: wavelets took off, attracting both theoreticians and engineers
24
Time-Frequency Analysis
For many applications, you want to analyze a function in both time and frequency Analogous to a musical score Fourier transforms give you frequency information, smearing time. Samples of a function give you temporal information, smearing frequency. Note: substitute “space” for “time” for pictures.
25
Comparison to Fourier Analysis
Fourier: basis is global; sinusoids with frequencies in arithmetic progression.
Short-time Fourier Transform (& Gabor filters): basis is local; sinusoid times Gaussian; fixed-width Gaussian "window".
Wavelet: frequencies in geometric progression; basis has constant shape independent of scale.
26
Wavelets are faster than FFTs! (O(n) for the fast wavelet transform versus O(n log n) for the FFT.)
27
· The results of the CWT are many wavelet coefficients, which are a function of scale and position
28
Gabor’s Proposal: Short Time Fourier Transform
Requirements: for features localised in time, we need a short time window; for features localised in frequency, we need a short frequency window (i.e. a long time window).
29
What are wavelets? Haar wavelet
· Wavelets are functions defined over a finite interval and having an average value of zero. (figure: the Haar wavelet)
30
What is wavelet transform?
· The wavelet transform is a tool for carving up functions, operators, or data into components of different frequency, allowing one to study each component separately. · The basic idea of the wavelet transform is to represent any arbitrary function ƒ(t) as a superposition of a set of such wavelets or basis functions. · These basis functions or baby wavelets are obtained from a single prototype wavelet called the mother wavelet, by dilations or contractions (scaling) and translations (shifts).
31
The continuous wavelet transform (CWT)
· Fourier Transform: the FT is the integral over all time of the signal f(t) multiplied by a complex exponential: F(w) = integral of f(t) e^(-jwt) dt.
32
· Similarly, the Continuous Wavelet Transform (CWT) is defined as the integral over all time of the signal multiplied by scaled, shifted versions of the wavelet function ψ: γ(s,τ) = integral of f(t) ψ*_{s,τ}(t) dt, where * denotes complex conjugation (z = r + iy, z* = r - iy). This equation shows how a function f(t) is decomposed into a set of basis functions ψ_{s,τ}(t), called the wavelets. The variables s and τ are the new dimensions, scale and translation (position), after the wavelet transform.
33
· The wavelets are generated from a single basic wavelet ψ(t), the so-called mother wavelet, by scaling and translation: ψ_{s,τ}(t) = (1/√s) ψ((t - τ)/s), where s is the scale factor, τ is the translation factor, and the factor s^(-1/2) is for energy normalization across the different scales. · It is important to note that in the above transforms the wavelet basis functions are not specified. · This is a difference between the wavelet transform and the Fourier transform, or other transforms.
34
· Scale: scaling a wavelet simply means stretching (or compressing) it.
35
· Scale and Frequency
Low scale a: compressed wavelet, rapidly changing details, high frequency.
High scale a: stretched wavelet, slowly changing details, low frequency.
· Translation (shift): translating a wavelet simply means delaying (or hastening) its onset.
36
Haar wavelet
37
Discrete Wavelets · The discrete wavelet is written as ψ_{j,k}(t) = s0^(-j/2) ψ(s0^(-j) t - k τ0), where j and k are integers and s0 > 1 is a fixed dilation step. The translation factor τ0 depends on the dilation step. The effect of discretizing the wavelet is that the time-scale space is now sampled at discrete intervals. We usually choose s0 = 2. Orthonormality then means the integral of ψ_{j,k}(t) ψ_{m,n}(t) dt equals 1 if j = m and k = n, and 0 otherwise.
38
A band-pass filter · The wavelet has a band-pass-like spectrum. From Fourier theory we know that compression in time is equivalent to stretching the spectrum and shifting it upwards (F{f(at)} = (1/a) F(w/a) for a > 0). Suppose a = 2: a time compression of the wavelet by a factor of 2 will stretch the frequency spectrum of the wavelet by a factor of 2 and also shift all frequency components up by a factor of 2.
39
Subband coding · If we regard the wavelet transform as a filter bank, then we can consider wavelet transforming a signal as passing the signal through this filter bank. · The outputs of the different filter stages are the wavelet and scaling-function transform coefficients. · In general we will refer to this kind of analysis as a multiresolution analysis; implemented this way it is called subband coding.
40
· Splitting the signal spectrum with an iterated filter bank.
At each stage, a highpass (HP) and lowpass (LP) filter pair split the spectrum in half (4B → 2B → B, ...). · Summarizing, if we implement the wavelet transform as an iterated filter bank, we do not have to specify the wavelets explicitly! This is a remarkable result.
41
The Discrete Wavelet Transform
· Calculating wavelet coefficients at every possible scale is a fair amount of work, and it generates an awful lot of data. What if we choose only a subset of scales and positions at which to make our calculations? · It turns out, rather remarkably, that if we choose scales and positions based on powers of two -- so-called dyadic scales and positions -- then our analysis will be much more efficient and just as accurate. We obtain just such an analysis from the discrete wavelet transform (DWT).
42
Approximations and Details
· The approximations are the high-scale, low-frequency components of the signal. The details are the low-scale, high-frequency components. The filtering process, at its most basic level, looks like this: · the original signal, S, passes through two complementary filters and emerges as two signals, an approximation and a detail.
43
Downsampling · Unfortunately, if we actually perform this operation on a real digital signal, we wind up with twice as much data as we started with. Suppose, for instance, that the original signal S consists of 1000 samples of data. Then the approximation and the detail will each have 1000 samples, for a total of 2000. · To correct this problem, we introduce the notion of downsampling. This simply means throwing away every second data point.
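The sample-count arithmetic above can be checked directly. This sketch uses an unnormalised Haar-style averaging/differencing pair purely for illustration; after downsampling by 2, the two half-length outputs together have exactly as many samples as the input:

```python
def analysis(x):
    """Pass x through a complementary lowpass/highpass pair, keeping only
    every second output (downsampling by 2)."""
    low  = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]  # approximation
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]  # detail
    return low, high
```

With 1000 input samples this yields 500 approximation and 500 detail samples, i.e. 1000 in total rather than 2000.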
44
An example:
45
Reconstructing Approximation and Details
Upsampling
46
Wavelet Decomposition
Multiple-Level Decomposition The decomposition process can be iterated, with successive approximations being decomposed in turn, so that one signal is broken down into many lower-resolution components. This is called the wavelet decomposition tree.
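The iteration described above can be sketched in a few lines (Haar filters are assumed for concreteness; the name `wavedec` echoes common toolbox naming but is illustrative here):

```python
def wavedec(x, levels):
    """Iterate the one-level Haar split on successive approximations,
    producing the wavelet decomposition tree: a final approximation plus
    one detail signal per level."""
    s = 2 ** 0.5
    details = []
    approx = list(x)
    for _ in range(levels):
        a = [(approx[i] + approx[i + 1]) / s for i in range(0, len(approx), 2)]
        d = [(approx[i] - approx[i + 1]) / s for i in range(0, len(approx), 2)]
        details.append(d)
        approx = a
    return approx, details
```

For a length-8 signal and 3 levels, the details have lengths 4, 2 and 1, and a constant signal produces all-zero details, exactly the tree structure the slide describes.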
47
DWT · Scaling function (two-scale relation): φ(t) = Σ_n h[n] √2 φ(2t - n). · Wavelet: ψ(t) = Σ_n g[n] √2 φ(2t - n).
· The signal f(t) can be expressed as a combination of scaling functions and wavelets: f(t) = Σ_k c[k] φ_k(t) + Σ_j Σ_k d_j[k] ψ_{j,k}(t). DWT
50
Wavelet Reconstruction (Synthesis)
Perfect reconstruction: the synthesis filters must cancel aliasing and give an overall identity. A standard statement of the condition in the z-domain is H0(z)G0(z) + H1(z)G1(z) = 2 and H0(-z)G0(z) + H1(-z)G1(z) = 0.
51
52
2-D Discrete Wavelet Transform
· A 2-D DWT can be done as follows: Step 1: replace each row with its 1-D DWT; Step 2: replace each column with its 1-D DWT; Step 3: repeat steps (1) and (2) on the lowest subband (LL) for the next scale; Step 4: repeat step (3) until as many scales as desired have been completed. After one scale the image is split into LL, LH, HL and HH subbands; iterating on LL gives two scales, and so on.
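A sketch of one scale of the 2-D DWT on a small list-of-lists image, with Haar as the 1-D transform (an assumption; any 1-D DWT would do). A real implementation would operate in place on arrays:

```python
def haar_1d(v):
    """1-D Haar DWT of an even-length vector: approximations then details."""
    s = 2 ** 0.5
    h = len(v) // 2
    return [(v[2 * i] + v[2 * i + 1]) / s for i in range(h)] + \
           [(v[2 * i] - v[2 * i + 1]) / s for i in range(h)]

def dwt2(img):
    """One scale of the 2-D DWT: transform every row, then every column.
    The output splits into LL (top-left), LH, HL and HH quadrants."""
    rows = [haar_1d(r) for r in img]
    cols = [haar_1d([rows[i][j] for i in range(len(rows))])
            for j in range(len(rows[0]))]
    # transpose back so out[i][j] indexes row i, column j
    return [[cols[j][i] for j in range(len(cols))] for i in range(len(cols[0]))]
```

For a constant image all the energy ends up in the LL quadrant, and the LH/HL/HH details vanish, which is the basis of the compression argument later in the deck.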
53
Image at different scales
54
Correlation between features at different scales
55
Wavelet construction – a simplified approach
Traditional approaches to wavelets have used a filter-bank interpretation: Fourier techniques are required to get the synthesis (reconstruction) filters from the analysis filters, and this is not easy to generalize.
56
Wavelet construction – lifting
3 steps: Split, Predict (P step), Update (U step).
57
Example – the Haar wavelet
S step: splits the signal into even and odd samples.
58
Example – the Haar wavelet
P step: predict the odd samples from the even samples. For the Haar wavelet, the prediction for an odd sample is the previous even sample.
59
Example – the Haar wavelet
Detail signal: the prediction error, d[n] = x_odd[n] - x_even[n].
60
Example – the Haar wavelet
U step: update the even samples to produce the next coarser-scale approximation, s[n] = x_even[n] + d[n]/2. The signal average is maintained.
61
Summary of the Haar wavelet decomposition
Can be computed 'in place': the P step applies a coefficient of -1 (subtract the even sample from the odd sample), and the U step a coefficient of 1/2 (add half the detail to the even sample).
62
Inverse Haar wavelet transform
Simply run the forward Haar wavelet transform backwards, then merge the even and odd samples!
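The split/predict/update steps and their inversion can be written directly. A sketch assuming an even-length signal (function names are illustrative):

```python
def haar_lift(x):
    """Haar wavelet via lifting: split, predict (d = odd - even),
    update (s = even + d/2).  s is the pairwise average, so the
    signal average is preserved."""
    even, odd = x[0::2], x[1::2]                 # split
    d = [o - e for o, e in zip(odd, even)]       # predict
    s = [e + di / 2 for e, di in zip(even, d)]   # update
    return s, d

def haar_unlift(s, d):
    """Inverse: run the steps backwards with the signs reversed, then merge."""
    even = [si - di / 2 for si, di in zip(s, d)]  # undo update
    odd  = [di + ei for di, ei in zip(d, even)]   # undo predict
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])                          # merge
    return x
```

Note how the inverse is literally the forward code run backwards with signs flipped, which is the key advantage of the lifting formulation claimed later in the deck.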
63
General lifting stage of wavelet decomposition
(diagram: Split, then P with a subtraction (-) producing the detail, then U with an addition (+) producing the approximation)
64
Multi-level wavelet decomposition
We can produce a multi-level decomposition by cascading lifting stages.
65
General lifting stage of inverse wavelet synthesis
(diagram: the inverse stage runs U then P with the signs reversed, followed by a Merge)
66
Multi-level inverse wavelet synthesis
We can produce a multi-level inverse wavelet synthesis by cascading inverse lifting stages.
67
Advantages of the lifting implementation
Inverse transform: trivial, just run the code backwards; no need for Fourier techniques.
Generality: the design of the transform is performed without reference to particular forms for the predict and update operators; it can even include non-linearities (for integer wavelets).
68
Example 2 – the linear spline wavelet
A more sophisticated wavelet – uses slightly more complex P and U operators Uses linear prediction to determine odd samples from even samples
69
The linear spline wavelet
P step: linear prediction at the odd samples; the detail signal is the prediction error at the odd samples of the original signal.
70
The linear spline wavelet
The prediction for an odd sample is based on the two even samples on either side: d[n] = x_odd[n] - (x_even[n] + x_even[n+1])/2.
71
The linear spline wavelet
The U step uses the current and previous detail-signal samples: s[n] = x_even[n] + (d[n-1] + d[n])/4.
72
The linear spline wavelet
This preserves the signal average and first-order moment (signal position).
73
The linear spline wavelet
Can still be implemented 'in place': the P step applies coefficients of -1/2 to the two neighbouring even samples, and the U step applies coefficients of 1/4 to the two neighbouring detail samples.
74
Summary of linear spline wavelet decomposition
Computing the inverse is trivial: run the U and P steps backwards with the signs reversed. The even and odd samples are then merged as before.
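A sketch of the forward linear spline lift. Boundary samples are simply repeated here, which is an assumption of this sketch; practical coders typically use symmetric extension:

```python
def spline_lift(x):
    """Linear spline wavelet via lifting (even-length x assumed).
    Predict each odd sample as the mean of its two even neighbours;
    update each even sample with a quarter of the two neighbouring
    details, preserving the average and first moment."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # P step: detail = odd sample minus linear prediction from even neighbours
    d = [odd[i] - (even[i] + even[min(i + 1, len(even) - 1)]) / 2
         for i in range(n)]
    # U step: coarse approximation from current and previous details
    s = [even[i] + (d[max(i - 1, 0)] + d[min(i, n - 1)]) / 4
         for i in range(len(even))]
    return s, d
```

On a linear ramp the interior details are exactly zero: linear prediction is exact for (piecewise) linear signals, which is why this wavelet compresses smooth images better than Haar.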
75
Wavelet decomposition applied to a 2D image
76
Wavelet decomposition applied to a 2D image
77
Why is wavelet-based compression effective?
Allows for intra-scale prediction (like many other compression methods) – equivalently the wavelet transform is a decorrelating transform just like the DCT as used by JPEG Allows for inter-scale (coarse-fine scale) prediction
78
Why is wavelet-based compression effective?
(figure: original image; 1-level Haar; 1-level linear spline; 2-level Haar)
79
Why is wavelet-based compression effective?
Wavelet coefficient histogram
80
Why is wavelet-based compression effective?
Coefficient entropies:
Original image: 7.22
1-level Haar wavelet: 5.96
1-level linear spline wavelet: 5.53
2-level Haar wavelet: 5.02
2-level linear spline wavelet: 4.57
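The first-order entropy used in the table above can be computed as follows (a sketch; in practice the entropy is measured over integer-quantised wavelet coefficients):

```python
import math
from collections import Counter

def entropy(values):
    """First-order entropy, in bits per sample, of a sequence of
    (integer-quantised) coefficient values."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

A decorrelating transform concentrates values near zero, skewing the histogram and lowering this entropy, which is exactly the trend the table shows from 7.22 down to 4.57.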
81
Why is wavelet-based compression effective?
Wavelet coefficient dependencies
82
Why is wavelet-based compression effective?
Let's define sets S (small) and L (large) wavelet coefficients. The following two probabilities describe inter-scale dependencies.
83
Why is wavelet-based compression effective?
Without inter-scale dependencies
84
Why is wavelet-based compression effective?
Measured dependencies from Lena: 0.886, 0.529, 0.781, 0.219.
85
Why is wavelet-based compression effective?
Intra-scale dependencies: a coefficient X and its spatial neighbours X1 ... X8.
86
Why is wavelet-based compression effective?
Measured dependencies from Lena: 0.912, 0.623, 0.781, 0.219.
87
Why is wavelet-based compression effective?
Have to use a causal neighbourhood for spatial prediction
88
Example image compression algorithms
We will look at three state-of-the-art algorithms: Set Partitioning In Hierarchical Trees (SPIHT); Significance-Linked Connected Components Analysis (SLCCA); Embedded Block Coding with Optimal Truncation (EBCOT), which is the basis of JPEG2000.
89
The SPIHT algorithm: coefficients are transmitted in partial order by bit-plane, from the most significant bit (msb) down to the least significant bit (lsb).
90
The SPIHT algorithm: 2 components to the algorithm.
Sorting pass: sorting information is transmitted on the basis of the most significant bit-plane.
Refinement pass: bits in bit-planes lower than the most significant bit-plane are transmitted.
91
The SPIHT algorithm:
N = msb of max(abs(wavelet coefficients))
for bit-plane counter = N downto 1:
    transmit significance/insignificance with respect to the bit-plane counter
    transmit refinement bits of all coefficients that are already significant
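The bit-plane loop can be sketched as follows. This is a much-simplified version: there are no zerotrees or set partitioning, the significance map is sent coefficient by coefficient, and integer-quantised coefficients are assumed:

```python
def bitplane_passes(coeffs):
    """Sketch of a SPIHT-style bit-plane loop: per plane, a sorting pass
    emits newly-significant coefficients, then a refinement pass emits one
    magnitude bit for each previously-significant coefficient."""
    mags = [abs(c) for c in coeffs]            # signs would be sent with 'sig'
    n = max(mags).bit_length()                 # index of the msb plane
    significant = set()
    stream = []
    for plane in range(n - 1, -1, -1):
        t = 1 << plane
        newly = [i for i in range(len(mags))
                 if i not in significant and mags[i] >= t]
        for i in newly:                        # sorting pass
            stream.append(('sig', i, plane))
        for i in sorted(significant):          # refinement pass
            stream.append(('ref', i, (mags[i] >> plane) & 1))
        significant.update(newly)
    return stream
```

Because large coefficients are announced first, truncating the stream at any point keeps the most significant information, which is what makes the bitstream embedded.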
92
The SPIHT algorithm: insignificant coefficients (with respect to the current bit-plane counter) are organised into zerotrees.
93
The SPIHT algorithm: groups of coefficients are made into zerotrees by set partitioning.
94
The SPIHT algorithm: SPIHT produces an embedded bitstream, which can be truncated at any point to give a valid lower-rate code.
95
The SLCCA algorithm:
1. Wavelet transform
2. Quantise coefficients
3. Cluster and transmit the significance map
4. Bit-plane encode the significant coefficients
96
The SLCCA algorithm The significance map is grouped into clusters
97
The SLCCA algorithm: clusters are grown out from a seed; the significance map distinguishes significant from insignificant coefficients.
98
The SLCCA algorithm: a significance-link symbol links one cluster to a related cluster.
99
Image compression results
Evaluation: mean-squared error; human-visual-system-based metrics; subjective evaluation.
100
Image compression results
Mean-squared error, usually expressed as the peak signal-to-noise ratio (PSNR, in dB).
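PSNR from the mean-squared error, as a quick sketch (a peak value of 255 is assumed, as for 8-bit images):

```python
import math

def psnr(orig, recon, peak=255.0):
    """Peak signal-to-noise ratio in dB, computed from the MSE."""
    mse = sum((a - b) ** 2 for a, b in zip(orig, recon)) / len(orig)
    if mse == 0:
        return float('inf')   # identical signals
    return 10 * math.log10(peak ** 2 / mse)
```

An MSE of 1 on 8-bit data corresponds to about 48.13 dB; higher PSNR means a reconstruction closer to the original.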
101
Image compression results
102
Image compression results
103
Image compression results
SPIHT 0.2 bits/pixel JPEG 0.2 bits/pixel
104
Image compression results
SPIHT JPEG
105
EBCOT, JPEG2000: JPEG2000, based on embedded block coding with optimal truncation, is the state-of-the-art compression standard. It is wavelet-based, and it addresses the key issue of scalability: SPIHT is distortion scalable, as we have already seen; JPEG2000 introduces resolution and spatial scalability as well. An excellent reference to JPEG2000 and compression in general is "JPEG2000" by D. Taubman and M. Marcellin.
106
EBCOT, JPEG2000: Resolution scalability is the ability to extract from the bitstream the sub-bands representing any resolution level.
107
EBCOT, JPEG2000: Spatial scalability is the ability to extract from the bitstream the sub-bands representing specific regions in the image. Very useful if we want to selectively decompress certain regions of massive images.
108
Introduction to EBCOT: JPEG2000 is able to implement this general scalability by implementing the EBCOT paradigm. In EBCOT, the unit of compression is the codeblock, a partition of a wavelet sub-band. Typically, following the wavelet transform, each sub-band is partitioned into small codeblocks (typically 32x32).
109
Introduction to EBCOT Codeblocks – partitions of wavelet sub-bands
110
Introduction to EBCOT: a simple bit-stream organisation could comprise concatenated code-block bit streams, each preceded by the length of the next code-block stream.
111
Introduction to EBCOT This simple bit stream structure is resolution and spatially scalable but not distortion scalable Complete scalability is obtained by introducing quality layers Each code block bitstream is individually (optimally) truncated in each quality layer Loss of parent-child redundancy more than compensated by ability to individually optimise separate code block bitstreams
112
Introduction to EBCOT: each code-block bit stream is partitioned into a set of quality layers.
113
EBCOT advantages:
Multiple scalability: distortion, spatial and resolution scalability.
Efficient compression: this results from independent optimal truncation of each code-block bit stream.
Local processing: independent processing of each code block allows for efficient parallel implementations as well as hardware implementations.
114
EBCOT advantages Error resilience
Again this results from independent code block processing which limits the influence of errors
115
Performance comparison
A performance comparison with other wavelet-based coders is not straightforward, as it depends on the target bit rates for which the bit streams were truncated. With SPIHT, we simply truncate the bit stream when the target bit rate has been reached; however, we only have distortion scalability with SPIHT. Even so, we still get favourable PSNR (dB) results when comparing EBCOT (JPEG2000) with SPIHT.
116
Performance comparison
We can understand this more fully by looking at graphs of distortion (D) against rate (R, the bitstream length). The R-D curve for a continuously modulated quantisation step size passes through the truncation points.
117
Performance comparison
Truncating the bit stream to some arbitrary rate will yield sub-optimal performance.
118
Performance comparison
119
Performance comparison
Comparable PSNR (dB) results between EBCOT and SPIHT, even though: the results for EBCOT are for 5 quality layers (5 optimal bit rates), so intermediate bit rates are sub-optimal; and EBCOT provides resolution, spatial and distortion scalability where SPIHT provides only distortion scalability.