Spherical Wavelets A new tool for 3D shape representation

Technical Report
Pattern Recognition and Image Processing Group
Institute of Computer Aided Automation
Vienna University of Technology
Favoritenstr. 9/183-2
A-1040 Vienna AUSTRIA
Phone: +43 (1) 58801-18351
Fax: +43 (1) 58801-18392
E-mail: {me}@mail.com
URL: http://www.prip.tuwien.ac.at/
October 16, 2008
PRIP-TR-118
Schwartz Ernst
Abstract
Wavelets are a common tool for the analysis of signals in one or two dimensions. Extending
those findings to be used in a three-dimensional setting has proven more difficult. This
report is concerned with a recent promising approach to wavelet analysis in 3 dimensions on
genus-0 objects, the so-called spherical wavelets. Using examples from the field of medical
imaging, the concepts underlying their construction are introduced and their usefullness
for a variety of applications is demonstrated. Further, an overview of publicly available
toolboxes is given and an evaluation comparing the capabilities of spherical wavelets to
different shape modelling approaches is performed.
Contents
1 Introduction
2 Preliminaries
2.1 The Fourier transform
2.2 Introducing Wavelets
2.3 2nd generation wavelets
2.3.1 Polyphase representation
2.3.2 Factoring filters using the lifting scheme
2.3.3 Construction of wavelets using the lifting scheme
3 Spherical Wavelet Shape Representation
3.1 State-of-the-art representations of 3D objects
3.2 Spherical representation of 3D objects
3.3 The construction of spherical wavelets
3.4 Using spherical wavelets for shape representation
3.4.1 Filtering
3.4.2 Band-wise grouping
3.4.3 Dimensionality reduction
3.5 Implementation
3.5.1 Matlab Wavelet Toolbox
3.5.2 YAWTB; yet another wavelet toolbox
3.5.3 Gabriel Peyré's toolboxes
3.5.4 Spherical wavelet ITK filter
3.6 Putting it all together
4 Applications & Experiments
4.1 Applications
4.1.1 Signal description
4.1.2 3D modelling
4.1.3 Segmentation
4.2 Experiments
4.2.1 Compression
4.2.2 Band-wise grouping
4.2.3 Shape description
5 Conclusion
1 Introduction
Over the last 10 years, modern imaging techniques for medical applications
have reached a level of sophistication that allows for sub-millimeter
precision in MRI and CT scans. While this is of great help to professionals
and scientists in the field of medicine, the precision achieved is perhaps of
even greater help for automatic analysis using image processing and machine
vision.
Today, most analysis of x-rays or other images is still done by evaluating
2d slices separately. Because of the 3d nature of the anatomical structures,
this can be very inefficient. Nevertheless, professionals are still reluctant
to use the 3d data directly because of the high complexity of the structures
and the lack of intuitive and efficient interfaces.
One important research area to improve the acceptance of 3d imaging in
the medical community is the (semi-)automatic segmentation of the data to
eliminate the possibility of one structure hiding others or the like.
In this report, we will investigate a novel approach to representing 3d data
that has provided promising results for signal analysis, shape description and
consequently segmentation: the spherical wavelet representation.
To gain a better understanding of what exactly such a representation is, how
it is computed and how it can be used in a medical imaging context, we will
start not with a 3d signal but with a simple 1d one, such as a time series.
After presenting the idea of a harmonic expansion using the standard Fourier
transform in section 2.1, we will look in more depth at the oldest wavelet,
the so-called "Haar" wavelet, in section 2.2, before investigating ways of
constructing more sophisticated representations in section 2.3. Once the
reader is familiar with the basics of wavelet theory, we will look at how to
describe 3d objects in a way that enables us to deal with them in a signal
processing framework (sections 3.1 to 3.4) and present, in section 3.5,
different available software packages for (spherical) wavelet analysis.
In section 4, we will give a quick overview of some current applications of
spherical wavelets, which range from astrophysics to medical imaging, and
carry out a comparative study of the spherical wavelet approach and other
commonly used compressing signal representations. In the experimental results
(section 4.2), we will demonstrate step by step how to compute the so-called
spherical wavelets from these representations.
2 Preliminaries
2.1 The Fourier transform
In signal analysis, one often reaches the limits of what can be learned about
a signal from one specific representation of it.
By applying a transform to the signal, it is often possible to find properties
of the signal that were hidden before.
For instance, our understanding of spoken language rests on the fact that
human hearing works by processing the different frequencies of the sounds that
reach the inner ear and combining that information into a representation that
the human brain is able to interpret. This type of analysis can be achieved
either by separating the different frequencies of the signal using a so-called
filter bank or by the Fourier transform, one of the earliest types of
transforms that was shown to be able to fully describe the information
contained in the signal, i.e. to be fully invertible.
The Fourier transform decomposes a function into a series of coefficients
associated with trigonometric functions, that is, a series of sines and
cosines with a certain amplitude and phase.
\[ x(t) = \int_{-\infty}^{\infty} X(f)\, e^{i 2\pi f t}\, df \tag{1} \]
When someone mentions a sound having a high frequency, what this actually
means is that the Fourier coefficients are larger in the upper part of the
so-called spectrum, the amplitudes of the totality of the signal's components.
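To make the notion of a spectrum concrete, the following minimal sketch computes the discrete Fourier coefficients of a sampled sine and reads off the frequency with the largest amplitude. The sampling rate fs, the test frequency f0 and all variable names are assumptions chosen for illustration; only the built-in fft is used.

```matlab
% Sketch: locate the dominant frequency of a sampled sinusoid with the FFT.
fs = 100;                        % sampling rate in Hz (assumed)
t  = (0:fs-1) / fs;              % one second of samples
f0 = 5;                          % test frequency in Hz (assumed)
x  = sin(2*pi*f0*t);
X  = fft(x);                     % discrete Fourier coefficients
mag = abs(X(1:fs/2)) / (fs/2);   % one-sided amplitude spectrum
[peak, k] = max(mag);            % largest amplitude and its bin
fpeak = (k - 1) * fs / numel(x); % convert the bin index to a frequency in Hz
```

Because the signal contains exactly five periods, all of its energy falls into a single frequency bin. A signal whose frequency content changes over time would still yield one global spectrum, which is the localization problem discussed below.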
The Fourier transform is by no means limited to the analysis of one-dimensional
signals. There exist analysis methods based on the Fourier transform for 2d
and even 3d signals, then more appropriately called images or shapes.
One serious drawback of the Fourier transform, though, is its inability to
localize components in the signal. Because it decomposes the signal into an
integral of infinitely extended functions (sines and cosines), the analysis of
a signal that changes with time (or with space for images, i.e. one that is
not a regular pattern) can become useless, as there is, in the Fourier domain,
no way of telling when in time a component of a certain frequency starts and
stops. To counter this effect, the only way to gain useful temporal
information from the Fourier (or frequency) representation of a signal was to
first decompose it into small parts and analyze those separately. This is
called the windowed Fourier transform [8] because one uses a windowing
function (in the simplest case a box function of a certain width) to cut the
signal. By using more elaborately constructed windows, one can achieve a
better separation and less distortion. In fact, the transform this report is
mainly concerned with, the wavelet transform, can be seen as a sort of
formalization of the windowed Fourier transform, as we will see next.
2.2 Introducing Wavelets
The first wavelet transform wasn't actually called such, the term being coined
by Stéphane Mallat [9]. Long before that, in 1909, Alfred Haar proposed
what became known as the Haar wavelet, a very simple form of a wavelet
that works by dyadic subsampling of the signal and using the differences
between samples.
To perform a Haar decomposition of a signal of length 2n (also called Haar
wavelet transform), one first splits the signal into even and odd samples.
The two resulting signals now have the same length, n.
Listing 1: Naive implementation of the Haar transform of a 1d signal (in
Matlab)
n = 100;                                     % signal length (must be even)
x = sin(linspace(1, 2*pi, n)) + rand(1, n);  % noisy test signal
even = x(2:2:n);                             % even-indexed samples
odd = x(1:2:n);                              % odd-indexed samples
% scaling by 1/sqrt(2) for perfect reconstruction
avg = (odd + even) / sqrt(2);
dif = (odd - even) / sqrt(2);
hold on; plot(x, 'r');
plot(interp(avg, 2), 'b');   % interp requires the Signal Processing Toolbox
plot(interp(dif, 2), 'g');
hold off;
y = [avg dif];               % transformed signal: averages, then details
Figure 1: toy example for a Haar decomposition: original signal in red,
(upsampled) averages in blue, details in green
The next step in the transform is to add the two signals together and
normalize on one hand (averages), and subtract and normalize on the other
hand (detail).
Note that the resulting average and detail coefficients retain the complete
information about the signal.
A complete Haar transform of a signal consists of repeating the above
steps for the averages until only a single scalar describing the averages (a
measure of the DC component of the signal) and the detail coefficients of
each step are left.
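The repetition described above can be sketched as a short loop. The toy signal and the variable names are assumptions for illustration; the signal length is taken to be a power of two so that every level splits evenly.

```matlab
% Full Haar decomposition of a signal whose length is a power of two.
x = [4 6 10 12 8 6 5 5];           % toy signal (assumed), length 2^3
coeffs = [];                       % detail coefficients, coarsest level first
avg = x;
while numel(avg) > 1
    odd  = avg(1:2:end);
    even = avg(2:2:end);
    dif  = (odd - even) / sqrt(2); % details of this level
    avg  = (odd + even) / sqrt(2); % averages passed on to the next level
    coeffs = [dif coeffs];         % prepend so that coarser details come first
end
y = [avg coeffs];                  % one scaling coefficient plus all details
```

With the 1/sqrt(2) scaling the transform is orthonormal, so y has the same length and the same energy (sum of squares) as x, and y(1) equals sum(x)/sqrt(numel(x)).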
It should be noted here that the Haar transform (like other wavelet
transforms) does not reduce the size of the data directly. In fact, the
wavelet decomposition (i.e. the concatenation of the average and the
detail coefficients) has exactly the same length as the original signal.
What makes wavelet decomposition interesting for data compression (it has,
for example, replaced DCT-based coding in the JPEG 2000 image standard [20])
is the fact that many of the detail coefficients, especially at higher
decomposition levels, turn out to be rather small in magnitude and can thus be
neglected (zeroed out), by which a considerable data reduction can
be achieved without distorting the original signal too much.
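As a minimal sketch of this idea, the following code applies one Haar level to a smooth signal, zeroes all detail coefficients below a threshold and reconstructs the signal. The test signal and the threshold value thr are arbitrary assumptions; practical coders choose the threshold from a target rate or distortion.

```matlab
% One-level Haar transform, hard thresholding of the details, reconstruction.
n = 64;
x = sin(linspace(0, 2*pi, n));     % smooth test signal (assumed)
odd  = x(1:2:end);
even = x(2:2:end);
avg = (odd + even) / sqrt(2);
dif = (odd - even) / sqrt(2);
thr = 0.05;                        % threshold (arbitrary choice)
difc = dif .* (abs(dif) >= thr);   % zero out the small details
% inverse transform from the thresholded coefficients
xr = zeros(1, n);
xr(1:2:end) = (avg + difc) / sqrt(2);
xr(2:2:end) = (avg - difc) / sqrt(2);
kept = nnz(difc) / numel(dif);     % fraction of detail coefficients kept
err  = max(abs(x - xr));           % worst-case reconstruction error
```

Each discarded detail changes two samples by at most thr/sqrt(2), which bounds the reconstruction error in this one-level setting.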
Before plunging into the mathematical description of the wavelet transform,
let's take a quick look at some common names and notations used in
the literature.
As we have seen, the wavelet transform decomposes a signal into averages
and detail coefficients. The name average might be well suited to the simple
Haar transform, but for more elaborate wavelets it can be misleading.
The functions used to compute the averages are commonly called scaling
functions, whilst those for the details are called wavelet functions.
With this in mind, let's take a look at the mathematical description of the
wavelet transform and some of the properties that make it interesting for
signal analysis.
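The general formula of a wavelet transform is
\[ \psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right) \tag{2} \]
\[ x_{a,b} = \int_{-\infty}^{\infty} x(t)\, \psi_{a,b}(t)\, dt \tag{3} \]
where a is the scale and b the shift of the wavelet ψ.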
Comparing it to (1), one can see the similarities. A wavelet transform
is comparable to a windowed Fourier transform with scaled and shifted
windows, thus capturing low and high frequency components of the signal at
specific locations in time or space.
Because the Haar wavelet is a very simplistic transform, it has some
undesirable properties, such as aliasing effects due to its binary nature.
Thus, extensions have been proposed for constructing more sophisticated
wavelets. For a detailed description of these as well as the corresponding
derivations, the interested reader is referred to the classical text in the
field by Daubechies [1].
These constructions are far from trivial. To ensure properties such as full
reconstructability and orthogonality of the composing functions, one needs
to perform a thorough mathematical analysis of the wavelet in question.
A somewhat simpler and more elegant method for constructing new (and
common) wavelets and enforcing some desired properties on them is the
so-called lifting scheme proposed by Sweldens in [2], [19], which will be
described next.
2.3 2nd generation wavelets
The description of the lifting scheme will start with a slight change of
notation. As we have seen, each step in the wavelet transform splits the
signal into two parts, which can be interpreted as a low frequency one and a
high frequency one, and which can be recombined to reconstruct the original
signal.
As the transform can be described in terms of frequency information
extracted from the signal, and remembering the filtering approach to signal
analysis from section 2.1, it is only logical to try to define it as (a
cascade of) filtering operations.
Listing 2: Filter implementation of Haar transform
n = 100;
x = sin(linspace(1, 2*pi, n)) + rand(1, n);
h = [1 1] / 2;   % averaging filter, sums to 1
g = [-1 1];      % differencing filter, sums to 0
% filter at the full rate, then keep every second sample
% (unnormalized: avg*sqrt(2) and dif/sqrt(2) give the values of Listing 1)
avg = filter(h, 1, x);
avg = avg(2:2:end);
dif = filter(g, 1, x);
dif = dif(2:2:end);
From now on, we will denote the filters used for the decomposition as h̃
and g̃ and those for the reconstruction as h and g, respectively. A transform
can thus be represented as in figure 2, where the filters have to fulfil
conditions (4) and (5):
\[ h(z)\,\tilde{h}(z^{-1}) + g(z)\,\tilde{g}(z^{-1}) = 2 \tag{4} \]
\[ h(z)\,\tilde{h}(-z^{-1}) + g(z)\,\tilde{g}(-z^{-1}) = 0 \tag{5} \]
Figure 2: wavelet decomposition; filtering interpretation
2.3.1 Polyphase representation
Looking at figure 2, one can immediately spot an inefficiency in the
implementation. As half the samples are thrown away at each step, it would be
much more efficient to apply the filters only to those samples that are
actually needed and disregard the others. This is achieved by the so-called
polyphase representation.
Figure 3: polyphase decomposition
That representation can be put into matrix form, which leaves us with
\[ \tilde{P}(z) = \begin{pmatrix} \tilde{h}_e(z) & \tilde{h}_o(z) \\ \tilde{g}_e(z) & \tilde{g}_o(z) \end{pmatrix} \tag{6} \]
describing the decomposition part of the wavelet transform. We can proceed
analogously with the reconstruction part, which gets us
\[ P(z) = \begin{pmatrix} h_e(z) & g_e(z) \\ h_o(z) & g_o(z) \end{pmatrix} \tag{7} \]
and thus the polyphase form of the wavelet decomposition shown in figure 4.
Figure 4: polyphase wavelet decomposition
Listing 3: Polyphase implementation of Haar transform
% split the signal into its two phases (Matlab indices start at 1)
odd = x(1:2:end);
even = x(2:2:end);
% polyphase components of the filters h and g from Listing 2
he = h(1:2:end); ho = h(2:2:end);
ge = g(1:2:end); go = g(2:2:end);
% filter each phase at the low rate; for Haar every component is a single tap
s = filter(he, 1, even) + filter(ho, 1, odd);
d = filter(ge, 1, even) + filter(go, 1, odd);
As it turns out, the polyphase matrix is a matrix of Laurent polynomials
of the form
\[ h(z) = \sum_{k=p}^{q} h_k z^{-k} \]
which have the property that sums, differences, products and even divisions
(with remainder) of Laurent polynomials are themselves Laurent polynomials.
Because such Laurent polynomials can be interpreted as discrete filters, this
property can be exploited in the so-called lifting scheme.
2.3.2 Factoring filters using the lifting scheme
It is possible to factor the filters involved in the wavelet decomposition and
reconstruction into simpler ones, thus reducing costly convolutions to shorter
ones or even to simple multiply and add operations, by solving small systems
of equations. Depending on the original complexity of the scaling and wavelet
filters, this results in a speed increase of up to 100% for certain types of
wavelets.
For the decomposition part, this corresponds to solving for the primal lifting
coefficients and remainders in
\[ \tilde{P}(z) = \begin{pmatrix} 1 & \tilde{s}(z) \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \tilde{h}_e^{new}(z) & \tilde{h}_o^{new}(z) \\ \tilde{g}_e(z) & \tilde{g}_o(z) \end{pmatrix} \tag{8} \]
or for the dual lifting coefficients and remainders in
\[ \tilde{P}(z) = \begin{pmatrix} \tilde{h}_e^{new}(z) & \tilde{h}_o(z) \\ \tilde{g}_e^{new}(z) & \tilde{g}_o(z) \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \tilde{t}(z) & 1 \end{pmatrix} \tag{9} \]
The construction is analogous for the reconstruction part.
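The following applies the lifting scheme decomposition to reformulate the
Haar transform. Note that, because the filters involved are already of length
2, this does not result in a speed increase. It is nonetheless a useful example
to understand the workings of the lifting scheme.
In a polyphase formulation, the now familiar Haar transform corresponds to
\[ \tilde{P}(z) = \begin{pmatrix} 1/2 & 1/2 \\ -1 & 1 \end{pmatrix} \]
The lifting scheme can be applied to this, leaving us to solve
\[ \tilde{P}(z) = \begin{pmatrix} \tilde{h}_e^{new}(z) & 1/2 \\ \tilde{g}_e^{new}(z) & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \tilde{t}(z) & 1 \end{pmatrix} \]
and thus
\[ \begin{cases} \tilde{h}_e^{new}(z) + \frac{1}{2}\,\tilde{t}(z) = \frac{1}{2} \\ \tilde{g}_e^{new}(z) + \tilde{t}(z) = -1 \end{cases} \]
Setting \tilde{h}_e^{new}(z) = 1 gives \tilde{t}(z) = -1 and
\tilde{g}_e^{new}(z) = 0, and therefore
\[ \tilde{P}(z) = \begin{pmatrix} 1 & 1/2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \]
which results in the following implementation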
Listing 4: Lifted Haar transform
odd = x(1:2:end);
even = x(2:2:end);
% second matrix
d = odd - even;
% first matrix
s = .5 * d + even;
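One practical benefit of the lifted form is that it can be inverted by running the same steps backwards with opposite signs. The following sketch, which uses an assumed toy signal and hypothetical variable names (even_r, odd_r, xr), illustrates this for the two steps of Listing 4.

```matlab
x = [3 1 4 1 5 9 2 6];        % toy signal (assumed), even length
odd  = x(1:2:end);
even = x(2:2:end);
% forward lifting steps, as in Listing 4
d = odd - even;
s = even + 0.5 * d;
% inverse: undo the steps in reverse order with flipped signs
even_r = s - 0.5 * d;
odd_r  = d + even_r;
% merge the two phases back into one signal
xr = zeros(size(x));
xr(1:2:end) = odd_r;
xr(2:2:end) = even_r;
```

Here s equals the pairwise mean (odd + even)/2, and xr reproduces x, so no separate derivation of a reconstruction filter is needed.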
2.3.3 Construction of wavelets using the lifting scheme
Conversely, the factorability of Laurent polynomials allows one to generate
more sophisticated wavelets from really basic ones (to lift the original
wavelet) by combining the reconstruction filters through a carefully selected
Laurent polynomial
\[ g^{new}(z) = g(z) + h(z)\,s(z^2) \tag{10} \]
\[ P^{new}(z) = \begin{pmatrix} h_e(z) & h_e(z)s(z) + g_e(z) \\ h_o(z) & h_o(z)s(z) + g_o(z) \end{pmatrix} = P(z) \begin{pmatrix} 1 & s(z) \\ 0 & 1 \end{pmatrix} \tag{11} \]
and analogously in the decomposition part
\[ \tilde{h}^{new}(z) = \tilde{h}(z) + \tilde{g}(z)\,\tilde{s}(z^2) \tag{12} \]
\[ \tilde{P}^{new}(z) = \begin{pmatrix} \tilde{h}_e(z) + \tilde{g}_e(z)\tilde{s}(z) & \tilde{h}_o(z) + \tilde{g}_o(z)\tilde{s}(z) \\ \tilde{g}_e(z) & \tilde{g}_o(z) \end{pmatrix} = \begin{pmatrix} 1 & \tilde{s}(z) \\ 0 & 1 \end{pmatrix} \tilde{P}(z) \tag{13} \]
Figure 5: polyphase lifted wavelet decomposition
See figure 5 for a schematic representation of the primal lifting
decomposition and reconstruction.
It is also possible to lift the decomposition filter first and find the
appropriate Laurent polynomial to use in the reconstruction part afterwards.
This procedure is then called dual lifting (as opposed to primal lifting in
the first case).
Using the lifting scheme, basically all that is needed to perform a wavelet
analysis on a signal are concise split and merge operators and some notion
of neighbourhood of the samples of the signal.
That way, the lifting scheme turns out to be a very powerful method of
adapting the wavelet bases to the conditions encountered in the signal, and
also to realms other than the one-dimensional case, by defining split and
merge operations (cf. the notes on multi-resolution later in this report) and
carefully choosing the lifting steps.
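To illustrate how lifting steps rely only on a split and a notion of neighbourhood, the sketch below applies one commonly used pair of steps, usually called predict and update, with linear interpolation between neighbours. The step weights (1/2 and 1/4), the border handling by replication and all variable names are assumptions for illustration and are not taken from the derivation above.

```matlab
x = 0.5 * (1:16) + 2;              % linear ramp as test signal (assumed)
odd  = x(1:2:end);                 % samples to be predicted
even = x(2:2:end);                 % samples kept for the coarse signal
% predict: odd sample minus the mean of its two even neighbours
left = [even(1) even(1:end-1)];    % border: replicate the first even sample
d = odd - (left + even) / 2;
% update: smooth the even samples using the neighbouring details
next = [d(2:end) d(end)];          % border: replicate the last detail
s = even + (d + next) / 4;
% inverse transform: same steps, reverse order, opposite signs
even_r = s - (d + next) / 4;
odd_r  = d + ([even_r(1) even_r(1:end-1)] + even_r) / 2;
xr = zeros(size(x));
xr(1:2:end) = odd_r;
xr(2:2:end) = even_r;
```

On a linear ramp all interior details vanish, since linear interpolation predicts such a signal exactly; only the border detail d(1) is non-zero because of the replicated neighbour. Replacing the neighbourhood definition is all that is needed to move these steps to another domain.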
Having now gained a firm understanding of how to decompose a signal using
wavelets and how to implement this process in a series of lifting steps, we
can now move to the core of this report, the spherical wavelets.
After quickly reviewing current shape representation techniques, we will describe in detail how shapes can be represented as spherical signals and how
these can in turn be decomposed by a wavelet analysis. These explanations
will be followed by an instructive example of the presented methods applied
to real-life data.
3 Spherical Wavelet Shape Representation
3.1 State-of-the-art representations of 3D objects
Our work is concerned with the analysis of 3D objects, and we will now
give a quick overview of current methods of describing 3D objects in a
useful and efficient manner. We will begin with fundamental methods and
continue with current, more elaborate approaches before arriving
at the representation that is essential for building (spherical) wavelets in a
3D setting, the so-called multi-resolution framework.
Piecewise linear surface representation Perhaps the most fundamental
method for 3D modelling, along with voxel methods, is what is commonly
known as piecewise linear surface representation. Here, objects are
approximated by a collection of surface polygons that are generally all of the
same type, e.g. triangles or quadrilaterals. The information on the structure
is contained in a list of all points, the vertex list, and a list of the
connectivities between these, the face list or surface mesh.
If one is looking for a lower-dimensional analogy to this setting, one could
compare the surface polygon approach to a 2D-vector-image, while the voxel
representation would correspond to the pixel-image.
While this representation is amongst the oldest in computer graphics, it is
also still the most commonly used because of its nice mathematical
properties and simple representation. All of the methods we are going to
describe next use it as their foundation.
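As a minimal sketch of the vertex list and face list just described, the following code stores a tetrahedron and derives its edges from the face list. The variable names V, F and E are assumptions for illustration.

```matlab
% A tetrahedron as vertex list V (one point per row) and face list F
% (one triangle per row, entries are row indices into V).
V = [0 0 0; 1 0 0; 0 1 0; 0 0 1];
F = [1 3 2; 1 2 4; 2 3 4; 1 4 3];
% every triangle contributes three edges; sorting makes (i,j) equal to (j,i)
E = [F(:, [1 2]); F(:, [2 3]); F(:, [3 1])];
E = unique(sort(E, 2), 'rows');
chi = size(V, 1) - size(E, 1) + size(F, 1);   % Euler characteristic V - E + F
```

For a closed surface, chi = 2 indicates a mesh that is topologically a sphere, which is the class of objects the spherical representation in section 3.2 is restricted to. The mesh can be displayed with trisurf(F, V(:,1), V(:,2), V(:,3)).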
Spline-based representation Splines are a well-known method for curve
fitting using piecewise polynomial curves. Their use has mainly been
motivated by their easy construction and manipulation as well as their small
storage requirements, even for "complicated" shapes. A shape can thus be
represented by a collection of control points, called knots, for which
parameters are defined whose combination describes the shape of
the curve. The points between two knots are then interpolated using those
parameters, which explains the small storage requirements.
Because of their small number of control points, spline-based models have
found applications in tracking problems, as for instance in [14], where they
have been combined with multi-resolution methods for real-time performance.
Planar parametrizations Representing 3D objects in 2 dimensions is an
age-old problem originating from map-making. Representing an object in
such a way has numerous advantages, from texture mapping to the possibility
of regular remeshing in the plane. Although the algorithm described in this
report maps not onto the plane but onto the unit sphere, because of the
arguably smaller distortion, classical and recent developments in planar
mapping still provide useful insights into the problem of
describing 3-dimensional structures. Various good overviews of the different
techniques and mappings exist, for instance in [3].
Geometry images In a 3D mesh, every point, or vertex, is fully described
by its coordinates in 3D space, x, y and z. In 2D imaging, color information
is also stored in a 3-component representation, namely the values for the red,
green and blue components. This analogy led Sweldens et al. [7] to build a
correspondence between those two representations, which they termed geometry
image. Here, every pixel in the image actually is a vertex of a 3D model, and
the color information represents the spatial information of that vertex.
To build a geometry image from a 3D object, one needs to sample this spatial
information for a number of points on the mesh. This sampling needs to be
regular so as not to produce under- or oversampled regions of the geometry.
For this, different sampling schemes that are also very useful for spherical
resampling have been proposed. We will describe those in more detail in a
later section of this report.
Laplacian framework One reason why mesh representations of 3D objects
are so widely used is the availability of well-studied mathematical models in
such a framework. If a model is described by a mesh, methods from graph
theory can be applied to it. In recent years, spectral graph theory has gained
a lot of interest in the computer science community for its ability to represent
the information contained in a graph in useful ways.
The Laplacian framework is the result of this. The Laplacian matrix of a
mesh is a variant of the well-known connectivity matrix, with some additional
information on the structure of the underlying mesh. On this matrix, one can
perform an eigenanalysis to extract the main components of the mesh. This
approach can then be used for different processing operations on 3D objects,
from approximation to editing and watermarking. A detailed discussion is
beyond the scope of this report, and interested readers are referred to [13]
for more information.
Multi-resolution analysis As seen before, wavelet analysis is based on
computing differences between neighbouring samples of a signal. To be able
to perform this operation on a 3D mesh, the representation of the mesh
somehow needs to contain this neighbourhood information. Multi-resolution
analysis provides exactly this: it represents an object as a series of meshes
at different levels of detail. Going from coarse to fine in this framework
means inserting new vertices between existing ones in a well-defined manner.
For a detailed description of methods for multi-resolution analysis, see [5]
or [23].
3.2 Spherical representation of 3D objects
Until now, we have only analyzed signals in a one- or two-dimensional setting.
Since our ultimate goal is to develop methods that work on three-dimensional
objects, one needs to settle on a representation of these objects that allows
the required mathematical operations to be performed without any
inconsistencies or undefined behaviour.
To achieve this, we will consider a spherical signal representation, meaning
that we will be concerned with signals sampled on a sphere. There are
different ways to sample from a sphere (cf. figure 6), but because of its
regularity, which avoids over- and undersampled regions, an icosahedron-based
subdivision is most commonly used. It is built by starting with an icosahedron
and recursively subdividing each of its faces into 4 new triangles
(cf. figure 6).
Figure 6: regular and icosahedron-based sampling on the sphere
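The icosahedron-based subdivision can be sketched in a few lines. The construction below is an illustrative assumption rather than the code of any particular toolbox: the icosahedron is built from its standard coordinates, its faces are found as triples of mutually adjacent vertices, and one subdivision step splits every triangle into four by inserting edge midpoints that are pushed onto the unit sphere.

```matlab
% Icosahedron on the unit sphere: cyclic permutations of (0, +-1, +-phi)
phi = (1 + sqrt(5)) / 2;
V = [];
for a = [-1 1]
    for b = [-1 1]
        V = [V; 0 a b*phi; a b*phi 0; b*phi 0 a];
    end
end
V = V / sqrt(1 + phi^2);               % scale the vertices to unit length
% faces = triples of mutually adjacent vertices (adjacent = shortest distance)
n = size(V, 1);
D = zeros(n);
for i = 1:n
    for j = 1:n
        D(i, j) = norm(V(i, :) - V(j, :));
    end
end
A = abs(D - min(D(D > 0))) < 1e-9;     % adjacency matrix of the 30 edges
F = [];
for i = 1:n
    for j = i+1:n
        for k = j+1:n
            if A(i, j) && A(j, k) && A(i, k)
                F = [F; i j k];
            end
        end
    end
end
% one subdivision step: split every triangle into 4 using edge midpoints
Fnew = zeros(4 * size(F, 1), 3);
for f = 1:size(F, 1)
    a = F(f, 1); b = F(f, 2); c = F(f, 3);
    m = [V(a, :) + V(b, :); V(b, :) + V(c, :); V(c, :) + V(a, :)];
    m = bsxfun(@rdivide, m, sqrt(sum(m.^2, 2)));   % push midpoints onto the sphere
    ab = size(V, 1) + 1; bc = ab + 1; ca = ab + 2;
    V = [V; m];
    Fnew(4*f-3 : 4*f, :) = [a ab ca; ab b bc; ca bc c; ab bc ca];
end
% each midpoint was created once per adjacent face: merge the duplicates
[V, ~, map] = unique(V, 'rows');
F = map(Fnew);
```

One step turns the 12 vertices and 20 faces of the icosahedron into 42 vertices and 80 faces; repeating the step yields the finer levels. The face orientation is not made consistent here, which would be needed for rendering with correct normals.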
Mesh representations of three-dimensional objects can be mapped to the
sphere if they are of genus 0, i.e. topologically equivalent to a sphere and
without holes.
This is done by first finding a minimally distorted mapping of the
faces of the mesh onto the sphere while retaining the original coordinates for
each vertex.
Finding this mapping is in itself far from trivial. Early approaches were
based on efforts to extend the method of barycentric coordinates for planar
parametrizations to the 3D case, but were unable to generate a bijective
mapping. Later, Sweldens proposed what he called a progressive mesh
construction, which consists of removing vertices one by one from the mesh
whilst storing the connectivity information of each removed vertex until only
an icosahedron (or another Platonic solid, which can be inscribed in a sphere)
remains, and then adding the vertices back into the mesh, positioning them
on the sphere and moving them to minimize distortions and eliminate foldovers.
While that construction achieves satisfying results, it is quite slow and the
optimization procedure is difficult to influence.
Lately, approaches based on spectral graph and Laplacian methods have
proven useful for the task. However, they involve solving large systems of
non-linear equations and can thus be computationally expensive.
A good overview of all these methods can be found in [6] or [4].
Figure 7: Icosahedron based subdivision of the Stanford bunny
Once the points are mapped, the x, y and z coordinates are considered
to be functions on the sphere (i.e. to fully describe a 3d object, there are
actually 3 spheres). As these functions are unevenly sampled, one now needs
to resample them on a regular grid on the sphere as seen before.
In clinical studies a specific organ or anatomical structure is observed over
a range of patients, or during a period of time for a single patient. The topology of the observed structure can be assumed consistent over all examples. A
widely used approach to study the variation of shapes is to first compute the
mean shape from the remeshed shape population. Subsequently a specific
example can be encoded by describing the vertex-wise deviation from the
mean shape.
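A minimal sketch of this encoding is given below. Random data stands in for a real population of remeshed shapes, and the array layout (vertices by coordinates by shapes) as well as the names shapes, meanShape and dev are assumptions for illustration.

```matlab
% K shapes with N corresponding vertices each, stored as an N-by-3-by-K array.
% Random data stands in for a real, remeshed shape population (assumption).
K = 5; N = 42;
shapes = randn(N, 3, K);
meanShape = mean(shapes, 3);                  % N-by-3 mean shape
dev = shapes - repmat(meanShape, [1 1 K]);    % vertex-wise deviations
% any shape is recovered as the mean plus its own deviation
shape2 = meanShape + dev(:, :, 2);
```

The deviations average to zero over the population, and it is these x, y and z deviation signals that are subsequently analysed on the sphere.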
To recapitulate, we started out with a mesh consisting of vertices in
a three-dimensional Euclidean space, which we mapped to a sphere by a
distortion-minimizing method. This representation is then resampled using
a regular, subdivisible mesh defined on the sphere, resulting in a
multi-resolution representation of the original shape.
3.3 The construction of spherical wavelets
Having defined how to map a given mesh onto a sphere and how to sample
a signal from it, we can begin describing the spherical wavelet analysis per
se. This analysis will allow us to efficiently represent the deviation from the
mean shape as spherical wavelet components.
A wavelet decomposition of a signal always consists of a subsampling and
a differencing step. The same is now done on the sphere. Subsampling is
achieved by retaining only a coarser mesh representation of the sphere, whose
vertices must be a proper subset of those of the original mesh. That way, we
perform a so-called multi-resolution analysis, and we are left with a
representation of a signal on the sphere composed of elements with decreasing
resolution. For a detailed derivation, see [16] and [15].
As we have seen, this by itself is not very helpful because the information
between two sampling levels is simply lost (this simple subsampling is
commonly called the lazy transform).
To recover that information, one could simply use a Haar-like operator, which
retains the differences between two stages. That approach, however,
introduces some undesired aliasing effects. Instead, the lifting scheme is
used to generate more complex, better-adapted filters on the sphere, called
stencils (cf. figure 8), that can be interpreted as analogues of the filter
taps of the scaling and wavelet filters in a lower-dimensional setting. Such
stencils vary from simple neighbourhoods to wider ones, the latter resulting
in a smoother decomposition.
Figure 8: linear, quadratic and butterfly stencils for center (red) point
Using these stencils, for each point a neighbourhood is defined that can
be used to compute the spherical wavelet coefficient at that location.
Figure 9: Spherical wavelets at the same location at 5 different scales
3.4 Using spherical wavelets for shape representation
Working in a spherical wavelet framework, we now describe a combination of
methods that allows for a very compact description of 3D shapes. To calculate
the proposed shape representation, methods from spectral graph theory
and statistics are applied to the spherical wavelet coefficients to achieve high
compression rates with minimal distortion.
Namely, after being filtered, the remaining spherical wavelet components are
grouped into bands by spectral graph partitioning and subjected to a
dimensionality reduction algorithm for further compression, as will be
elaborated in this section.
3.4.1 Filtering
As mentioned above, wavelets can be used effectively for (lossy) signal
compression. Nain et al. [11] formulated a signal power measure using
\[ p(n) = \frac{1}{K} \sum_{i=1}^{K} \left( v_i^x(n)^2 + v_i^y(n)^2 + v_i^z(n)^2 \right) \tag{14} \]
where v_i^x(n) is the x-component of the variation of vertex n of shape i from
the mean of the shape population, and evaluated it as
\[ c(k) = p^T \Phi_m(:,k)\, \Gamma_p(k) \tag{15} \]
for each wavelet component k, which corresponds to the information about
the shape added by that coefficient.
Using this information, it is possible to filter out (i.e. set to zero) those
coefficients that do not carry a significant amount of information about the
shape (on average, over all shapes in the set), thus greatly reducing the data
load while maintaining most of the information (Nain et al. report a compression
of around 50% with an average distortion of only about 0.2 mm).
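The filtering step can be sketched as follows. This is a simplified stand-in and not the procedure of Nain et al.: the mean power is computed directly per wavelet coefficient over a synthetic population, and the rule of keeping the smallest set of coefficients that holds 99% of the total power is an assumption made for this example, as are all variable names.

```matlab
% Rows: K training shapes; columns: wavelet coefficients of the x, y and z
% deviation signals. Synthetic data whose magnitudes decay with the index.
K = 10; M = 50;
scale = 1 ./ (1:M).^2;
Gx = randn(K, M) .* repmat(scale, K, 1);
Gy = randn(K, M) .* repmat(scale, K, 1);
Gz = randn(K, M) .* repmat(scale, K, 1);
p = mean(Gx.^2 + Gy.^2 + Gz.^2, 1);    % mean power of each coefficient
[ps, order] = sort(p, 'descend');
cum = cumsum(ps) / sum(ps);            % cumulative share of the total power
nkeep = find(cum >= 0.99, 1);          % smallest set holding 99% of the power
keep = false(1, M);
keep(order(1:nkeep)) = true;
% zero the discarded coefficients for all shapes and all three signals
Gx(:, ~keep) = 0; Gy(:, ~keep) = 0; Gz(:, ~keep) = 0;
```

The same mask is applied to every shape, so that all shapes in the population remain described by the same reduced set of coefficients.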
3.4.2 Band-wise grouping
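After filtering the wavelet coefficients, correlation analysis of the remaining
coefficients can be used to group together those that vary together. For
this, a covariance matrix over the coefficients in the training set, normalized
to correlation coefficients, is built using
\[ r_{n,m} = \frac{\sum_{j=1}^{K} \left(u_j(n) - \bar{U}_n\right)\left(u_j(m) - \bar{U}_m\right)}{(K-1)\,\sigma_{U_n}\,\sigma_{U_m}} \tag{16} \]
where u_i(n) = \left(\Gamma_{v_i}^{*x}(n)\right)^2 + \left(\Gamma_{v_i}^{*y}(n)\right)^2 + \left(\Gamma_{v_i}^{*z}(n)\right)^2
and U_n = [u_1(n) ... u_K(n)], with mean \bar{U}_n and standard deviation
\sigma_{U_n}.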
After zeroing out those entries that are not statistically significant (i.e. that
have a high p-value), the resulting matrix image is rearranged using normalized
cuts [18], a spectral clustering technique originating from graph theory. This
results in groups of covarying wavelet coefficients representing regions of the
object that share some variation.
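The grouping step can be sketched as follows. The first function evaluates equation (16) for all pairs of coefficients; the second performs a single two-way normalized cut using the relaxation of Shi and Malik [18], which splits the nodes by the sign of the second-smallest generalized eigenvector. This is an illustration under simplifying assumptions: the significance test on the p-values is omitted, the absolute correlation is used as the affinity, and the function names are hypothetical. Obtaining more than two bands would require applying the cut recursively.

```python
import numpy as np

def correlation_matrix(U):
    """Equation (16). U is (K, N): row j holds u_j(n) for all coefficients n."""
    centered = U - U.mean(axis=0)
    sigma = U.std(axis=0, ddof=1)
    return (centered.T @ centered) / ((U.shape[0] - 1) * np.outer(sigma, sigma))

def normalized_cut_bipartition(W):
    """Two-way normalized cut of a graph with symmetric affinity matrix W.

    W must be non-negative with strictly positive row sums.
    Returns a boolean label per node.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian D^{-1/2} (D - W) D^{-1/2}
    laplacian = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)
    # Map the second-smallest eigenvector back to the generalized problem
    indicator = d_inv_sqrt * eigvecs[:, 1]
    return indicator > 0
```

A typical call would be `normalized_cut_bipartition(np.abs(correlation_matrix(U)))`, where entries that fail the significance test would be set to zero beforehand.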
Figure 10 shows an example of such a grouping process, starting from the
initial covariance matrix of the first 12 wavelet coefficients on the left to the
rearranged coefficients grouped in bands on the right.
3.4.3 Dimensionality reduction
After grouping together the filtered wavelet coefficients, the dimensionality
of those bands can be further reduced by applying standard PCA on the
coefficients of each band, retaining either as many principal components as
there were training shapes or as there were coefficients, whichever number is
smaller.
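The per-band PCA can be sketched with a singular value decomposition. This is a minimal illustration, assuming the filtered coefficients are stored row-wise per shape; the function name `band_pca` and the return layout are hypothetical. The reduced SVD automatically yields min(K, band size) components, which matches the retention rule described above.

```python
import numpy as np

def band_pca(coeffs, bands):
    """PCA on each band of wavelet coefficients.

    coeffs : (K, N) array, filtered coefficients of the K training shapes
    bands  : list of index arrays, one per band
    Returns a list of (mean, components, std) tuples, one per band.
    """
    models = []
    for idx in bands:
        X = coeffs[:, idx]
        mean = X.mean(axis=0)
        # Rows of Vt are the principal axes of the centred band data
        _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        # Standard deviation of the training data along each axis
        std = s / np.sqrt(max(X.shape[0] - 1, 1))
        models.append((mean, Vt, std))
    return models
```

The band coefficients of a shape are then obtained by projecting its centred coefficients onto the components of the respective band.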
Thus, from a population of shapes, a shape prior can be built that contains
global information about the training set, namely the mean shape and the
co-varying regions ("bands") of the shape, and that encodes a specific shape
as coefficients of the principal components of the individual bands of covarying deviations from the mean (band coefficients).
Figure 10: grouping by normalized cuts
Thus, a significant amount of redundant information has been removed from
the training set and we are given an efficient tool to generate new shapes by
varying the band coefficients. As in standard ASMs, the variation of these
band coefficients can be limited to a range learned from the training set
(usually +/- 3 standard deviations) so as not to construct invalid shapes.
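Limiting the band coefficients and generating a new set of wavelet coefficients for one band can be sketched as below. The function names are hypothetical, and the sketch assumes that the principal axes of a band are stored as rows of a matrix together with the standard deviation observed along each axis in the training set.

```python
import numpy as np

def clamp_band_coefficients(b, std, limit=3.0):
    """Restrict band coefficients to +/- limit standard deviations."""
    return np.clip(b, -limit * std, limit * std)

def synthesize_band(mean, components, b):
    """Wavelet coefficients of one band generated from band coefficients b.

    mean       : (n,) mean coefficients of the band
    components : (m, n) principal axes of the band, one per row
    b          : (m,) band coefficients
    """
    return mean + b @ components
```

For example, with standard deviations (1, 2, 1) the coefficients (10, -10, 0.5) are clamped to (3, -6, 0.5) before synthesis.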
Additionally, band-wise grouping significantly increases the variability of
the shape model, as the number of modes is multiplied by a factor equal to
the number of bands.
3.5 Implementation
3.5.1 Matlab Wavelet Toolbox
The Matlab Wavelet Toolbox provides by far the most complete implementation of first- and second-generation wavelet methods, along with lifting
schemes. The toolbox is written with extensibility in mind and allows for
easy addition of new wavelet methods.
However, no adaptation to dimensions higher than 2 is foreseen in the implementation.
3.5.2 YAWTB: Yet Another Wavelet Toolbox
This toolbox contains Matlab methods for computing 1- to 3-dimensional
and spherical wavelet transforms in continuous and discrete forms.
It comes with implementations of some popular wavelets, such as the Mexican
hat, Morlet or difference of Gaussians. It does not, however, contain the
classical Daubechies wavelets.
Also, the spherical wavelet feature is somewhat limited, as it is based on the phi-
and theta-coordinates of the sphere, which results in the over-/undersampling
problems mentioned before.
3.5.3 Gabriel Peyre's toolboxes
Peyre provides a collection of methods for algorithms on graphs, spherical
mapping and the spherical wavelet transform.
In fact, the wavelet part is a fairly straightforward implementation of Sweldens'
early papers on spherical wavelets [17]. The toolbox offers few options: only
the butterfly wavelet is implemented, and there is little control over the
mapping or the transform.
On the other hand, Peyre uses the same notation as the authors in [?],
which makes implementing one's own extensions quite intuitive.
3.5.4 Spherical wavelet ITK filter
ITK is short for Insight Segmentation and Registration Toolkit, a collection of open-source software written in C++ and aimed at researchers in the
medical sciences on the one hand and at applications on the other.
Spherical wavelets are implemented in that framework in a still somewhat
experimental way. Nonetheless, they are fully usable, although not easily extensible because of the lack of documentation and comments in the code.
The implementation uses an object of the type SphericalMeshSource to model
the support mesh of the function to analyse. The object contains a method
to define a scalar function on the spherical mesh that can be transformed
into wavelet domain by a spherical wavelet transform function also defined
in the same object.
All in all, the code is easy to use and fast, but quite hard to extend.
3.6 Putting it all together
To build a shape prior as described by Nain et al. [12], one needs to combine some methods from the aforementioned software toolboxes.
Gathering the shape information. Starting with a voxel image, a mesh
is built by simply transforming each voxel face into two triangular patches.
Using the ITK toolbox, a smoothed version of this mesh is computed and subsequently parametrized on the sphere and remeshed.
For each shape in the population, the resulting file is read into Matlab using
a custom mex-function for further processing.
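The initial meshing step can be illustrated by the following sketch, which splits each voxel face into two triangles. It assumes, as an interpretation of the description above, that only faces on the boundary of the object (those not shared with another object voxel) are triangulated. The function name and the vertex layout are hypothetical, and smoothing, spherical parametrization and remeshing are not covered.

```python
import numpy as np

# Corner offsets of the six voxel faces, ordered counter-clockwise when
# seen from outside, keyed by the direction of the neighbouring voxel.
FACES = {
    (1, 0, 0): [(1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)],
    (-1, 0, 0): [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0)],
    (0, 1, 0): [(0, 1, 0), (0, 1, 1), (1, 1, 1), (1, 1, 0)],
    (0, -1, 0): [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)],
    (0, 0, 1): [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)],
    (0, 0, -1): [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)],
}

def voxels_to_triangles(volume):
    """Return (vertices, faces) with two triangles per exposed voxel face."""
    padded = np.pad(np.asarray(volume, dtype=bool), 1)  # background border
    index, vertices, faces = {}, [], []

    def vertex_id(corner):
        # Share vertices between neighbouring faces
        if corner not in index:
            index[corner] = len(vertices)
            vertices.append(corner)
        return index[corner]

    for voxel in np.argwhere(padded):
        x, y, z = (int(v) for v in voxel)
        for (dx, dy, dz), corners in FACES.items():
            if padded[x + dx, y + dy, z + dz]:
                continue  # face shared with another object voxel
            a, b, c, d = (vertex_id((x + cx - 1, y + cy - 1, z + cz - 1))
                          for cx, cy, cz in corners)
            faces += [(a, b, c), (a, c, d)]
    return np.array(vertices), np.array(faces)
```

A single voxel yields 8 vertices and 12 triangles; two adjacent voxels yield 12 vertices and 20 triangles, because the shared face is skipped.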
Building the wavelet transform matrix. As mentioned, a multi-resolution
representation is needed to be able to perform wavelet analysis on the meshes.
Such a representation is computed from the (common) face-list of the shapes
and stored.
Then, a mean shape is computed and stored separately, and only the deviations
from that mean are retained for each shape.
Following the formulation in Nain et al. [12], and to profit from
Matlab's optimized handling of matrices, the major steps of the algorithm were
implemented in matrix form.
To compute the wavelet transform matrix that is defined on the mean shape,
a series of inverse wavelet transforms is performed.
For each vertex, a vector whose length equals the number of vertices is built
and all elements except the one at the position of the current vertex are set
to zero. This vector is then transformed using the inverse wavelet transform
provided by Gabriel Peyre's toolbox, and the resulting vector, representing
one basis function of the wavelet transform on the mean shape, is stored as
a column of the wavelet transform matrix.
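The construction of the matrix from unit vectors can be sketched as follows. Since the spherical inverse transform of the toolbox is not reproduced here, a one-dimensional orthonormal Haar transform serves as a stand-in purely to make the example runnable; both function names are hypothetical. The construction works for any linear inverse transform.

```python
import numpy as np

def inverse_haar(coeffs):
    """Stand-in inverse transform: 1D orthonormal Haar, length a power of two.

    Layout of coeffs: [approximation, details level 0, details level 1, ...].
    """
    signal = np.asarray(coeffs[:1], dtype=float)
    pos = 1
    while pos < len(coeffs):
        detail = np.asarray(coeffs[pos:2 * pos], dtype=float)
        up = np.empty(2 * pos)
        up[0::2] = (signal + detail) / np.sqrt(2.0)
        up[1::2] = (signal - detail) / np.sqrt(2.0)
        signal = up
        pos *= 2
    return signal

def synthesis_matrix(inverse_transform, n):
    """Build the transform matrix column by column from unit vectors."""
    Phi = np.empty((n, n))
    for k in range(n):
        e_k = np.zeros(n)
        e_k[k] = 1.0                       # only coefficient k is non-zero
        Phi[:, k] = inverse_transform(e_k)  # k-th basis function
    return Phi
```

Once the matrix is available, the inverse transform of any coefficient vector c is the product `Phi @ c`, which is what makes the matrix formulation convenient.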
Performing power analysis. Using the power measure introduced in section
3.4.1, one can measure the amount of information contained in one basis function
over the whole set of shapes. This can easily be implemented using Matlab's
matrix notation and yields a filter matrix as described in Nain et al. [12]
that can achieve considerable compression.
Building the shape prior. Once the number of basis functions is reduced,
the covarying bands can be computed from the shape population. For this,
native Matlab functions are used to first compute the covariance matrix and
the respective p-values. The resulting matrix is automatically segmented
into the bands described above by a normalized cuts technique. The grouped
information is then further processed by eigenanalysis, using
Matlab's built-in function for that purpose, to yield the desired shape prior.
4 Applications & Experiments
4.1 Applications
A multiresolution representation of an object is very useful in many applications. These range from faster computation of lighting effects and distributed
transmission of signals to locally descriptive features for object recognition.
4.1.1 Signal description
Spherical wavelets have received considerable attention in the field of astrophysics because the spherical representation arises naturally from many
physical signals measured in the field [10]. One can thus study astrophysical
phenomena at different levels of resolution and at different locations with one
mathematical tool.
4.1.2 3D modelling
As we have seen, any genus-0 object can be represented by a spherical mapping and thus by a wavelet representation. This can be very useful for 3D
modelling if speed is an issue, because the resolution of an object can be
dynamically adapted to the available processing power.
The authors of [17], for example, use a spherical wavelet representation to
compute complicated reflections or lighting on spherical surfaces, with convincing results.
Recently, a study concerned with cortical folding in neonates [21] exploited the multi-resolution properties of the spherical wavelet representation
to model the development of the brains of the subjects.
In the same line of research, [22] evaluated a spherical wavelet method for
measuring changes in cortical thickness, which is considered an early indicator for diseases such as Alzheimer's disease.
4.1.3 Segmentation
There exists a variety of methods for finding structures in 2D images and segmenting them accordingly. Especially for medical applications, it would be
of great benefit to have an effective tool to segment not only 2D images but
also 3D structures, as the raw data from most modern imaging techniques,
such as MRI or CT, are 3D voxels.
Spherical wavelets have been used to address this issue. Instead of segmenting
the data slice by slice using, for example, conventional wavelets, a wavelet representation of the object is learned directly from a set of three-dimensional training
data and is used in an active shape framework to locate the desired
objects in unseen test data. In that context, the multi-resolution nature of
the wavelet transform can be exploited in a variety of ways.
Generally, a first step is to reduce the number of basis functions by eliminating (i.e. zeroing out) those that contain only a small amount of information
about the signal (the object). The amount of data reduction often reaches
around 70%, which can considerably speed up subsequent processing. Figure 11 gives an
example of this. Please note that only the deviation from the mean shape
of a population of 30 shapes is compressed using spherical wavelets. Thus,
even a small number of coefficients still results in a shape similar to the original
one. In light of this, the compression properties of the spherical wavelet
transform can best be appreciated by observing the shapes on the left of
Figure 11.
Figure 11: Reconstructed shapes (hippocampi) with increasing compression
As for the proper modelling of shapes by spherical wavelet coefficients,
there exist a number of advantages that can be exploited for segmentation.
On the one hand, ASMs are inherently limited in their variability by the
number of training shapes with which they are constructed. Thus, models built from
a small number of training shapes can perform poorly if the object to be
segmented differs from everything in the training set.
Another problem with ASMs arises from the fact that they rely on eigenvalues of the model parameters to restrain the model search from deviating too
much from the original shape. These eigenvalues describe the major modes
of variation of the model parameters, thus possibly suppressing finer, local
variations that can be of great importance in medical applications.
To deal with these problems of classical modelling, multi-resolution approaches
based on spherical wavelets have been proposed. How these help to virtually extend the training set, thus providing greater variability to the model
while eliminating the possibility of large-scale variations hiding smaller, high-frequency variations, will be the topic of the next section.
4.2 Experiments
In this section, we conduct a series of experiments that demonstrate the
compressive and descriptive power of the proposed shape representation.
For this, we use a set of 3D meshes derived from pre-segmented voxel images of human hippocampi.
4.2.1 Compression
Using the wavelet component power measure introduced in section 3.4.1, the
analyzed shape can be compressed by zeroing out certain basis functions.
Figure 12 shows an example of different levels of compression for a given
shape, and Figure 13 gives the reconstruction error for different levels of compression.
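A reconstruction-error curve of this kind could be computed along the following lines. This is an illustrative sketch, not the code behind Figure 13: the function name, the definition of the compression ratio as the fraction of discarded components and the use of the mean per-vertex Euclidean distance as the error measure are assumptions.

```python
import numpy as np

def reconstruction_errors(deviation, Phi, power, ratios):
    """Mean per-vertex error after discarding the weakest components.

    deviation : (N, 3) deviation of one shape from the mean shape
    Phi       : (N, N) matrix whose columns are the basis functions
    power     : (N,) power contribution of each component
    ratios    : fractions of components to discard, each in [0, 1]
    """
    gamma = np.linalg.solve(Phi, deviation)   # wavelet coefficients, (N, 3)
    order = np.argsort(power)[::-1]           # strongest components first
    errors = []
    for ratio in ratios:
        n_keep = int(round((1.0 - ratio) * len(order)))
        kept = np.zeros_like(gamma)
        kept[order[:n_keep]] = gamma[order[:n_keep]]
        recon = Phi @ kept
        errors.append(float(np.mean(np.linalg.norm(deviation - recon, axis=1))))
    return errors
```

A ratio of 0 reproduces the shape exactly, while a ratio of 1 returns the mean shape, so the error equals the mean norm of the deviation.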
Figure 12: Shape compression at different ratios using component power
measure
Figure 13: Reconstruction error using compression
4.2.2 Band-wise grouping
As the wavelet coefficients can be grouped into co-varying bands using the
spectral technique described in section 3.4.2, it is possible to build a variation
prior on each band at each level in the decomposition. Figures 14 and 15
give an example of varying the coefficients at level 1 for bands 1 and 2.
Figure 14: Effect of varying band 1 at level 1 from -3 std. to +3 std.
Figure 15: Effect of varying band 2 at level 1 from -3 std. to +3 std.
4.2.3 Shape description
To evaluate the performance of the shape priors proposed by Nain et al., these
were compared to the standard point distribution models (PDM) built on
the training set by simple PCA on the one hand and by KPCA on the other.
For each of these shape representations, versions were learnt from 5, 10, 15
and 20 training shapes from a population of 30 and tested on the remaining
unseen data, and the deviation was measured. Figures 16 and 17 show examples
of such reconstructions, and results of randomized trials are given in Figure
18.
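The evaluation protocol for the PCA baseline can be sketched as follows. This is an assumption-laden illustration rather than the code used for the experiments: shapes are taken to be flattened into rows, the deviation is measured as a root-mean-square error, and the function names, the number of trials and the seed are hypothetical.

```python
import numpy as np

def pca_reconstruction_error(train, test):
    """RMS error of unseen shapes projected onto the PCA space of train."""
    mean = train.mean(axis=0)
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    recon = mean + (test - mean) @ Vt.T @ Vt
    return float(np.sqrt(np.mean((test - recon) ** 2)))

def randomized_trials(shapes, train_sizes=(5, 10, 15, 20), n_trials=10, seed=0):
    """Average error on unseen shapes for several training set sizes.

    shapes : (M, D) array, one flattened shape per row
    """
    rng = np.random.default_rng(seed)
    results = {}
    for n in train_sizes:
        errors = []
        for _ in range(n_trials):
            perm = rng.permutation(len(shapes))
            train, test = shapes[perm[:n]], shapes[perm[n:]]
            errors.append(pca_reconstruction_error(train, test))
        results[n] = float(np.mean(errors))
    return results
```

The same loop could be reused for the spherical wavelet prior by replacing `pca_reconstruction_error` with a function that builds and applies that prior.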
Figure 16: Reconstruction of unseen shapes using spherical wavelet priors
trained on 5,10,15 and 20 shapes respectively
Figure 17: Reconstruction of unseen shapes using PCA priors trained on
5,10,15 and 20 shapes respectively
Figure 18: Performance comparison of spherical wavelet and classical PDM
reconstruction
5 Conclusion
While spherical wavelets provide a promising tool for the analysis of shapes
and signals defined on them, they suffer from some inherent limitations.
The most fundamental one is the imposed topology constraint, as spherical
wavelets can only be computed for genus-0 objects. For signals that
need to be mapped onto the sphere, as is the case for nearly all objects, such
a mapping can introduce a significant amount of distortion.
An interesting line of research to counter these problems is that of manifold methods, most notably those based on diffusion operators, which are capable of representing shapes of arbitrary topology and for which multi-resolution analysis has
recently been proposed.
As these methods rely on the same mathematics as the Laplacian parametrization method, they could provide a more concise framework for the analysis
of shapes than spherical wavelets and will be subject to further investigation.
References
[1] I. Daubechies. Ten Lectures on Wavelets. SIAM, 1992.
[2] I. Daubechies and W. Sweldens. Factoring wavelet transforms into lifting
steps. J. Fourier Anal. Appl., 4(3):245–267, 1998.
[3] Michael S. Floater and Kai Hormann. Surface parameterization: a tutorial and survey. In N. A. Dodgson, M. S. Floater, and M. A. Sabin,
editors, Advances in multiresolution for geometric modelling, pages 157–
186. Springer Verlag, 2005.
[4] Ilja Friedel, Peter Schröder, and Mathieu Desbrun. Unconstrained spherical parameterization. In SIGGRAPH ’05: ACM SIGGRAPH 2005
Sketches, page 134, New York, NY, USA, 2005. ACM.
[5] Michael Garland. Multiresolution modeling: Survey & future opportunities. State of the Art Reports, 1999.
[6] Craig Gotsman, Xianfeng Gu, and Alla Sheffer. Fundamentals of spherical parameterization for 3d meshes. ACM Trans. Graph., 22(3):358–363,
2003.
[7] Xianfeng Gu, Steven J. Gortler, and Hugues Hoppe. Geometry images.
ACM Trans. Graph., 21(3):355–361, 2002.
[8] F. J. Harris. On the use of windows for harmonic analysis with the
discrete Fourier transform. Proceedings of the IEEE, 66(1):51–83, 1978.
[9] S.G. Mallat. A theory for multiresolution signal decomposition: The
wavelet representation. PAMI, 11(7):674–693, July 1989.
[10] E. Martinez-Gonzalez, J. E. Gallegos, F. Argueso, L. Cayon, and J. L.
Sanz. The performance of spherical wavelets to detect non-Gaussianity
in the CMB sky. Mon. Not. Roy. Astron. Soc., 336:22, 2002.
[11] Delphine Nain, Steven Haker, Aaron F. Bobick, and Allen Tannenbaum.
Multiscale 3d shape analysis using spherical wavelets. In MICCAI (2),
pages 459–467, 2005.
[12] Delphine Nain, Steven Haker, Aaron F. Bobick, and Allen Tannenbaum.
Multiscale 3d shape analysis using spherical wavelets. In MICCAI (2),
pages 459–467, 2005.
[13] Andrew Nealen, Takeo Igarashi, Olga Sorkine, and Marc Alexa. Laplacian mesh optimization. In GRAPHITE ’06: Proceedings of the 4th international conference on Computer graphics and interactive techniques
in Australasia and Southeast Asia, pages 381–389, New York, NY, USA,
2006. ACM.
[14] F. Orderud and S. I. Rabben. Real-time 3d segmentation of the left
ventricle using deformable subdivision surfaces. In Proceedings of IEEE
Conference on Computer Vision and Pattern Recognition, 2008.
[15] P. Schröder and W. Sweldens. Spherical wavelets: Texture processing.
In P. Hanrahan and W. Purgathofer, editors, Rendering Techniques ’95.
Springer Verlag, Wien, New York, August 1995.
[16] Peter Schröder and Wim Sweldens. Spherical wavelets: efficiently representing functions on the sphere. In SIGGRAPH ’95: Proceedings of
the 22nd annual conference on Computer graphics and interactive techniques, pages 161–172, New York, NY, USA, 1995. ACM.
[17] Peter Schröder and Wim Sweldens. Spherical wavelets: efficiently representing functions on the sphere. Computer Graphics, 29(Annual Conference Series):161–172, 1995.
[18] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence
(PAMI), 2000.
[19] W. Sweldens. The lifting scheme: A construction of second generation
wavelets. SIAM J. Math. Anal., 29(2):511–546, 1997.
[20] David S. Taubman and Michael W. Marcellin. JPEG 2000: Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, Norwell, MA, USA, 2001.
[21] P. Yu, B.T.T. Yeo, P.E. Grant, B. Fischl, and P. Golland. Cortical
folding development study based on over-complete spherical wavelets.
In MMBIA07, pages 1–8, 2007.
[22] Peng Yu, Xiao Han, Florent Segonne, Rudolph Pienaar, Randy L. Buckner, Polina Golland, P. Ellen Grant, and Bruce Fischl. Cortical surface
shape analysis based on spherical wavelet transformation. In CVPRW
’06: Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop, page 60, Washington, DC, USA, 2006. IEEE
Computer Society.
[23] D. Zorin and P. Schröder. SIGGRAPH 2000 course on subdivision for modeling and animation, 2000.