|
[In this Intel-sponsored feature, part of Gamasutra's Visual Computing section, Muthyalampalli examines how you can use the fast Fourier
transform (FFT) for image processing.]
The discrete
Fourier transform (DFT) is a specific
form of
Fourier analysis to convert one function (often
in the time
or spatial domains) into another (frequency
domain). DFT
is widely employed in signal processing and
related fields
to analyze frequencies contained in a sample
signal,
solve partial differential equations, and perform
other
operations, such as convolutions.
The fast Fourier
transform (FFT)
is an efficient implementation of DFT and
is used,
apart from other fields, in digital image processing.
FFT is
applied to convert an image from the image (spatial)
domain to the
frequency domain. Applying filters to images
in the
frequency domain is computationally faster than to
do
the same in the image domain.
This article
will not go into the theory of FFT but will
address the
implementation of the algorithm in converting
a 2D image
to the frequency domain and back to the image
domain
(inverse FFT). Once the image is transformed into
the
frequency domain, filters can be applied to the image
by
convolutions. FFT turns the complicated convolution
operations
into simple multiplications. An inverse transform
is then
applied in the frequency domain to get the result of
the
convolution.
The sample
application was developed in DirectX 10
and
demonstrates the forward and inverse transforms of
the image to
the frequency domain and back. Applying the
filters is
straightforward once the transform takes place
and hence is
not discussed here.
The
application uses ping-pong textures, a common
technique
used in many GPGPU (general-purpose
computing on
graphics processing units) applications.
Ping-pong
textures involve a pair of texture surfaces that
a shader
uses both as input and output data. The shader
program uses
one texture as input to do some computation
and writes
the output to the second texture. Subsequent
iterations
will swap the input and output textures (thus the
input from a
previous iteration becomes the output in the
current
iteration and so on).
Fast
Fourier Transform
The Fourier
transform decomposes an image into its
real and
imaginary components, which is a representation
of the image
in the frequency domain. If the input signal
is an image,
the number of frequencies in the frequency
domain is
equal to the number of pixels in the image or
spatial
domain. The inverse transform re-transforms the
frequencies
to the image in the spatial domain. The FFT
and its
inverse of a 2D image are given by the following
equations.
where f(m,n)
is the pixel at coordinates (m,n), F(x,y) is the
value of the
image in the frequency domain corresponding
to the
coordinates x and y, and M and N are the dimensions
of the
image.
As the
equations show, a naive implementation of this
algorithm is
very expensive. But the beauty of FFT is that
it is
separable; namely, the 2D transform can be split into
two 1D
transforms-one in the horizontal direction and the
other in the
vertical direction. The equation below shows
the 1D
transform in the horizontal direction.2 The
end
result is
equivalent to performing the 2D transform in the
frequency
space.
The FFT that
is implemented in the application here
requires
that the dimensions of the image be a power
of two.
Another interesting property of FFT is that the
transform of
N points can be rewritten as the sum of two
N/2
transforms (divide and conquer).3 This
is important
because some
of the computations can be reused, which
eliminates
expensive operations.
The output
of the Fourier transform is a complex
number and
has a much greater range than the image in
the spatial
domain. Therefore to accurately represent these
values, they
are stored as fl oats. Furthermore, the dynamic
range of the
Fourier coefficients are too large to be displayed
on the
screen, and hence these values are scaled [usually
by dividing
by the (Width × Height) of the image] to bring
them within
the range of values that can be displayed.4
The next
section describes the implementation details
of the FFT
algorithm and its inverse in a GPGPU application.
---
2 Sumanaweera,
Thilaka, and Donald Liu, "Medical Image Reconstruction with the FFT." Chapter
48 in GPU Gems 2: Programming Techniques
for High-Performance Graphics and
General-Purpose Computation,
Matt Pharr, ed., Addison Wesley, 2005.
3 Cooley,
J. W., and Tukey, O. W., "An Algorithm for the Machine Calculation of Complex
Fourier Series." Math. Comput. 19 (1965): 297-301.
4 Mitchell,
Jason L., Marwan Y. Ansari, and Evan Hart, "Advanced Image Processing with
DirectX® 9 Pixel Shaders." Section 4 in ShaderX2:
Shader Programming Tips and Tricks
with DirectX 9, Wolfgang F. Engel, ed., Plano, TX: Wordware Publishing, 2003.
|