The book Matrix Computations by Gene Golub and Charles
Van Loan sits on the bookshelves of many MathWorks employees.
Professor Golub, one of the founding members of the Stanford
University computer science department, is a revered figure in the
area of numerical analysis and matrix computations.
We were very sorry at MathWorks to hear that Professor Golub passed
away last month at the age of 75. His contributions will live
on in modern scientific and engineering computing.
Earlier this year I exchanged e-mail with blog reader Craig
Doolittle. Craig was writing MATLAB scripts to clean up scanned
pages from old manuscripts. One of the samples he sent me was a
page from "Fragmentation of Service Projectiles,"
N.F. Mott, J. H. Wilkinson, and T.H. Wise, Ministry of Supply,
Armament Research Department, Theoretical Research Report No.
37/44, December 1944.
The image is too big to show at full resolution in this blog, so
here's a thumbnail view.
Craig wanted suggestions on how to clean up isolated
"noise" dots without removing small dots that are part of
characters. Let's look closely at a cropped portion of the
page.
bw = page(735:1280, 11:511);
imshow(bw)
We could start by using bwareaopen to remove small
dots. For For example:
Unfortunately, this approach has removed portions of some of the
characters. Here's a method using bwlabel and
regionprops to highlight the pixels that were removed.
removed = xor(bw2, bw3);
L = bwlabel(removed);
s = regionprops(L, 'Centroid');
centroids = cat(1, s.Centroid);
imshow(bw)
hold on
plot(centroids(:,1), centroids(:,2), 'ro')
hold off
You can see that some of the removed dots were noise, while
others were parts of the characters "e", "i",
"m", etc.
My suggestion to Craig was to restore removed dots that are
"close" to the characters remaining after
bwareaopen. We can do this using dilation and some logical
operators.
I also suggested using morphological recontruction to get all
the pixels connected to the overlapping pixels found above. It
doesn't seem to be really necessary here, though, so I'm
going to save this technique for a future blog post, using a better
example.
I'm sure there are lots of different ways to approach this
text clean-up problem. Does anyone have suggestions for other
approaches?
I get a lot e-mail asking for help with MATLAB GUI programming.
Since GUI programming is not my primary focus and expertise, I
almost always reply with a form letter containing links to useful
GUI programming resources. You can now see these links, if
you're interested, by clicking "GUI programming
resources" in the sidebar (see right).
The notion of neighbor connectivity is discussed in most image
processing textbooks. Specifically, what is the set of neighbors of
a pixel? For example, a commonly-used neighborhood connectivity is
4-connected, where each pixel has four neighbors. The neighborhood
looks like this:
1
1 1 1
1
Another common connectivity is 8-connected, where each pixel has
eight neighbors. The neighborhood looks like this:
1 1 1
1 1 1
1 1 1
Many Image Processing Toolbox functions depend on a definition
of connectivity. Most of these functions allow you to specify the
desired connectivity in a very general, flexible way.
Consider bwperim. This function computes perimeter
pixels by finding the foreground pixels that are connected to
background pixels. Some definition of connectivity is therefore
required. Let's look at an example.
In addition to specifying connectivity as "4" or
"8", you can also use a 3-by-3 matrix of 0s and 1s to
specify other kinds of connectivity. The 3-by-3 connectivity matrix
below says that a pixel has only two neighbors: The pixel above and
the pixel below.
Note that the (4,2) and (4,6) pixels are not considered to be
perimeter pixels with this connectivity. That's because their
upper and lower neighbors are both part of the foreground, not the
background.
Sometimes it's useful to define 6-connectivity. There are
two different flavors, both of which can be expressed using the
3-by-3 matrix form:
For processing three-dimensional arrays, you specify
6-connectivity (face-adjacent pixels), 18-connectivity (face- or
edge-adjacent pixels), or 26-connectivity (face-, edge-, or
vertex-adjacent pixels). But you can specify your own flavor of
connectivity by using a 3-by-3-by-3 matrix. Here's an
example.
There's only one three-dimensional connected component. But
that's using the default connectivity. Let's define a
three-dimensional connectivity such that each pixel has neighbors
only in the same plane.
In image processing textbooks, you often see low-level image
processing operations grouped into two categories:
Point processes
Neighborhood processes
In a point process, the value of each output pixel is a
function of only the corresponding input pixel. Examples include
adding a constant to all image pixels and gamma correction.
In a neighborhood process, the value of each output
pixel is a function of the corresponding input pixel and a fixed
set of neighborhood pixels around it. For example,
filtering with a 3-by-3 filter is a neighborhood process. Each
output pixel depends on the values of the corresponding input
pixel, as well as that pixel's eight neighbors. Dilation is
another example of a neighborhood process.
Here are two more types of operations that I've found to be
common:
Transforms
Queue processes
Lots of operations have the word transform in them,
such as Fourier transform, Hough transform, and geometric or
spatial transform. But it's a little hard to define what makes
something a transform. I think transforms tend to have these
characteristics in common:
They are invertible, or at least approximately invertible under
certain conditions.
The transform pairs involve different coordinate systems. For
example, the Fourier transform involves a spatial and a frequency
coordinate system, and the Radon transform involves a spatial and a
rho-theta coordinate system. Even geometric transforms involve two
different spatial coordinate systems.
I'm hoping to find a better term for queue
processes, but that's the best I've come up with so
far. A queue process works by "spreading out" in some
fashion from a set of starting pixels. I use the word
"queue" here because implementations often involve a
queue data structure, like this:
% Initialize the queue
for each pixel p in image:
if p satisfies some condition
push p onto queue
end if
end for
while queue not empty
get next pixel k from queue
record something about k in the output image
for each neighbor pixel j of pixel k
if j satisfies some condition
push j onto queue
end if
end for
end while
"Flood filling," which many of you can probably
visualize, can be implemented this way. There are several
algorithms in the Image Processing Toolbox that use queues in this
fashion, including morphological reconstruction
(imreconstruct) and the watershed transform
(watershed).
I have three questions for you:
Do you have better insight into why we call some operations
"transforms"?
Can you think of a better term than "queue
process"?
Can you think of other useful classes of image processing
operations?
When I started this blog in early 2006, one of my first series
of articles was called "All About Pixel Colors." These
articles discussed how MATLAB associates matrix values with
specific screen pixel colors. The series talked about the two
fundamental image display models (indexed and truecolor); the
scaled indexed image variation; using indexed scaling to program
brightness/contrast controls; and the grayscale and binary image
conventions in the Image Processing Toolbox.
This material might be useful for people who have started
reading this blog since then, so I've used our new category
archive pages to gather the material together. In the sidebar on
the right, under Categories, click on "Pixel
colors."
Or, if you prefer, you can look at my
MATLAB Digest article that was based on the blog postings.
I am slowly making more use of categories in my blog. If you
have category suggestions, please let me know.
I'd like to welcome back guest blogger Stan Reeves, professor of Electrical and Computer Engineering at Auburn University.
Stan will be writing a few blogs here about image deblurring.
In my last blog, I looked at image deblurring using an inverse filter and some variations. The inverse filter does a terrible job due to
the fact that it divides in the frequency domain by numbers that are very small, which amplifies any observation noise in
the image. In this blog, I'll look at a better approach, based on the Wiener filter.
I will continue to assume a shift-invariant blur model with independent additive noise, which is given by
y(m,n) = h(m,n)*x(m,n) + u(m,n)
where * is 2-D convolution, h(m,n) is the point-spread function (PSF), f(m,n) is the original image, and u(m,n) is noise.
The Wiener filter can be understood better in the frequency domain. Suppose we want to design a frequency-domain filter G(k,l)
so that the restored image is given by
We can choose G(k,l) so that we minimize
E[] is the expected value of the expression. We assume that both the noise and the signal are random processes and are independent
of one another. The minimizer of this expression is given by
This filter gives the minimum mean-square error estimate of X(k,l). Now that sounds really great until we realize that we
must supply the signal and noise power spectra (or equivalently the autocorrelation functions of the signal and noise).
The noise power spectrum is fairly easy. We usually assume white noise, which makes the noise power spectrum a constant.
But how do we determine what this constant is? If
then the noise power spectrum is given by
for an MxN image. This noise variance may be known based on knowledge of the image acquisition process or may be estimated
from the local variance of a smooth region of the image.
The signal power spectrum is a little more challenging in principle, since it is not flat. However, we have two factors working
in our favor: 1) most images have fairly similar power spectra, and 2) the Wiener filter is insensitive to small variations
in the signal power spectrum.
Consider two very different images -- the cameraman and house.
Let's see what happens if we restore the cameraman with the actual power spectrum.
h = fspecial('disk',4);
cam_blur = imfilter(cam,h,'circular');
% 40 dB PSNR
sigma_u = 10^(-40/20)*abs(1-0);
cam_blur_noise = cam_blur + sigma_u*randn(size(cam_blur));
cam_wnr = deconvwnr(cam_blur_noise,h,numel(cam)*sigma_u^2./cf);
subplot(1,1,1)
imshow(cam_wnr)
colormap(gray(256))
title('restored image with exact spectrum')
For comparison purposes, we restore the cameraman using the power spectrum obtained from the house image.
cam_wnr2 = deconvwnr(cam_blur_noise,h,numel(cam)*sigma_u^2./hf);
imshow(cam_wnr2)
colormap(gray(256))
title('restored image with house spectrum')
Visually, the two are very similar in quality. In terms of mean-square error (MSE), the former is better (lower), as the
theory predicts:
format shorte
mse1 = mean((cam(:)-cam_wnr(:)).^2)
mse2 = mean((cam(:)-cam_wnr2(:)).^2)
mse1 =
2.9905e-003
mse2 =
3.9191e-003
In both cases, the spectrum is concentrated in the low frequencies and falls away at higher frequencies. A smoothed version
of the spectra would look even more similar. Rather than using the power spectrum from a specific image, one can either average
a large number of images or use a simple model of the power spectrum or autocorrelation function.
A common model for the image autocorrelation function is
We calculate these parameters for the cameraman, averaging over horizontal and vertical shifts to get a single parameter for
the correlation coefficient. Then we calculate an autocorrelation function.
The restored image from this method is better than the restored image using the house spectrum but not quite as good as the
one using the exact cameraman spectrum. All in all, we can see that the exact noise-to-signal spectrum isn't necessary to
yield good results.
- by Stan Reeves, Department of Electrical and Computer Engineering, Auburn University
In September I
wrote about customer-reported speed issues related to TIFF
files containing a very large number (thousands or tens of
thousands) of images. I've started to call these things MMTs
(massively multipage TIFFs).
If you have experienced this kind of performance problem and you
have an MMT that you'd be willing to share with us, please let
me know. We can synthesize files here, of course, but we prefer to
get our hands on "real-world" files whenever
possible.
Just send me an e-mail, and then we'll work out a method for
transferring the file.
The Image Processing Toolbox function bwselect is
perfect for this task. You can use it interactively or
noninteractively.
If you use the interactive syntax:
bw2 = bwselect(bw)
the function displays bw and waits for you to click on
the image with the mouse. You can click on one or multiple points.
To finish your selection, double-click on the last point, or just
press RETURN. bwselect then returns a binary image
containing only the objects ("blobs") that you clicked
on.
You can also use the noninteractive syntax, in which you pass in
the point locations as input arguments instead of using the
mouse.
Back in March I started writing about an algorithm
implementation experiment for computing upslope area.
Given an "image" whose pixel values are terrain
elevations, the upslope area of a pixel is the area of the uphill
terrain that drains through that pixel. I chose a
paper that looked promising, and I wrote
about my progress as I implemented the paper's techniques in
MATLAB. Those implementations are available in an "upslope
toolbox" on the MATLAB Central File Exchange.
Some of you have probably been scratching your heads, wondering
why I've been writing so much about water flow and terrain
analysis in an image processing blog. I want to conclude with some
thoughts on that very point.
First, this series illustrates quite a few different image
processing algorithm implementation techniques. If you implement
image processing algorithms in MATLAB, I think it will be well
worth your time to take a look. Here are a few of the useful
nuggets you'll find:
Using morphological functions to find pixels with certain
relationships to their neighbors. For example, using
imerode and relational operators to find all pixels that
have no downhill neighbors.
Using a binary image as a logical mask to index into another
image. I use this technique all the time, and you should, too.
Using linear indexing to process sets of pixels.
(Essential!)
Using linear index offsets to find neighbors of a set of pixels
(
"neighbor indexing").
Identifying connected groups of pixels that touch the image
border.
Using a variety of Image Processing Toolbox functions, such as
imregionalmin, roifill, bwselect, etc.,
to achieve different effects.
Using sparse linear systems to solve for pixel values that are
linearly related to neighboring pixel values.
Visualizing algorithm output:
By superimposing a quiver plot on an image.
By superimposing one image transparently on another.
I also believe that terrain analysis techniques like this will
continue to find application to other kinds of image processing and
image analysis problems, just like the watershed transform did.
Much of mathematical morphology applied to image processing is
about ordering relationships between pixels and their neighbors,
and the upslope area problem fits right into that category. In
particular, I would guess that the influence and dependence maps
will be useful for a variety of problems.
And hey, it was fun! And if I'm going to keeping writing new
content here every week, it's just gotta be fun!